Something is wrong with the scripts that generate the database -- some values are missing from the goldStandardDateCRF key field. It's because some records exist solely to hold immunity data, which doesn't have an official gold standard date column. Can probably ignore for now.
For now, use pooled scores for each of 5 databses:
sri3.csv sritot
das4.csv dastot
ea4.csv easumtot
fact4.csv facttot
pss4.csv psstot
The datasets were originally named here.
Read data, isolated columns of interest, and concatenated each subjects' visits into long rows, then write to csv file for processing:
fire_data_path = '../../data/'
[db, db_meta] = fire_build_self_report_database(fire_data_path);
[cluster_db, cluster_db_meta] = fire_gather_columns('../../experiments/2014_04_10_cluster_3/columns.txt', db_meta, db);
wide_db = e_2014_03_merge_visits(cluster_db)
fire_write_csv(wide_db(:,2:end), [], 'test.csv')
EXPERIMENT PATH: experiments/2014_04_10_cluster_3
README file in experiment path describe how to prepare data and run experiment.
Result: 3-cluster model, with apparently very good separation. reulsts are in $EXPERIMENT_PATH/out/cluster_results.txt
Posted by Kyle Simek