[Work Log] FIRE - cluster w/ missing data

May 13, 2014
Project: FIRE
Subproject: Piecewise Linear Clustering (tests)
Working path: projects/fire/trunk/src/piecewise_linear/test
SVN Revision: unknown (see text)
Unless otherwise noted, all filesystem paths are relative to the "Working path" named above.

Testing missing data in cluster model

Run #1 - enable missing data

Segfault caused by an empty cluster. Writing a routine to create a cluster from the worst point.
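A minimal sketch of that kind of routine, in Python rather than the project's own code. It assumes "worst point" means the point with the largest fit residual under its current cluster. The names `reseed_empty_clusters`, `labels`, and `residuals` are hypothetical and do not come from the FIRE source.

```python
def reseed_empty_clusters(labels, residuals, n_clusters):
    """Give each empty cluster one member: the currently worst-fit point.

    labels[i]    -- cluster index assigned to point i
    residuals[i] -- fit error of point i under its cluster (assumed metric)
    Returns a new label list; the inputs are not modified.
    """
    labels = list(labels)
    residuals = list(residuals)
    counts = [0] * n_clusters
    for k in labels:
        counts[k] += 1

    for k in range(n_clusters):
        if counts[k] > 0:
            continue
        # Only move points whose removal would not empty their own cluster.
        candidates = [i for i, c in enumerate(labels) if counts[c] > 1]
        if not candidates:
            break  # fewer points than clusters; nothing left to move
        worst = max(candidates, key=lambda i: residuals[i])
        counts[labels[worst]] -= 1
        labels[worst] = k
        counts[k] = 1
        residuals[worst] = 0.0  # the point now defines its own cluster
    return labels
```

Running this check after every assignment step means no cluster reaches the fitting step with zero members, which is the condition that triggered the segfault.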

...

Still getting weird results. Clusters are collapsing constantly.

Even a single missing value screws up the results. There must be a bug in my initial-estimate script.

...

BUG: the true/false test that decides whether to use missing-data-enabled line fitting was swapped.
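For reference, a hedged Python sketch of the kind of dispatch involved. It assumes missing values are encoded as NaN and that "missing-data-enabled" fitting simply drops the missing samples; the actual fitting code may do something different. All function names here are hypothetical.

```python
import math

def fit_line(t, y):
    """Ordinary least-squares fit y = a*t + b; assumes no missing values."""
    n = len(t)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("all t values are identical; slope is undefined")
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    a = sxy / sxx
    return a, mean_y - a * mean_t

def fit_line_missing(t, y):
    """Same fit, but samples whose y is NaN (missing) are dropped first."""
    kept = [(ti, yi) for ti, yi in zip(t, y) if not math.isnan(yi)]
    return fit_line([ti for ti, _ in kept], [yi for _, yi in kept])

def fit(t, y):
    """Use the missing-data path only when something is actually missing."""
    has_missing = any(math.isnan(yi) for yi in y)
    # Swapping these two branches would feed NaNs into the plain fit.
    return fit_line_missing(t, y) if has_missing else fit_line(t, y)
```

Naming the flag for what is true (`has_missing`) rather than for which routine to call makes a swapped branch easier to spot.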

...

Several bugs related to computing epsilon. Fixed after several hours :-/

...

It seems we can continue to increase the missing percentage indefinitely, without the clustering suffering (or at least until an entire observation becomes missing, which isn't handled).
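One way to keep pushing the missing percentage without hitting the unhandled case is to mask entries at random but never blank out a whole observation. The Python sketch below is an illustration only: it assumes each row is one observation, that the input is fully observed, and that missing values are encoded as NaN. `mask_entries` is a hypothetical name.

```python
import math
import random

def mask_entries(data, fraction, seed=0):
    """Return a copy of `data` with roughly `fraction` of entries set to NaN.

    If masking would leave a row with no observed values, one randomly
    chosen entry of that row is restored from the original data.
    """
    rng = random.Random(seed)
    masked = []
    for row in data:
        new_row = [math.nan if rng.random() < fraction else v for v in row]
        if row and all(math.isnan(v) for v in new_row):
            keep = rng.randrange(len(row))
            new_row[keep] = row[keep]  # keep one observed value per row
        masked.append(new_row)
    return masked
```

Even with `fraction=1.0` every row keeps exactly one observed value, so the clustering never sees an observation that is entirely missing.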

Likely the small amount of noise is helping us a lot here. We'll see how it works on real data.

Real FIRE data

High-level Tasks

  1. merge radiation data from Laura (into demograph dataset?)
  2. for each subject, find first chemo, last chemo, first rad, last rad
  3. write results in FIRE data format

Reading and merging radiation data:

Do same for chemo dates.

Merge chemo and rad.
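A rough Python sketch of the per-subject first/last computation and the merge. It is not the contents of in_progress/process_treatment_dates.m. It assumes each input is a list of (subject id, date) pairs; the function names and the output field names are hypothetical.

```python
from datetime import date

def first_last(records):
    """Map subject id -> (earliest date, latest date) from (subject, date) pairs."""
    spans = {}
    for subject, day in records:
        lo, hi = spans.get(subject, (day, day))
        spans[subject] = (min(lo, day), max(hi, day))
    return spans

def merge_treatments(chemo, rad):
    """Outer-join chemo and radiation spans by subject.

    A subject with only one kind of treatment gets None for the other.
    """
    chemo_span = first_last(chemo)
    rad_span = first_last(rad)
    merged = {}
    for subject in sorted(set(chemo_span) | set(rad_span)):
        c = chemo_span.get(subject, (None, None))
        r = rad_span.get(subject, (None, None))
        merged[subject] = {
            "first_chemo": c[0], "last_chemo": c[1],
            "first_rad": r[0], "last_rad": r[1],
        }
    return merged
```

An outer join keeps subjects who received only chemo or only radiation, so they are not dropped from the merged table.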

compute "type"

...

All implemented in in_progress/process_treatment_dates.m.

Data consistency issues

Posted by Kyle Simek