Unless otherwise noted, all filesystem paths are relative to the "Working path" named above.

No work logs this week. All week I've been splitting time between research and packing for the move to Phoenix on Monday. I've made some progress on the training framework.

Miscellaneous

Due to periodic crashing of Matlab (user error!), I added a script tmp_setup_workspace.m, which is intended to restore all the needed variables for whatever task I'm currently working on.

Optimized likelihood for training

Spent some time thinking about how to design a version of the marginal likelihood computation that is optimized for training.

I also spent some time deriving the analytical ML gradient w.r.t. the training parameters. I believe the result won't be too complicated, but deriving it was taking up too much time. We'll use a numerical gradient for now, and will return to the analytical gradient if the numerical one proves to be lacking.
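For reference, here is a minimal sketch of the kind of numerical gradient meant here, using central differences. It is written in Python rather than Matlab to keep it self-contained. The names `numerical_gradient` and `toy_objective` and the step size `h` are hypothetical stand-ins, not the actual training code.

```python
def numerical_gradient(f, theta, h=1e-5):
    """Central-difference estimate of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += h      # perturb only parameter i upward
        down[i] -= h    # and downward
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


# Hypothetical stand-in for the (negative log) marginal likelihood.
def toy_objective(theta):
    return theta[0] ** 2 + 3.0 * theta[0] * theta[1]


g = numerical_gradient(toy_objective, [1.0, 2.0])
```

This costs two objective evaluations per parameter, so each gradient needs twice as many likelihood evaluations as there are training parameters. That is the main reason an analytical gradient may still be worth deriving later.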

To save some computation during training, I precomputed the component matrices for the prior and likelihood. Constructing the prior and likelihood covariance matrices will involve scaling and summing the component parts. These cached matrices consume about 740 MB with the current training set.
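The caching idea can be sketched as follows, assuming each covariance matrix is a weighted sum of fixed component matrices whose weights are the training parameters. The function name, the two toy components, and the weights are hypothetical; the real components live in the cached matrices described above.

```python
def combine_components(components, weights):
    """Return sum_k weights[k] * components[k] for equally sized square matrices."""
    n = len(components[0])
    out = [[0.0] * n for _ in range(n)]
    for w, comp in zip(weights, components):
        for i in range(n):
            for j in range(n):
                out[i][j] += w * comp[i][j]
    return out


# Hypothetical cached parts, computed once before training starts.
cached = [
    [[1.0, 0.5], [0.5, 1.0]],  # illustrative correlated component
    [[1.0, 0.0], [0.0, 1.0]],  # illustrative independent-noise component
]

# Each new parameter setting only rescales and sums the cached parts.
K = combine_components(cached, [2.0, 0.1])
```

The point of the design is that the expensive part (building each component) happens once, while every likelihood evaluation during training pays only for the scale-and-sum.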

However, one unavoidable bottleneck continues to be the Cholesky decomposition, which isn't improved by this precomputation. I was hoping there would be some matrix inversion tricks involving linear combinations of matrices, but my research on this came up empty. My last hope is to run Cholesky on the GPU (Matlab makes this trivial), but if that doesn't speed things up, I'll resign myself to waiting 10 hours for training.
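To illustrate why the factorization can't be cached, here is a sketch of a Gaussian log-density evaluated through a Cholesky factor. It assumes the marginal likelihood is a zero-mean Gaussian density in the observations, which is an assumption made for this example. The functions `cholesky` and `gaussian_logpdf` are hypothetical and are not the code in train/.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def gaussian_logpdf(y, cov):
    """Log density of y under N(0, cov), using one Cholesky factorization."""
    n = len(y)
    L = cholesky(cov)
    # Forward substitution: solve L z = y, so that z^T z = y^T cov^{-1} y.
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    quad = sum(v * v for v in z)
    # log det(cov) = 2 * sum of the logs of the diagonal of L.
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (quad + logdet + n * math.log(2.0 * math.pi))


value = gaussian_logpdf([1.0, -1.0], [[2.0, 0.0], [0.0, 2.0]])
```

Because the covariance changes with every parameter setting, the factor `L` has to be recomputed on each evaluation, and that step is cubic in the number of observations.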

Nevertheless, the component caching still gives a 1.5x end-to-end speedup on the no_purturb_kernel case, compared to the direct implementation.

...

After optimizing, the bottlenecks are split evenly three ways: (1) constructing the prior kernel; (2) constructing the ML covariance matrix; and (3) evaluating the ML pdf. The latter two are dominated by \(O(n^3)\) matrix multiplication. Surprisingly, Cholesky is significant, but not dominating.

Summary:

Training-optimized marginal likelihood is finished; it is in train/tr_curves_ml.m. Results were confirmed against the reference implementation, train/tr_curves_ml_ref.m.

TODO

Finish training framework:

set up local minimizer with train/tr_curves_ml.m

determine if numerical gradient is okay

Sanity check: compare the MLs of each trained model on the training data.