[Work Log] Implementing Nystrom method; bugs in posterior sampling code

October 28, 2013
Project Tulips
Subproject Data Association v3
Working path projects/​tulips/​trunk/​src/​matlab/​data_association_3
Unless otherwise noted, all filesystem paths are relative to the "Working path" named above.

Re-ran with method 2. Still getting negative eigenvalues.

Possibly non-symmetry issue.

Change gears -- work on fast implementation, then resume debugging.

Reduced rank approximation of data-covariance

Goal: approximate \(\left(K + \Sigma_D \right)^{-1}\), using a low-rank approximation of K; where \(\Sigma_D\) is the likelihood covariance.

In our case \(\Sigma_D\) has infinite covariance, so this inverse doesn't exist. However, if we work with precision matrix \(\Lambda = \Sigma_D^{-1}\), and use the decomposition \( S' S = \Lambda \) we can replace this expression with the equivalent \[ S' \left( S K S' + I \right)^{-1} S \] Even though S is rank-deficient, the inverse in this expression does exist, thanks to the finite positive values being added to the diagonal.

We approximate the above expression as follows. Let \(K\) be an \(N\) by \(N\) matrix. We can take the eigendecomposition \(K = V D V'\) and approximate it with \(\tilde K = \tilde V \tilde D \tilde V'\), where \(\tilde V\) and \(\tilde D\) consist of the first \(n\) eigenvalues and eigenvectors of \(K\) respectively. We can use this low-rank approximation with the [Woodbury matrix identity](http://en.wikipedia.org/wiki/Woodbury_matrix_identity) to approximate the above inverse in \(O(n^3)\) time, rather than \(O(N^3)\). The Woodbury identity is \[ (A + U C V )^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1} \] Setting \(A = I\), \(U = V' = S \tilde V \) and \(C = \tilde D \), we get: \[ \left( S K S' + I \right)^{-1} = I - S \tilde V (\tilde D^{-1} + \tilde V' S' S \tilde V)^{-1} \tilde V' S' \]

It remains to find \(\tilde V\) and \(\tilde D\) efficiently. Naive eigenvalue decomposition takes \(O(N^3)\) time, which isn't any better than direct inversion. Sections 4.3.2 and 8.1 of Williams and Rasmussen show how to approximate \(n\) eigenfunctions and eigenvalues from \(n\) data points in \(O(n^3)\) time. Eigenvectors \(\tilde V\) arise by evaluating the approximate eigenfunctions at the appropriate indices.

Substitutuing back into the original expression by surrounding with (\(S'\) and \(S\)), we get the final expression: \[ \begin{align} \approx & S' \left ( I - S \tilde V (\tilde D^{-1} + \tilde V' S' S \tilde V)^{-1} \tilde V' S' \right) S \\ =&\Lambda - \Lambda \tilde V (\tilde D^{-1} + \tilde V' \Lambda \tilde V)^{-1} \tilde V' \Lambda \end{align} \]

Work log

Implemented two different implementations of "nystrom solve" (tools/nystrom_solve.m). Crude testing shows big speedup, but stalling on a big eigenvalue decomposition. Replacing with a call to "chol", waiting on visual results.

...

"chol" gives much faster results. (no need to symmetrize) But results are junk -- blank screen.


Oops, bug in call to chol; fixed.

still getting black. Posterior covariance is not positive definite; eigenvalues are negative.

When I force covariance to zero (to view the mean value), it's still in the ballpark but wonky (same as before).


Clue: When re-using (thinned) input covariance matrix and indices, result looks sensible. Clearly, we have a problem with how K_star is computed (and likelihood K_star_star, too).

Comparing different K_star's...

Looks like on-diagonal elements differ, off-diagonals are okay.

...

Got it! Wacv was trained using no-perturb model, but is being tested with OU-perturb.

TODO: edit get_all_Wacv and/or get_wacv_result to receive a model-type... done.


reducing test to block (1,1)

possible sources of disparity: indices, function

checked indices -- same.

check params -- model-type doesnt match.


First view now looks good in matlab.

Running all views in likelihood_server.

...

Means look great!. Now on to random perturbed.

...

Cholesky claims posterior covariance isn't positive definite.

Possibly a bug in construct_attachment_covariance_3?

Re-running with K_star_star from original covariance matrix. Still gives "not positive definite" errors.

Resorting to eigenvalue decomposition method.

Results look good! perturbation from mode is TINY, at least in the ground truthed WACV dataset. If this is true for hyptoheses in the full system, we can probably get away with simply taking the mean and saving significant computation of the covariacne matrix.

Tuning Nystrom factor

nystrom factor: 50

nystrom factor = 50

nystrom factor: 20

nystrom factor = 20

nystrom factor: 10

nystrom factor = 10

Nystrom factor apparently has a huge affect on reconstruction quality. Possibly better choice of thinned index set (compared to uniform sampling) will improve sensitivity, but a brief attempt at this (grouping xyz indices ) had net negative effect.

Needs more thought, possibly should resort to subset of data (SD) or subset of regressors (SR) method, or exploit markov structure.

Summary

Nystrom thinning: 10 ll2 spacing: 3 Unclear

TODO (new)

Posted by Kyle Simek
blog comments powered by Disqus