July 08, 2013

**Goal:** Cleanup and speedup of curve_ml.

Making a simpler version that doesn't exploit the matrix inversion lemma (MIL).

The benefits of the MIL are likely mitigated by markov-decomposition (next task), and MIL complicates the code and API design.

Results (small test):

```
Computing marginal likelihood (old way)...
Done. (108.2 ms)
Old ML: 62.183969
Computing marginal likelihood (new way, legacy correspondence)...
Done. (120.8 ms)
New ML (Legacy corr): 63.908784
Computing marginal likelihood (new way)...
Done. (88.9 ms)
New ML (New corr): 72.304718 <---------- OLD RESULT
Computing marginal likelihood (new way, no MIL)...
Done. (97.3 ms)
New ML (New corr, no MIL): 72.304666 <---------- NEW RESULT
```

The difference is about 0.00007% -- seems good. Slightly slower, possibly just noise.

Results (full test):

```
Computing marginal likelihood (old way)...
Done. (6447.3 ms)
Old ML: 207.324848
Computing marginal likelihood (new way, legacy correspondence)...
Done. (7665.1 ms)
New ML (Legacy corr): 209.199676
Computing marginal likelihood (new way)...
Done. (6205.6 ms)
New ML (New corr): 457529.406731 <---------- ?????
Computing marginal likelihood (new way, no MIL)...
Done. (5946.7 ms)
New ML (New corr, no MIL): 778723.327102 <---------- ?????
```

Both versions give CRAZY high results; significantly different from the same test on Friday. Need to investigate...

Observations

- All results differ from friday, including old ML (slightly)
- New ML using legacy correspondence is still reasonable.
- Both crazy results are using new correspondence.

TODO:

- Look at pieces (log posterior, likelihood, prior)
- Inspect new correspondence
- Think about what changed since Friday?

12:10:19 PM

Ran old test, `test_ml_end_to_end.m`

and compared to old results in new test, `test_ml_end_to_end_2.m`

. Old test still gives old results, so I'll compare the two tests to determine what has changed.

12:12:48 PM

The `mix`

parameter was set to 0.5 instead of 0.0. I had forgotten I changed it at the end if Friday.

Now getting "good" results again. It raises a question that needs to be answered: why is ML sooo sensitive to evalaution position when using new correspondence?

New results (full test):

```
Computing marginal likelihood (old way)...
Done. (6009.5 ms)
Old ML: 207.041742
Computing marginal likelihood (new way, legacy correspondence)...
Done. (7286.3 ms)
New ML (Legacy corr): 207.700616
Computing marginal likelihood (new way)...
Done. (6146.6 ms)
New ML (New corr): 793.585038 <---------- OLD RESULT
Computing marginal likelihood (new way, no MIL)...
Done. (6008.3 ms)
New ML (New corr, no MIL): 793.600835 <---------- NEW RESULT
```

0.002% error, slightly faster. Good.

So why is new ML so sensitive to where it is evaluated?

Note that it's only extreme when using new correspondence. Old correspondence is okay.

Inspect difference between max likelihood and max posterior. Maybe it's just more extreme than with old correspondence.

12:20:59 PM

Idea: bug in non-0.0 case? Nope.

Tried mixes: 0.0, 0.01, 0.1. Steady and dramatic increase in ML for new correspondence. Old correspondence is nearly constant. What is different between these correspodnences (other than the correspondences themselves).

12:57:08 PM

**Answer**: The max likelihood in the new correspondences is waaaaay into the tails of the prior.

Since no real triangulation is done to ensure agreement between views, the maximum likelihood triangulation is very rough (infinite likelihood variance allows this to happen without penalty). This places the curve far into the tails of the prior, where evaluation is highly unstable. Both the prior and likelihood are then extremely low (-1e-8 in log space), which should cancel, but don't due to numerical instability. Hence, marginal likelihoods with huge magnitude.

This is great news, since it means everything is working mostly as expected. The numerical instability issue is easilly solved by always evaluating at the posterior, not the maximum likelihood.

The version of the new ML that doesn't use the matrix inversion lemma is accurate, so we can proceed to the markov-decomposed version next.

Posted by Kyle Simek