Seed affinities

Good news and bad news.  First the good news.  The affinity sampler works pretty OK for different isolated seed silhouettes.  Here are a few examples with lima beans, barley, and wheat (respectively):


   

I've also made a few adjustments since the earlier version, so that (1) the magnifications in the X and Y directions are correlated, and (2) the likelihood and prior seem to be a little more in balance, I guess.

  1. Correlated magnification:  this change represents additional prior knowledge that hadn't occurred to me yet.  The affinity model permits magnifying the X and Y axes by different amounts, which I think is appropriate, but I had let those magnifications vary completely independently of each other.  That left the door open for the sampler to project one of the seeds into a thin filament.  However, when a seed is small in one direction, doesn't that suggest it is likely to be small in the perpendicular direction too?  I therefore added a correlation between the two magnifications (which is kind of a pain, because independent variables are so much easier to work with).
  2. Balancing prior and likelihood:  this is a subject that Kobus tells me often arises in statistical models, but is somewhat difficult to specify precisely.  In general, if and when the likelihood and prior get into a tug-of-war, we want them to be roughly matched in strength.  We don't want one to be able to overwhelm the other.  Unfortunately, achieving this balance seems to be something of an empirical art.  It would be nice if we had a principled way to find the best balance.
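
As a rough illustration of item 1, here is a minimal sketch of a correlated magnification prior. It is not the actual model from this post: it assumes the prior is a zero-mean bivariate normal on the log-magnifications, and the names (sample_log_magnifications, log_prior) and the values of rho and sigma are all made up for the example. It only shows how a positive correlation penalizes the "thin filament" case while leaving a uniformly small seed plausible.

```python
import math
import random

def sample_log_magnifications(rng, rho, sigma=0.3):
    """Draw (log_mx, log_my) from a zero-mean bivariate normal.

    rho is the correlation between the two log-magnifications and
    sigma their common standard deviation (both hypothetical values).
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    log_mx = sigma * z1
    # Mix z1 into the second coordinate so that corr(log_mx, log_my) = rho.
    log_my = sigma * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return log_mx, log_my

def log_prior(log_mx, log_my, rho, sigma=0.3):
    """Log density of that same bivariate normal."""
    one_minus_r2 = 1.0 - rho * rho
    quad = (log_mx ** 2 - 2.0 * rho * log_mx * log_my + log_my ** 2) / (
        sigma ** 2 * one_minus_r2
    )
    log_norm = math.log(2.0 * math.pi * sigma ** 2 * math.sqrt(one_minus_r2))
    return -0.5 * quad - log_norm

# A "thin filament": shrink X while stretching Y.
filament = (-1.0, 1.0)
# A uniformly small seed: shrink both axes together.
small = (-1.0, -1.0)

if __name__ == "__main__":
    for rho in (0.0, 0.8):
        print("rho =", rho,
              "filament:", round(log_prior(*filament, rho=rho), 2),
              "small:", round(log_prior(*small, rho=rho), 2))
```

With rho = 0 the two cases get the same prior score; with a positive rho the filament is penalized much more heavily than the uniformly small seed, which is the behavior the correlation is meant to buy.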

To show an example of what I'm talking about, below I show the lima bean example with two alternative models.  In the first one, the prior does not correlate the two magnifications, so the bean projection gets super-skinny.  Also, the likelihood sort of overpowers the prior, so the sampler is "terrified"* of expanding the proposal outside the bounds of the red bean.  It therefore gets kind of stuck, wastes a lot of time on weak proposals, and doesn't reach a plausible magnification until it's too late.  In the second example, the likelihood is underpowered, so the sampler "barely listens" to the data; the effect is aimless wandering.
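
To make the tug-of-war concrete, here is a small sketch of how the relative strength of the likelihood changes a Metropolis-style acceptance probability. This is an assumption-laden toy, not the sampler used here: it assumes a symmetric proposal, and it balances the two terms by raising the likelihood to a power (the hypothetical "weight" parameter), which is just one common way to do it. The log-prior and log-likelihood numbers are invented for illustration.

```python
import math

def log_target(log_prior, log_lik, weight):
    """Unnormalized log posterior with the likelihood raised to a power."""
    return log_prior + weight * log_lik

def accept_probability(current, proposal, weight):
    """Metropolis acceptance probability, assuming a symmetric proposal.

    current and proposal are (log_prior, log_likelihood) pairs.
    """
    delta = log_target(*proposal, weight) - log_target(*current, weight)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Invented numbers: the proposal has a more plausible magnification
# (log prior improves by 6) but spills outside the silhouette
# (log likelihood worsens by 30).
current = (-8.0, -100.0)
proposal = (-2.0, -130.0)

if __name__ == "__main__":
    for weight in (1.0, 0.25, 0.01):
        print("weight =", weight,
              "accept prob =", accept_probability(current, proposal, weight))
```

At full weight the move is essentially never accepted (the "terrified" sampler), at a very small weight it is always accepted regardless of the data (the sampler that "barely listens"), and somewhere in between the two terms actually trade off against each other.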
 

* I hope you don't mind my anthropomorphizing a bit.

So that's the good news.  The bad news is that the model is incomplete.  Kobus has convinced me that I need to add edge information to the likelihood.  The above is working nicely on isolated seeds, but touching seeds will need more clues about how to fit a template to the data -- a lesson that Kobus and Joe Schlect learned while working with furniture images.