Affinities

Yesterday and today I've been busy writing code to implement a model and method describing an affinity between two sets of seed pixels.  Here's the basic idea:

  • Two seeds out there, s1 and s2, have somehow produced lists of pixel locations:  s1 = {(x1,y1),(x2,y2), . . . , (xN,yN)} and s2 = {(u1,v1),(u2,v2), . . . , (uM,vM)}.  We don't use color or shape, just pixel locations: s1 and s2 are silhouettes.
  • Our model states that these two data sets are explained by an affinity A.  An affinity is a restricted kind of transformation on pixels.  We postulate that A (which is unknown) will make s1 look like s2.  In other words,  A(s1) = s2 + noise.
  • Taking a generative, Bayesian approach, we want a statistical model p(A,s1,s2) = (k)L(s1,s2; A) p(A).  L is a function called the likelihood -- in this case, the likelihood of the seed data, presupposing we know A (which we don't).  p(A) represents the prior plausibility of some A.  Finally, k is a constant of proportionality that we ignore.  We take this approach because if we try a sufficient bunch of candidate affinities A1, A2, A3, . . ., plugging each one into p(A,s1,s2), we often can find an affinity A' that yields a relatively high value for p(A',s1,s2).  That best answer A' can be said to "explain" the data; it's as if we've developed an "understanding" of the data.
  • The above story has a few blank sections still:  the likelihood L, the prior p(A), and the proposal process for A1, A2, etc.
  • For the likelihood, let's start from a nice big constant number and deduct 1 every time there's a pixel in s2 that is absent in A(s1), or vice versa.  So if all their pixels align, there will be a high score (no deductions).
  • For the prior p(A), I've decomposed A into seven more-or-less independent factors (discussed earlier) -- A might magnify or shrink in 2D, but not by a lot.  "Not by a lot" sounds vague, but we can be more quantitative using Gaussians -- say, along an axis, by a factor that on average is 1 and 68% of the time is no more than, say, 40% above or below unity.  Where did I get that?  I made it up, but I'm trying to draw from empirical observations of seeds, and I think I can be a bit sloppy in my guessing.  Likewise I can roughly model my prior knowledge of seed variation by casting it in terms of shear, translation, and rotation.  Rotation is easiest to model, because it's a "known unknown":  the seed could be rotated in absolutely any direction.
  • One annoying detail is that if A magnifies, and if the pixels of s1 are at least 1 unit apart, then some pixels of A(s1) will be more than one unit apart!  This could make A(s1) look like stripes or splinters!  So I wrote code to augment s1 with pseudopixels between the actual pixels, which will fill in the gaps and prevent splintering.
  • The last major piece is the proposal process.  We will use an approach in the large family of Markov-chain Monte Carlo methods, i.e., its proposals are a mixture of memory and randomness.  A Markov chain retains what has been tried recently, and how well it worked, but then proposes a stochastic increment from that state -- it can be described as an "exploration" of a space.  In this case, a seven-dimensional space.
  • As is always the case with real code implementations, there are other chores:  developing a "language" for representing seeds in a text file, writing code to interpret and produce that language, writing test code to make sure the individual components actually do what I intended, and writing code to display what is happening in text or graphics.
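The pixel-deduction likelihood above can be sketched in a few lines.  This is a minimal illustration, not the actual implementation: `base_score` and the toy silhouettes are made up here, and the mismatch count is just the symmetric difference of the two pixel sets.

```python
def log_likelihood(transformed_s1, s2, base_score=1000):
    """Score two silhouettes: start from a big constant and deduct 1
    for every pixel present in one set but absent in the other."""
    a, b = set(transformed_s1), set(s2)
    return base_score - len(a ^ b)  # ^ is symmetric difference

# Toy silhouettes: two of three pixels align, two mismatch.
s1 = {(0, 0), (0, 1), (1, 0)}
s2 = {(0, 0), (0, 1), (1, 1)}
print(log_likelihood(s1, s2))  # 1000 - 2 = 998
```

If every pixel aligns, no deductions occur and the score stays at `base_score`.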
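The anti-splintering pseudopixel trick might look like the following.  This is an assumption-laden sketch: the subdivision `factor` and the sub-unit grid scheme are my illustration of "fill in the gaps," not the author's exact code.

```python
def densify(pixels, factor=2):
    """Augment a silhouette with pseudopixels on a sub-unit grid, so a
    magnifying affinity doesn't pull neighboring pixels more than one
    unit apart and splinter the shape.  `factor` subdivisions per unit
    suffice for magnification up to roughly `factor`x (an assumption)."""
    step = 1.0 / factor
    out = []
    for (x, y) in pixels:
        for i in range(factor):
            for j in range(factor):
                out.append((x + i * step, y + j * step))
    return out

print(len(densify([(0, 0), (1, 0)], factor=2)))  # 2 pixels -> 8 pseudopixels
```

After applying the affinity, the transformed pseudopixels would be rounded back onto the integer grid before scoring.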
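The prior over affinities could be sampled roughly as below.  Only the scale factors (Gaussian around 1 with a 40% standard deviation, matching the "68% within 40% of unity" figure) and the uniform rotation come from the text; the shear and translation spreads are hypothetical numbers I inserted for illustration, and the seven factors are collapsed to six parameters here.

```python
import math
import random

def sample_affinity_prior():
    """Draw one candidate affinity from a sketched prior:
    - scale_x, scale_y ~ Normal(1, 0.4): on average unity, and 68% of
      the time within 40% above or below it
    - theta ~ Uniform(0, 2*pi): the "known unknown" -- any rotation
    - shear, tx, ty: Gaussians with made-up spreads (assumptions)"""
    return {
        "scale_x": random.gauss(1.0, 0.4),
        "scale_y": random.gauss(1.0, 0.4),
        "theta":   random.uniform(0.0, 2 * math.pi),
        "shear":   random.gauss(0.0, 0.2),   # hypothetical spread
        "tx":      random.gauss(0.0, 5.0),   # hypothetical spread
        "ty":      random.gauss(0.0, 5.0),   # hypothetical spread
    }

print(sample_affinity_prior())
```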
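The memory-plus-randomness proposal process can be illustrated with a random-walk Metropolis step, one common member of the MCMC family (the text doesn't say which variant is used, so this is a stand-in).  The quadratic toy target and the step size are invented for the demo; the real chain would walk the seven-dimensional affinity space and score candidates with the likelihood and prior above.

```python
import math
import random

def metropolis_step(state, log_score, step_size=0.05):
    """One Metropolis step: perturb the remembered state (memory),
    then accept or reject the perturbation (randomness) with
    probability min(1, exp(delta)) based on the score change."""
    proposal = {k: v + random.gauss(0.0, step_size) for k, v in state.items()}
    delta = log_score(proposal) - log_score(state)
    if delta >= 0 or random.random() < math.exp(delta):
        return proposal
    return state

# Toy 1-D target: log-score peaked at scale_x = 1.
random.seed(0)  # for reproducibility of the demo
target = lambda s: -((s["scale_x"] - 1.0) ** 2) / 0.01
state = {"scale_x": 0.5}
for _ in range(2000):
    state = metropolis_step(state, target)
print(state)  # should have wandered near scale_x = 1
```

The chain "explores" the space: it drifts uphill toward high-scoring affinities but keeps enough randomness to escape mediocre local answers.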