[SciPy-user] Maximum entropy distribution for Ising model - setup?

Sat Oct 28 13:18:45 EDT 2006

Martin,

Hope this makes sense. If not, why don't we take this discussion offline?

Best,

James

1.) "I want to find a combination of the individual spins and
spin products that maximizes the entropy"

No, you want to find a *probability distribution* of spins that 
maximizes the entropy. Remember, the entropy of a random variable is 
defined for a distribution of the variable (e.g. configuration of 
spins), not for a particular value of the variable.
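
To make the distinction concrete, here is a minimal sketch (my own toy example with N = 3 spins, not anything from your setup) that computes the entropy of two different distributions over the same set of spin configurations. The helper name entropy is just for illustration.

```python
import itertools
import numpy as np

N = 3
states = list(itertools.product([-1, 1], repeat=N))  # the 2^N spin configurations

def entropy(p):
    """Shannon entropy H = -sum_x P(x) log P(x), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

# Every configuration equally likely.
uniform = np.full(len(states), 1.0 / len(states))

# One configuration with probability 1, all others impossible.
certain = np.zeros(len(states))
certain[0] = 1.0

H_uniform = entropy(uniform)  # equals log(2^N) = N log 2
H_certain = entropy(certain)  # equals 0
```

The configurations are the same in both cases; only the probabilities assigned to them differ, and that is what the entropy measures.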

2.) "I thought the whole magic of running maximum entropy is not just that
you end up with a set of hi and Jij that give you a probability
distribution that matches the expectation values that you asked for (to
within some tolerance), but that you also choose those hi and Jij such
that the distribution's entropy is maximized"

True -- see the summary below.

3.) "I think many of the Jij will be quite small, but I want that to 
come out of the model. I
don't have any way to justify setting some of them to zero ahead of
time."

No problem, as long as your empirical values are measured accurately 
enough (and assuming the max.entr. framework makes sense for your 
application).

4.) Maxent summary:

Case 1. A single scalar empirical statistic

Given a variable x (e.g. the states of N spins) and an empirical statistic 
<f(x)>_{emp}, the max. ent. distribution whose statistic matches the 
empirical statistic is:

P(x) = e^{a*f(x)} / Z(a)

where a is chosen such that <f(x)> = <f(x)>_{emp}. Entropy is defined as 
H = -\sum_x P(x) log P(x).
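
As a rough sketch of how Case 1 could be solved numerically, assuming a small system where all configurations can be enumerated: the statistic f (mean magnetisation) and the value f_emp = 0.3 below are made up for illustration, and the function names are mine. It uses scipy.optimize.brentq to find a.

```python
import itertools
import numpy as np
from scipy.optimize import brentq

# All 2^N configurations of N = 3 spins, each spin being -1 or +1.
N = 3
states = np.array(list(itertools.product([-1, 1], repeat=N)))

# Hypothetical statistic: the mean magnetisation of each configuration.
f = states.mean(axis=1)

def maxent_dist(a):
    """Return P(x) = e^{a*f(x)} / Z(a) over all configurations."""
    w = np.exp(a * f)
    return w / w.sum()

def model_mean(a):
    """Return <f(x)> under the distribution with parameter a."""
    return maxent_dist(a) @ f

f_emp = 0.3  # made-up empirical value; must lie strictly inside (-1, 1)

# d<f>/da is the variance of f, so <f> increases with a and a
# bracketing root finder is enough to locate the matching a.
a_star = brentq(lambda a: model_mean(a) - f_emp, -20.0, 20.0)

P = maxent_dist(a_star)
H = -np.sum(P * np.log(P))
```

For this particular f the spins come out independent, so the answer can be checked by hand: <f> = tanh(a/N), i.e. a = N * atanh(f_emp).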

Proof (sketch): Use Lagrange multipliers to maximize the entropy subject 
to the constraint <f(x)> = <f(x)>_{emp} and to normalization, 
\sum_x P(x) = 1:

E = H + a(<f(x)> - <f(x)>_{emp}) + b(\sum_x P(x) - 1), where a and b are 
the Lagrange multipliers.

Setting dE/dP(x) = 0 gives -log P(x) - 1 + a*f(x) + b = 0, i.e. 
P(x) = const * e^{a*f(x)}. Normalization fixes const = 1/Z(a), with 
Z(a) = \sum_x e^{a*f(x)}, and a is chosen to satisfy 
<f(x)> = <f(x)>_{emp}.
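
If you want to convince yourself numerically, here is a small sanity check (a sketch on a made-up 6-state problem, not a proof): take the exponential-form distribution, move away from it along random directions that leave both the normalization and <f(x)> unchanged, and confirm that the entropy only goes down. The values of f and a are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 6 states with arbitrary statistic values f(x).
f = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
a = -0.4  # any fixed multiplier; it determines the constraint value <f(x)>

def entropy(p):
    return -np.sum(p * np.log(p))

# Exponential-form solution P(x) = e^{a*f(x)} / Z(a).
P = np.exp(a * f)
P /= P.sum()

# Directions d with sum(d) = 0 and sum(d * f) = 0 preserve both constraints.
# They span the null space of A, taken from the trailing rows of Vt.
A = np.vstack([np.ones_like(f), f])
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[2:]

max_other = -np.inf
for _ in range(1000):
    d = rng.normal(size=null_basis.shape[0]) @ null_basis
    d *= 0.5 * P.min() / np.abs(d).max()  # keep Q strictly positive
    Q = P + d
    max_other = max(max_other, entropy(Q))
```

Every perturbed Q satisfies the same constraints as P, and max_other stays below entropy(P).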

Case 2. More than one scalar empirical statistic

Now f(x) is a vector function of x, with one component for each 
empirically observed scalar value, and a becomes a vector of Lagrange 
multipliers, one per component: P(x) = e^{a . f(x)} / Z(a). In your spin 
example, f has many components: f(s) = (s_1, ..., s_N; s_12, ..., s_1N, 
s_23, ..., s_2N, ..., s_{N-1,N}), where s_ij = s_i*s_j, in which case the 
Lagrange multipliers are the h_i and J_ij's.
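
For a system small enough to enumerate, the fit could look roughly like the sketch below. Everything specific here is an assumption on my part: N = 3, the "empirical" statistics are generated from made-up parameters theta_true so that they are guaranteed to be consistent, and plain gradient ascent with a fixed step stands in for whatever optimizer you prefer. Only pairs with i < j are used, since s_i*s_i = 1 carries no information.

```python
import itertools
import numpy as np

N = 3
states = np.array(list(itertools.product([-1, 1], repeat=N)))  # shape (8, 3)
pairs = list(itertools.combinations(range(N), 2))              # i < j only

# Feature matrix: row = f(s) = (s_1, ..., s_N, s_1*s_2, s_1*s_3, s_2*s_3).
F = np.hstack([states] + [states[:, [i]] * states[:, [j]] for i, j in pairs])

def model_dist(theta):
    """Return P(s) = exp(theta . f(s)) / Z(theta) over all configurations."""
    logits = F @ theta
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()

# Made-up parameters (h_1..h_3, J_12, J_13, J_23) used to generate the targets.
theta_true = np.array([0.2, -0.1, 0.3, 0.5, 0.0, -0.4])
f_emp = model_dist(theta_true) @ F

# Gradient ascent on the log-likelihood; its gradient is <f>_emp - <f>_model.
theta = np.zeros(F.shape[1])
for _ in range(5000):
    theta += 0.1 * (f_emp - model_dist(theta) @ F)

h = theta[:N]
J = dict(zip(pairs, theta[N:]))
```

Note that J_13 was set to zero when generating the targets, and the fit recovers a value close to zero without being told so. For large N the exact enumeration of 2^N states is no longer feasible and the model averages would have to be estimated some other way, e.g. by sampling.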