entropy
John Hunter
jdhunter at ace.bsd.uchicago.edu
Mon Mar 15 12:44:51 EST 2004
I am trying to compute the entropy of a time series (e.g.,
http://en.wikipedia.org/wiki/Information_theory) using
S = - sum p_i log2(p_i)
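(As a quick check of the formula: a fair coin with p = (1/2, 1/2) gives
S = 1 bit.)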
According to the text I am using, the entropy of a Gaussian
distribution should be
1/2 log2(2 pi e sigma^2)
so I am using this result to test my algorithm. Unfortunately, I am
not getting the results to agree.
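(For sigma = 2 below, the analytic value works out to
0.5*log2(2*pi*e*4), about 3.05 bits.)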
Can anyone tell me where I am going wrong?
from Numeric import searchsorted, concatenate, arange, nonzero, log, \
     sum, multiply, sort, greater, take, pi, exp
from MLab import diff, randn
def hist(y, bins):
    # for each bin edge, count the samples strictly below it ...
    n = searchsorted(sort(y), bins)
    # ... then difference the cumulative counts to get per-bin counts
    # (the last bin is open-ended, up to len(y))
    n = diff(concatenate([n, [len(y)]]))
    return n
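For example, a quick sanity check of the helper with made-up numbers:

print hist([0.5, 1.5, 1.7, 2.2], [0.0, 1.0, 2.0])   # -> [1 2 1]
# i.e. counts in [0,1), [1,2), and [2, inf); samples below the first
# edge are silently dropped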
# generate some gaussian numbers
mu = 0.0
sigma = 2.0
x = mu + sigma*randn(100000)
delta = 0.001
bins = arange(-12.0, 12.0, delta)
n = hist(x, bins)
ind = nonzero(greater(n, 0.0))
n = take(n, ind)     # keep only the occupied bins
n = 1.0/len(n)*n     # norm for probability; is this the right normalization?
#n = 1.0/len(bins)*n # or this? or something else?
Scomputed = -1.0/log(2.0) * sum(multiply(n, log(n)))   # -sum p log2(p), via log2(z) = log(z)/log(2)
Sanalytic = 0.5/log(2.0) * log(2*pi*exp(1.0)*sigma**2) # 1/2 log2(2 pi e sigma^2)
print Scomputed, Sanalytic
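My current guess at a fix is below (just a sketch; the log2(delta)
correction is my own assumption, not something from the text):
normalize the counts by the total number of samples len(x), not by the
number of occupied bins, and, since the analytic formula is a
differential entropy, add log2(delta) to the discrete histogram
entropy before comparing.

# sketch of a possible fix (the log2(delta) term is an assumption)
counts = hist(x, bins)
counts = take(counts, nonzero(greater(counts, 0.0)))
p = counts/float(len(x))   # normalize by total samples, so sum(p) == 1
Sdiscrete = -1.0/log(2.0) * sum(multiply(p, log(p)))
# a histogram with bin width delta estimates the differential entropy
# as the discrete entropy plus log2(delta)
Scorrected = Sdiscrete + log(delta)/log(2.0)
print Scorrected, Sanalytic

If that is right, then with these parameters the uncorrected discrete
value should sit near 3.05 + log2(1/delta), roughly 13.0 bits, which
would be about the size of disagreement to expect.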
Thanks!
John Hunter