[Numpy-discussion] numpy.random.logseries - incorrect convergence for k=1, k=2
joep
josef.pktd at gmail.com
Sat Sep 27 14:12:14 EDT 2008
random numbers generated by numpy.random.logseries do not converge to
theoretical distribution:
for probability paramater pr = 0.8, the random number generator
converges to a
frequency for k=1 at 39.8 %, while the theoretical probability mass is
49.71
k=2 is oversampled, other k's look ok
check frequency of k=1 and k=2 at N = 1000000
0.398406 0.296465
pmf at k = 1 and k=2 with formula
[ 0.4971 0.1988]
for probability paramater pr = 0.3, the results are not as bad, but
still off:
frequency for k=1 at 82.6 %, while the theoretical probability mass is
84.11
check frequency of k=1 and k=2 at N = 1000000
0.826006 0.141244
pmf at k = 1 and k=2 with formula
[ 0.8411 0.1262]
below is a quick script for checking this
Josef
{{{
import numpy as np
from scipy import stats
pr = 0.8
np.set_printoptions(precision=2, suppress=True)
# calculation for N=1million takes some time
for N in [1000, 10000, 10000, 1000000]:
rvsn=np.random.logseries(pr,size=N)
fr=stats.itemfreq(rvsn)
pmfs=stats.logser.pmf(fr[:,0],pr)*100
print 'log series sample frequency and pmf (in %) with N = ', N
print np.column_stack((fr[:,0],fr[:,1]*100.0/N,pmfs))
np.set_printoptions(precision=4, suppress=True)
print 'check frequency of k=1 and k=2 at N = ', N
print np.sum(rvsn==1)/float(N),
print np.sum(rvsn==2)/float(N)
k = np.array([1,2])
print 'pmf at k = 1 and k=2 with formula'
print -pr**k * 1.0 / k / np.log(1-pr)
}}}
More information about the NumPy-Discussion
mailing list