Hi,

Does anybody know how Zipf's law or how Zipfian distributions work, and how they relate to NumPy's `np.random.zipf`? I'm afraid I can't make head or tail of these results:

    In [106]: np.random.zipf(2, size=(10))
    Out[106]: array([ 1, 1, 1, 29, 1, 1, 1, 1, 1, 2])

(8x1, 1x2, 1x29)

    In [107]: np.random.zipf(2, size=(10))
    Out[107]: array([75, 1, 1, 3, 1, 1, 1, 1, 1, 4])

(7x1, 1x3, 1x4, 1x75)

    In [108]: np.random.zipf(2, size=(10))
    Out[108]: array([ 6, 17, 2, 1, 1, 2, 1, 20, 1, 2])

(4x1, 3x2, 1x6, 1x17, 1x20)

Thanks!
Stéfan
On Thu, Jul 24, 2008 at 10:15, Stéfan van der Walt <stefan@sun.ac.za> wrote:
> Hi,
>
> Does anybody know how Zipf's law or how Zipfian distributions work, and how they relate to NumPy's `np.random.zipf`? I'm afraid I can't make head or tail of these results:
>
>     In [106]: np.random.zipf(2, size=(10))
>     Out[106]: array([ 1, 1, 1, 29, 1, 1, 1, 1, 1, 2])
>
> (8x1, 1x2, 1x29)
>
>     In [107]: np.random.zipf(2, size=(10))
>     Out[107]: array([75, 1, 1, 3, 1, 1, 1, 1, 1, 4])
>
> (7x1, 1x3, 1x4, 1x75)
>
>     In [108]: np.random.zipf(2, size=(10))
>     Out[108]: array([ 6, 17, 2, 1, 1, 2, 1, 20, 1, 2])
>
> (4x1, 3x2, 1x6, 1x17, 1x20)
With only 10 samples apiece, it's hard to evaluate what's going on. zipf(s) samples from a Zipfian distribution with N=inf, using the terminology of the Wikipedia article:

http://en.wikipedia.org/wiki/Zipf%27s_law

It's a long-tailed distribution, so you would expect to see one or two big numbers with s=2. For example, here is the survival function for the distribution (sf(x) = 1-cdf(x)).

    In [23]: from numpy import *

    In [24]: def harmonic_number(s, k):
       ....:     x = 1.0 / arange(1, k+1) ** s
       ....:     return x.sum()
       ....:

    In [25]: from scipy.special import zeta

    In [26]: def sf(x, s):
       ....:     return 1.0 - harmonic_number(s, int(x)) / zeta(s, 1)
       ....:

    In [27]: sf(10, 2.0)
    Out[27]: 0.057854194645034718

    In [28]: sf(20, 2.0)
    Out[28]: 0.029649105042033996

    In [29]: sf(60, 2.0)
    Out[29]: 0.010048153098031198

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
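To see the survival function agree with actual draws, one could compare it against a much larger sample. The following is only a sketch, not part of the original exchange: it assumes a NumPy recent enough to provide `np.random.default_rng`, fixes the exponent at 2 so that zeta(2) can be written as pi^2/6 without SciPy, and uses an arbitrary seed and sample size.

```python
import math

import numpy as np

A = 2.0                      # exponent, called s in the message above
ZETA_2 = math.pi ** 2 / 6    # closed form of zeta(2); other exponents would need scipy.special.zeta


def sf(x):
    """Survival function P(X > x) of the N=inf Zipf (zeta) distribution with exponent 2."""
    k = np.arange(1, int(x) + 1)
    return 1.0 - (1.0 / k ** A).sum() / ZETA_2


# Seed and sample size are arbitrary choices made for repeatability.
rng = np.random.default_rng(0)
samples = rng.zipf(A, size=200_000)

# Compare the analytic tail probability with the observed fraction of draws above x.
for x in (1, 10, 20, 60):
    empirical = (samples > x).mean()
    print(f"x={x:3d}  analytic sf={sf(x):.4f}  empirical={empirical:.4f}")
```

With s=2, the probability of drawing exactly 1 is 1/zeta(2), about 0.61, which is why most entries in the 10-element arrays are 1. Since sf(10) is about 0.058, the chance that at least one of 10 draws exceeds 10 is roughly 45%, so an occasional 29 or 75 is unsurprising.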