[Numpy-discussion] advanced indexing bug with huge arrays?

David Warde-Farley wardefar at iro.umontreal.ca
Tue Jan 24 11:19:21 EST 2012


On Tue, Jan 24, 2012 at 09:15:01AM +0000, Robert Kern wrote:
> On Tue, Jan 24, 2012 at 08:37, Sturla Molden <sturla at molden.no> wrote:
> > On 24.01.2012 09:21, Sturla Molden wrote:
> >
> >> randomkit.c handles C long correctly, I think. There are different codes
> >> for 32 and 64 bit C long, and buffer sizes are size_t.
> >
> > distributions.c takes C longs as parameters, e.g. for the binomial
> > distribution. mtrand.pyx handles this correctly, but it can give an
> > unexpected overflow error on 64-bit Windows:
> >
> >
> > In [1]: np.random.binomial(2**31, .5)
> > ---------------------------------------------------------------------------
> > OverflowError                             Traceback (most recent call last)
> > C:\Windows\system32\<ipython-input-1-000aa0626c42> in <module>()
> > ----> 1 np.random.binomial(2**31, .5)
> >
> > C:\Python27\lib\site-packages\numpy\random\mtrand.pyd in
> > mtrand.RandomState.binomial (numpy\random\mtrand\mtrand.c:13770)()
> >
> > OverflowError: Python int too large to convert to C long
> >
> >
> > On systems where C longs are 64 bit, this is likely not to produce an
> > error.
> >
> > This raises the question of whether randomkit.c and distributions.c
> > should also be changed to use npy_intp for consistency across all
> > platforms.
> 
> There are two different uses of long that you need to distinguish. One
> is for sizes, and one is for parameters and values. The sizes should
> definitely be upgraded to npy_intp. The latter shouldn't; these should
> remain as the default integer type of Python and numpy, a C long.

Hmm. Seeing as the width of a C long is inconsistent across platforms, does
this imply that the random number generator will produce different results on
different platforms? Or do the state dynamics prevent the values from ever
growing large enough in magnitude for that to be an issue?
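
For anyone wanting to check which case applies on their own machine, here is
a minimal sketch using only the standard ctypes module. It reports the width
of a C long and of a pointer-sized integer (which is what npy_intp
corresponds to), and tests whether a given parameter such as 2**31 would fit
in a C long. The helper name fits_in_c_long is made up for illustration and
is not part of numpy.

```python
import ctypes

# Width of a signed C long on this platform, in bits.
# Typically 32 on 64-bit Windows and 64 on most 64-bit Unix systems.
long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Width of a pointer-sized signed integer, in bits (the size npy_intp has).
intp_bits = ctypes.sizeof(ctypes.c_ssize_t) * 8


def fits_in_c_long(n):
    """Hypothetical helper: True if the Python int n fits in a signed C long."""
    return -(2 ** (long_bits - 1)) <= n <= 2 ** (long_bits - 1) - 1


print("C long bits:", long_bits)
print("pointer-sized int bits:", intp_bits)
print("2**31 fits in a C long:", fits_in_c_long(2 ** 31))
```

If the last line prints False, a call like np.random.binomial(2**31, .5)
would be expected to hit the OverflowError quoted above, since the parameter
cannot be converted to a C long.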

David


