[Numpy-discussion] advanced indexing bug with huge arrays?

Christoph Gohlke cgohlke at uci.edu
Mon Jan 23 16:08:03 EST 2012



On 1/23/2012 12:33 PM, David Warde-Farley wrote:
> On Mon, Jan 23, 2012 at 08:38:44PM +0100, Robin wrote:
>> On Mon, Jan 23, 2012 at 7:55 PM, David Warde-Farley
>> <wardefar at iro.umontreal.ca>  wrote:
>>> I've reproduced this (rather serious) bug myself and confirmed that it exists
>>> in master, and as far back as 1.4.1.
>>>
>>> I'd really appreciate if someone could reproduce and confirm on another
>>> machine, as so far all my testing has been on our single high-memory machine.
>>
>> I see the same behaviour on a Windows machine with numpy 1.6.1. But I
>> don't think it is an indexing problem - rather something with the
>> random number creation. a itself is already zeros for high indexes.
>> 
>> In [8]: b[1000000:1000010]
>> Out[8]:
>> array([3429029, 1251819, 4292918, 2249483,  757620, 3977130, 3455449,
>>         2005054, 2565207, 3114930])
>>
>> In [9]: a[b[1000000:1000010]]
>> Out[9]:
>> array([[0, 0, 0, ..., 0, 0, 0],
>>         [0, 0, 0, ..., 0, 0, 0],
>>         [0, 0, 0, ..., 0, 0, 0],
>>         ...,
>>         [0, 0, 0, ..., 0, 0, 0],
>>         [0, 0, 0, ..., 0, 0, 0],
>>         [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
>>
>> In [41]: a[581350:,0].sum()
>> Out[41]: 0
>
> Hmm, this seems like a separate bug to mine. In mine, 'a' is indeed being
> filled in -- the problem arises with c alone.
>
> So, another Windows-specific bug to add to the pile, perhaps? :(
>
> David


Maybe this explains the win-amd64 behavior: there are a couple of places
in mtrand where array indices and sizes are declared as C long instead of
npy_intp. On win-amd64 a C long is only 32 bits wide, while npy_intp is
64 bits. One example is the randint function:

<https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L863>
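
To illustrate the size mismatch, here is a minimal sketch that uses only
ctypes and does not touch mtrand itself. It assumes ctypes.c_ssize_t is a
reasonable stand-in for npy_intp (both are pointer-sized signed integers),
and the helper as_int32 and the element count n_items are hypothetical
names made up for this example. It prints the two widths on the current
platform and shows what happens to a large count when it is squeezed into
a 32-bit signed integer, which is what a C long is on win-amd64:

```python
import ctypes

# Width in bytes of a C long and of a pointer-sized signed integer.
# c_ssize_t is used here as a stand-in for npy_intp (an assumption).
long_bytes = ctypes.sizeof(ctypes.c_long)
intp_bytes = ctypes.sizeof(ctypes.c_ssize_t)


def as_int32(n):
    """Return n as it would read back from a 32-bit signed integer.

    ctypes does no overflow checking, so the value silently wraps.
    """
    return ctypes.c_int32(n).value


# Hypothetical element count, larger than 2**31 - 1.
n_items = 6 * 10**9

print("sizeof(long)    =", long_bytes)
print("sizeof(ssize_t) =", intp_bytes)
print("requested count =", n_items)
print("count as int32  =", as_int32(n_items))  # 1705032704
```

On 64-bit Linux both widths should print as 8, so the wrap-around would not
occur there; on win-amd64 the first should print as 4 and the second as 8.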

Christoph


