[Numpy-discussion] array of random numbers fails to construct

DAVID SAROFF (RIT Student) dps7802 at rit.edu
Sun Dec 6 16:55:09 EST 2015


Matthew,

That looks right. I conclude that .astype(np.uint8) is applied only after
the full array has been constructed, not while it is being filled. The
random array is just a test case. In the production analysis of radio
telescope data, the data comes in as shown here, and there is no problem
with 10 GB files:

linearInputData = np.fromfile(dataFile, dtype=np.uint8, count=-1)
spectrumArray = linearInputData.reshape(nSpectra, sizeSpectrum)
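
For the test case, the int64 intermediate can be avoided. The following is
only a sketch, not something tested at the sizes discussed in this thread.
It assumes a NumPy version whose randint accepts a dtype argument (1.11 or
later). The small shape and the names n_spectra, size_spectrum and
rows_per_chunk are stand-ins chosen for illustration. Note that the upper
bound of randint is exclusive, so 256 is needed to cover 0..255.

```python
import numpy as np

# Small stand-in shape so the sketch runs anywhere; the shape in this
# thread is (2**21, 2**12).
n_spectra, size_spectrum = 2**4, 2**6

# Option 1: ask randint for uint8 directly (needs the dtype argument,
# available from NumPy 1.11), so no int64 array is requested.
direct = np.random.randint(0, 256, size=(n_spectra, size_spectrum),
                           dtype=np.uint8)

# Option 2: preallocate the uint8 result and fill it block by block, so
# only one small default-integer block exists at any time.
chunked = np.empty((n_spectra, size_spectrum), dtype=np.uint8)
rows_per_chunk = 4  # hypothetical block size, tune to available RAM
for start in range(0, n_spectra, rows_per_chunk):
    stop = min(start + rows_per_chunk, n_spectra)
    block = np.random.randint(0, 256, size=(stop - start, size_spectrum))
    chunked[start:stop] = block  # values are cast to uint8 on assignment
```

Option 2 keeps working on older NumPy versions, at the cost of a Python
loop over blocks.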


On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett <matthew.brett at gmail.com>
wrote:

> Hi,
>
> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student)
> <dps7802 at rit.edu> wrote:
> > This works. A big array of eight bit random numbers is constructed:
> >
> > import numpy as np
> >
> > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8)
> >
> >
> >
> > This fails. It eats up all 64 GB of RAM:
> >
> > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8)
> >
> >
> > The difference is a factor of two, 2**21 rather than 2**20, for the
> > extent of the first axis.
>
> I think what's happening is that this:
>
> np.random.randint(0,255, (2**21,2**12))
>
> creates 2**33 random integers, which (on 64-bit) will be of dtype
> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 3) = 2 ** 36 bytes
> = 64 GiB.
>
> Cheers,
>
> Matthew

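The size of the intermediate array can be checked without allocating
anything. This is a minimal sketch that assumes the default integer dtype
is int64 (8 bytes per element), as on a typical 64-bit Linux build. The
variable names are illustrative only.

```python
import numpy as np

shape = (2**21, 2**12)
n_elements = shape[0] * shape[1]  # 2**33 elements

# Bytes needed for the intermediate int64 array and for the final uint8 one.
bytes_int64 = n_elements * np.dtype(np.int64).itemsize
bytes_uint8 = n_elements * np.dtype(np.uint8).itemsize

gib = 2**30
print(bytes_int64 / gib)  # 64.0
print(bytes_uint8 / gib)  # 8.0
```

So the intermediate alone needs as much memory as the whole machine has,
while the uint8 result would need only an eighth of that.
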


-- 
David P. Saroff
Rochester Institute of Technology
54 Lomb Memorial Dr, Rochester, NY 14623
david.saroff at mail.rit.edu | (434) 227-6242