[Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt
RayS
rays at blue-cove.com
Sun Oct 26 14:40:52 EDT 2014
At 06:32 AM 10/26/2014, you wrote:
>On Sun, Oct 26, 2014 at 1:21 PM, Eelco Hoogendoorn
><hoogendoorn.eelco at gmail.com> wrote:
> > I'm not sure why the memory doubling is necessary. Isn't it possible to
> > preallocate the arrays and write to them?
>
>Not without reading the whole file first to know how many rows to preallocate
It seems to me that loadtxt()
http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
should accept an optional shape argument. I often know how many rows I have
(the number of data samples) from other metadata.
Then:
- if the file is smaller for some reason (you're not sure and pad
  your estimate) it could do one of
    - zero pad array
    - raise()
    - return truncated view
- if larger
    - raise()
    - return data read (this would act like fileObject.read( size ) )
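
A rough sketch of what I mean, written as a standalone helper rather than a change to loadtxt() itself. Everything here is an assumption for illustration: the name loadtxt_prealloc and the on_short / on_long options are hypothetical and not part of NumPy, and it only handles plain float columns with '#' comment lines.

```python
import numpy as np


def loadtxt_prealloc(path, nrows, ncols, delimiter=None,
                     on_short="truncate", on_long="stop"):
    """Hypothetical helper (not a NumPy API): fill one preallocated array.

    on_short: "pad" | "raise" | "truncate"  (file has fewer than nrows rows)
    on_long:  "raise" | "stop"              (file has more than nrows rows)
    """
    # Single allocation up front; unread rows stay zero.
    out = np.zeros((nrows, ncols), dtype=float)
    count = 0
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            if count == nrows:
                if on_long == "raise":
                    raise ValueError("file has more than %d rows" % nrows)
                break  # "stop": keep what was read, like fileObject.read(size)
            # Parse one row straight into the preallocated buffer.
            out[count] = [float(tok) for tok in line.split(delimiter)]
            count += 1
    if count < nrows:
        if on_short == "raise":
            raise ValueError("expected %d rows, found %d" % (nrows, count))
        if on_short == "truncate":
            return out[:count]  # a view on the buffer, no copy
        # "pad": fall through and return the zero-padded array
    return out
```

With something like this, loadtxt_prealloc("samples.txt", nrows=1000, ncols=4, on_short="pad") would never hold more than one full-size array plus one parsed line in memory. The file name and sizes in that call are made up.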
- Ray S