[Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

Chris Barker chris.barker at noaa.gov
Tue Oct 28 17:41:44 EDT 2014


On Tue, Oct 28, 2014 at 1:24 PM, Nathaniel Smith <njs at pobox.com> wrote:

> > Memory efficiency -- something like my growable array is not all that
> > hard to implement and pretty darn quick -- you just do the usual trick:
> > over allocate a bit of memory, and when it gets full re-allocate a larger
> > chunk.
>
> Can't you just do this with regular numpy using .resize()? What does your
> special class add? (Just curious.)
>
it used resize under the hood -- it just adds the bookkeeping for the over
allocation, etc, and lets you access the data as though it wasn't
over-allocated

like I said, not that difficult.
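
For the curious, here is a minimal sketch of that bookkeeping. It is not the
code from accumulator.py: the class name GrowableArray, the method to_array,
the default capacity and the growth factor are all made up for illustration.
It assumes a 1-D array and leans on ndarray.resize() to re-allocate in place.

```python
import numpy as np


class GrowableArray:
    """Hypothetical 1-D accumulator; an illustration, not accumulator.py."""

    def __init__(self, dtype=np.float64, capacity=16, growth=1.5):
        self._buffer = np.empty(capacity, dtype=dtype)  # over-allocated storage
        self._length = 0                                # number of valid items
        self._growth = growth                           # assumed growth factor

    def __len__(self):
        return self._length

    def append(self, value):
        if self._length == self._buffer.shape[0]:
            # Buffer is full: re-allocate a larger chunk in place.
            new_capacity = int(self._buffer.shape[0] * self._growth) + 1
            # refcheck=False skips numpy's reference check; that is only
            # safe here because no views of the buffer are ever handed out.
            self._buffer.resize(new_capacity, refcheck=False)
        self._buffer[self._length] = value
        self._length += 1

    def __getitem__(self, index):
        # Index only the valid part, so the spare capacity stays hidden.
        result = self._buffer[:self._length][index]
        # Copy slices so callers never hold a view into the buffer.
        return result.copy() if isinstance(result, np.ndarray) else result

    def to_array(self):
        # Return a copy trimmed to the valid length.
        return self._buffer[:self._length].copy()


acc = GrowableArray(dtype=np.int64, capacity=2)
for i in range(100):
    acc.append(i)
data = acc.to_array()
```

Handing out copies rather than views is a deliberate choice in this sketch:
a view into the buffer would be left pointing at freed memory after the next
resize, which is exactly what the reference check normally guards against.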

I haven't touched it for a while, but if you are curious I just threw it up
on GitHub:

https://github.com/PythonCHB/NumpyExtras

you want accumulator.py -- there is also a cython version that I didn't
quite finish...in theory, it should be a bit faster in some cases by
reducing the need to round-trip between numpy and python data types...

in practice, I don't think I got it to a point where I could do real-world
profiling.

> It's fun to sit around and brainstorm clever implementation strategies, but
> Wes already went ahead and implemented all the tricky bits, and optimized
> them too. No point in reinventing the wheel.
>
> (Plus as I pointed out upthread, it's entirely likely that this "2x
> overhead" is based on a misunderstanding/oversimplification of how virtual
> memory works, and the actual practical overhead is much lower.)
>
good point.

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov