[Numpy-discussion] Is there a pure numpy recipe for this?

RayS rays at blue-cove.com
Thu Mar 27 10:42:00 EDT 2014


I find this interesting, since I work with medical data sets of 100s 
of MB, and regularly run into memory allocation problems when doing a 
lot of Fourier analysis, waterfalls, etc. The per-process limit seems 
to be about 1.3GB on this 6GB quad-i7 with Win7. For live data 
collection routines I simply create zeros() of, say, 300MB and trim 
the array when saving to disk. memmaps are also limited to RAM, and 
take a looooong time to create (seconds). So, I've been investigating 
Pandas and segmentaxis - just a bit so far.
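
Roughly, the collection pattern I mean is something like the sketch 
below (the buffer size, dtype, and the stand-in acquire() generator 
are illustrative placeholders, not the real acquisition code):

import numpy as np

# Preallocate one big buffer up front, fill it block by block during
# acquisition, and trim the unused tail only when saving to disk.
N_MAX = 75000000                         # ~300 MB of float32
buf = np.zeros(N_MAX, dtype=np.float32)

def acquire(n_blocks=10, block_len=4096):
    """Stand-in for the live data source: yields blocks of samples."""
    for _ in range(n_blocks):
        yield np.random.randn(block_len).astype(np.float32)

filled = 0
for block in acquire():
    buf[filled:filled + block.size] = block
    filled += block.size

np.save("capture.npy", buf[:filled])     # trim the unused tail on save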

- Ray Schumacher


At 12:02 AM 3/27/2014, you wrote:
>Chris Barker - NOAA Federal wrote
> > note that  numpy arrays are not re-sizable, so np.append() and np.insert()
> > have to make a new array, and copy all the old data over. If you are
> > appending one at a time, this can be pretty darn slow.
> >
> > I wrote a "grow_array" class once, it was a wrapper around a numpy array
> > that pre-allocated extra data to make appending more efficient. It's kind
> > of half-baked code now, but let me know if you are interested.
>
>Hi Chris,
>
>Yes, it is a good point and I am aware of it. For some of these functions it
>would have been nice if I could have passed a preallocated, properly sliced
>array to the functions, which I could then reuse in each iteration step.
>
>It is indeed the memory allocation which appears to take more time than the
>actual calculations.
>
>Still it is much faster to create a few arrays than to loop through a
>thousand individual elements in pure Python.
>
>The grow_array class sounds interesting. I think that what I have for now is
>sufficient, but I will keep your offer in mind :)
>
>--Slaunger
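
(As an aside, a minimal grow-array wrapper in the spirit Chris 
describes above - preallocate spare capacity, copy only when it runs 
out - might look roughly like this; the class name, doubling policy, 
and API are my guesses, not his actual code:)

import numpy as np

class GrowArray(object):
    """Growable 1-D array: keeps spare capacity so append() is cheap."""

    def __init__(self, dtype=float, capacity=1024):
        self._data = np.empty(capacity, dtype=dtype)
        self._size = 0

    def append(self, value):
        if self._size == len(self._data):
            # Out of room: double the capacity and copy once, giving
            # amortized O(1) appends instead of a copy on every call.
            new = np.empty(2 * len(self._data), dtype=self._data.dtype)
            new[:self._size] = self._data[:self._size]
            self._data = new
        self._data[self._size] = value
        self._size += 1

    @property
    def values(self):
        # View of the filled portion only (no copy).
        return self._data[:self._size]

arr = GrowArray(dtype=np.float64)
for x in range(10000):
    arr.append(x * 0.5)
print(arr.values.shape)    # (10000,)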
