[Numpy-discussion] Pre-allocate array

Nikolaus Rath Nikolaus at rath.org
Thu Dec 27 11:44:12 EST 2012


Hello,

I have an array that I know will need to grow to X elements. However, I
will need to work with it before it's completely filled. I see two ways
of doing this:

bigarray = np.empty(X)
current_size = 0
for i in something:
    buf = produce_data(i)
    bigarray[current_size:current_size+len(buf)] = buf
    current_size += len(buf)
    # Do things with bigarray[:current_size]

This avoids allocating new buffers and copying data around, but I have
to manage the current array size separately. Alternatively, I could do:

bigarray = np.empty(0)
for i in something:
    buf = produce_data(i)
    bigarray.resize(len(bigarray)+len(buf))
    bigarray[-len(buf):] = buf
    # Do things with bigarray

This is much more elegant, but the resize() calls may have to copy data
around.
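
For concreteness, here is a self-contained version of both loops. It is
only a sketch: produce_data() is a made-up stand-in that returns i + 1
copies of i, and the two function names are hypothetical. The resize
variant passes refcheck=False, which skips numpy's reference check; I am
assuming nothing else holds a view of the array while it is resized.

```python
import numpy as np

def produce_data(i):
    # Hypothetical stand-in for the real producer: i + 1 values, all equal to i.
    return np.full(i + 1, float(i))

def fill_preallocated(chunks, total):
    # Approach 1: allocate everything up front and track the fill level by hand.
    bigarray = np.empty(total)
    current_size = 0
    for i in chunks:
        buf = produce_data(i)
        bigarray[current_size:current_size + len(buf)] = buf
        current_size += len(buf)
        # bigarray[:current_size] is the valid part at this point
    return bigarray[:current_size]

def fill_with_resize(chunks):
    # Approach 2: grow the array in place on every iteration.
    bigarray = np.empty(0)
    for i in chunks:
        buf = produce_data(i)
        old_size = len(bigarray)
        # Assumption: no other references or views exist, so skipping
        # the reference check is safe here.
        bigarray.resize(old_size + len(buf), refcheck=False)
        bigarray[old_size:] = buf
    return bigarray

chunks = range(4)
X = sum(i + 1 for i in chunks)  # total number of elements produced
a = fill_preallocated(chunks, X)
b = fill_with_resize(chunks)
```

Both functions end up with the same contents; they differ only in when
the memory is allocated.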

Is there any way to tell numpy to allocate all the required memory while
using only a part of it for the array? Something like:

bigarray = np.empty(50, will_grow_to=X)
bigarray.resize(X)  # Guaranteed to work without copying stuff around
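
The closest I can get by hand is to wrap the first approach in a small
helper that hides the bookkeeping. This is only a sketch under my own
assumptions: GrowableArray, extend() and data are names I made up, not
part of numpy, and the capacity is fixed at construction time.

```python
import numpy as np

class GrowableArray:
    """Hypothetical helper: fixed-capacity buffer with a growing valid prefix."""

    def __init__(self, capacity, dtype=float):
        # All memory is allocated once, up front.
        self._buf = np.empty(capacity, dtype=dtype)
        self._size = 0

    def extend(self, values):
        # Copy the new values into the unused tail of the buffer.
        values = np.asarray(values)
        end = self._size + len(values)
        if end > len(self._buf):
            raise ValueError("capacity exceeded")
        self._buf[self._size:end] = values
        self._size = end

    @property
    def data(self):
        # Slicing returns a view of the filled part, not a copy.
        return self._buf[:self._size]

arr = GrowableArray(capacity=6)
arr.extend([1.0, 2.0])
first = arr.data          # view of the first two elements
arr.extend([3.0, 4.0, 5.0])
second = arr.data         # view of all five elements, same memory
```

Because data is a slice of the one preallocated buffer, growing never
moves the existing elements, but the size still lives outside the array
itself, which is what I was hoping to avoid.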


Thanks,
-Nikolaus




