[Numpy-discussion] A numpy accumulator...

Anne Archibald peridot.faceted at gmail.com
Mon Oct 5 17:34:40 EDT 2009


2009/10/5 Christopher Barker <Chris.Barker at noaa.gov>:
> Francesc Alted wrote:
>> A Saturday 03 October 2009 10:06:12 Christopher Barker escrigué:
>>> This idea was inspired by a discussion at the SciPy conference, in which
>>> we spent a LOT of time during the numpy tutorial talking about how to
>>> accumulate values in an array when you don't know how big the array
>>> needs to be when you start.
>
>>> What I have in mind is very simple. It would be:
>>>    - Only 1-d
>>>    - Support append() and extend() methods
>>>    - support indexing and slicing
>>>    - Support any valid numpy dtype
>>>      - which could even get you pseudo n-d arrays...
>>>    - maybe it would act like an array in other ways, I'm not so sure.
>>>      - ufuncs, etc.
>
>> That's interesting.  I'd normally use the `resize()` method for what you want,
>> but indeed your approach is way more easy-to-use.
>
> Of course, this is using resize() under the hood, but giving it an
> easier interface -- and, more importantly, adding the pre-allocation
> for you, and the code to deal with that. I suppose I should benchmark
> it, but I think calling resize() with every append would be a lot
> slower (though maybe not -- might the compiler/OS be pre-allocating
> some extra memory anyway?)

I looked into this at some point, and under Linux, malloc doesn't
allocate substantial extra memory until the request is big enough that
it's allocating complete memory pages, at which point you get
everything up to the end of the page. In that regime it's possible
that adding more memory onto the end of the malloc'ed region (and
maybe even moving the array around in memory) becomes really cheap,
since it's just requesting more pages from the OS. Also, a friend
who's a bare-metal programming wizard pointed out to me that modern
malloc implementations really hate realloc(), since it tends to put
memory blocks in arenas intended for different sizes. I think that's
only really an issue for shrinking blocks, though, since they probably
just allocate a new block when growing (unless they're in the
pages-at-a-time regime).

In short, I think it's better to have a Python-list-like growing
scheme. In fact it may matter more for arrays than for Python lists:
growing a list only moves pointers to the actual Python objects, which
are a small fraction of the data volume, whereas growing an array
moves the data itself.
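
For concreteness, here's a rough sketch of the kind of thing I mean
(untested, and the class and attribute names are just placeholders,
nothing from numpy itself): keep a private over-allocated buffer,
track the logical length separately, and double the capacity whenever
an append runs off the end, so appends are amortized O(1):

    import numpy as np

    class Accumulator:
        """A 1-d growable array that over-allocates like a Python list."""

        def __init__(self, dtype=float):
            self._data = np.empty(16, dtype=dtype)  # spare capacity up front
            self._size = 0                          # number of valid items

        def append(self, item):
            if self._size == self._data.shape[0]:
                # Double the capacity so the copying cost is amortized
                # over many appends; refcheck=False because the buffer
                # is private to this instance.
                self._data.resize(2 * self._size, refcheck=False)
            self._data[self._size] = item
            self._size += 1

        def __len__(self):
            return self._size

        def __getitem__(self, index):
            # Index only the valid region, and hand back a copy so the
            # result survives later resizes of the private buffer.
            result = self._data[:self._size][index]
            return result.copy() if isinstance(result, np.ndarray) else result

Usage would look something like:

    acc = Accumulator()
    for x in range(1000):
        acc.append(x * 0.5)
    print(acc[100:103])   # -> [ 50.   50.5  51. ]

extend() would reuse the same growth step; I've left it out, along
with error checking, to keep the sketch short.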

> I should profile this -- if you can call resize() with every new item,
> and it's not too slow, then it may not be worth writing this class at
> all (or I could make it simpler, maybe even an ndarray subclass instead).

Keep in mind the need for sensible handling of slices, since the
underlying array will probably move on every resize. I think there's a
need for this code.
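
To make the hazard concrete: numpy's in-place resize() already refuses
to move an array out from under a live view, and refcheck=False just
trades the error for a dangling view. Roughly (an illustrative
snippet, not from any real code):

    import numpy as np

    a = np.zeros(10)
    v = a[2:5]        # a view into a's buffer
    a.resize(20)      # raises ValueError: refcheck sees the live view

    # With refcheck=False the resize "succeeds", but v may now point
    # at freed or relocated memory:
    # a.resize(20, refcheck=False)

So an accumulator class probably wants to hand out copies rather than
views, as the sketch above does in __getitem__.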

>> If you are looking for performance improvements, I'd have a look at the
>> `PyArray_Resize()` function in 'core/src/multiarray/shape.c' (trunk).  It
>> seems to me that the zero-initialization of added memory can be skipped,
>> allowing for more performance for the `resize()` method (most specially for
>> large size increments).
>
> I suppose so, but I doubt that's causing any of my performance issues.
> Another thing to profile.

Probably worth profiling, yes -- I wouldn't worry much about the time
spent writing the zeros themselves, but zeroing does mean touching all
of the newly allocated memory, which can't be good for the cache.
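
If you want a quick sense of what the zero-fill costs on its own,
comparing np.zeros() against np.empty() for a large block is a
reasonable proxy (a rough sketch; the numbers will vary by platform,
and on systems where the allocator hands back lazily-zeroed pages the
gap only shows up once the memory is actually touched):

    import timeit

    setup = "import numpy as np"
    n = 100
    # np.empty only allocates; np.zeros also has to deliver zeroed memory.
    t_empty = timeit.timeit("np.empty(10**7)", setup=setup, number=n)
    t_zeros = timeit.timeit("np.zeros(10**7)", setup=setup, number=n)
    print("empty: %.4f s   zeros: %.4f s" % (t_empty, t_zeros))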

Anne


