efficiently create and fill array.array from C code?

Mon Jun 14 15:47:33 EDT 2010

Thomas Jollans <thomas at jollans.com> writes:

> On 06/14/2010 01:18 PM, Hrvoje Niksic wrote:
>> Thomas Jollans <thomas at jollans.com> writes:
>> 
>>> 1. allocate a buffer of a certain size
>>> 2. fill it
>>> 3. return it as an array.
>> 
>> The fastest and most robust approach (that I'm aware of) is to use the
>> array.array('typecode', [0]) * size idiom to efficiently preallocate the
>> array, and then to get hold of the pointer pointing into array data
>> using the buffer interface.
>
> Ah, create a single-element array, and multiply that. That's not a bad
> approach, the overhead is probably equivalent to what I have now:
> currently, I create an uninitialized(!) bytes of the correct size, fill
> it myself, and initialize an array from that.  Both approaches have the
> overhead of creating one extra Python object (bytes/single-element
> array) and either copying one element over and over, or memcpy'ing the
> whole buffer.

If I understand your approach correctly, it requires both the C buffer
and the full-size array.array to be present in memory at the same time,
so that you can memcpy the data from one to the other.  Multiplying the
single-element array does needlessly copy the initial element over and
over (doing so in reasonably efficient C), but has the advantage that it
allows the larger array to be overwritten in place.
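
For illustration, here is a minimal sketch of the preallocate-then-overwrite
idea, written in Python rather than C so it can be run directly.  The
memoryview stands in for what C code would get from the buffer interface
(e.g. via PyObject_GetBuffer); the helper names preallocate and
fill_in_place are made up for this example, not part of the array module.

```python
from array import array

def preallocate(typecode, size):
    """Return a zero-filled array of `size` items via the multiply idiom."""
    # Hypothetical helper: one single-element array is created, then repeated.
    return array(typecode, [0]) * size

def fill_in_place(arr, values):
    """Overwrite `arr` through its buffer; no second full-size copy is made."""
    view = memoryview(arr)  # exposes the array's own storage
    try:
        for i, value in enumerate(values):
            view[i] = value
    finally:
        # Release the export so the array can be resized again afterwards.
        view.release()

squares = preallocate('d', 5)
fill_in_place(squares, (float(i * i) for i in range(5)))
```

The only full-size allocation here is the array itself; the values are
written straight into its storage rather than memcpy'd from a separate
buffer.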

NumPy allows arrays to be created out of uninitialized memory, which
avoids that initial fill, at the cost of depending on NumPy, of
course.