Strange array.array performance

Maxim Khitrov mkhitrov at gmail.com
Thu Feb 19 20:39:08 EST 2009


On Thu, Feb 19, 2009 at 7:01 PM, Scott David Daniels
<Scott.Daniels at acm.org> wrote:
> Maxim Khitrov wrote:
>>
>> On Thu, Feb 19, 2009 at 2:35 PM, Robert Kern <robert.kern at gmail.com>
>> wrote:
>> I have, but numpy is not currently available for python 2.6, which is
>> what I need for some other features, and I'm trying to keep the
>> dependencies down in any case....
>> The only feature that I'm missing with array.array is the ability to
>> quickly pre-allocate large chunks of memory. To do that right now I'm
>> using array('d', (0,) * size). It would be nice if array accepted an
>> int as the second argument indicating how much memory to allocate and
>> initialize to 0.
>
> In the meantime, you could write a function (to ease the shift to numpy)
> and reduce your interface problem to a very small set of lines:
>    def zeroes_d(n):
>        '''Allocate a n-element vector of 'd' elements'''
>        vector = array.array('d') # fromstring has no performance bug
>        vector.fromstring(n * 8 * '\0')
>        return vector
> Once numpy is up and running on 2.6, this should be easy to convert
> to a call to zeroes.

If I do decide to transition at any point, it will require much
greater modification. For example, to speed-up retrieval of data from
Matlab, which is returned to me as an mxArray structure, I allocate an
array.array for it and then use ctypes.memmove to copy data directly
into the array's buffer (address obtained through buffer_info()).

Same thing for sending data, rather than allocate a separate mxArray,
copy data, and then send, I create an empty mxArray and set its data
pointer to the array's buffer. I'm sure that there are equivalents in
numpy, but the point is that the transition, which currently would not
benefit my code in any significant way, will not be a quick change.

On the other hand, I have to thank you for the fromstring example. For
some reason, it never occurred to me that creating a string of nulls
would be much faster than a tuple of zeros. In fact, you can pass the
string to the constructor and it calls fromstring automatically. For
an array of 1 million elements, using a string to initialize is 18x
faster. :)

- Max



More information about the Python-list mailing list