[Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array

Frédéric Bastien nouiz at nouiz.org
Tue Jul 16 14:53:30 EDT 2013


Hi,


On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma <arinkverma at gmail.com> wrote:
>
>> > Each ndarray does two mallocs, for the obj and buffer. These could be
>> > combined into 1 - just allocate the total size and do some pointer
>> > arithmetic, then set OWNDATA to false.
>> So those two mallocs were already mentioned in the project introduction.
>> I got that wrong.
>>
>
> On further thought/reading the code, it appears to be more complicated
> than that, actually.
>
> It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: 1
> for the array object itself, and one for the shapes + strides. And, one
> call to regular-old malloc: for the data buffer.
>
> (Mysteriously, shapes + strides together have 2*ndim elements, but to hold
> them we allocate a memory region sized to hold 3*ndim elements. I'm not
> sure why.)
>
> And contrary to what I said earlier, this is about as optimized as it can
> be without breaking ABI. We need at least 2 calls to malloc/PyMem_Malloc,
> because the shapes+strides may need to be resized without affecting the
> much larger data area. But it's tempting to allocate the array object and
> the data buffer in a single memory region, like I suggested earlier. And
> this would ALMOST work. But, it turns out there is code out there which
> assumes (whether wisely or not) that you can swap around which data buffer
> a given PyArrayObject refers to (hi Theano!). And supporting this means
> that data buffers and PyArrayObjects need to be in separate memory regions.
>

Are you sure that Theano "swaps" the data ptr of an ndarray? When we play
with that, it is on a newly created ndarray. So a node in our graph won't
change the input ndarray structure. It will create a new ndarray structure
with new shape/strides, pass it a data ptr, and, to my knowledge, flag the
new ndarray's own_data correctly.
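
To make the pattern concrete, here is a rough Python-level analogue. It is
only a sketch: Theano does this through the C API, and the names `base` and
`view` are made up for illustration. It builds a new ndarray structure with
its own shape/strides over an existing data buffer, leaves the input array
untouched, and shows that the new array does not claim to own the data.

```python
import numpy as np

# The "input" array: it allocated its own data buffer, so OWNDATA is set.
base = np.arange(12, dtype=np.float64)

# A new ndarray structure with its own shape/strides that points at the
# same data buffer. `base` and `view` are illustrative names only.
view = np.ndarray(
    shape=(3, 4),
    dtype=base.dtype,
    buffer=base,
    strides=(4 * base.itemsize, base.itemsize),
)

# Both structures refer to the same memory address...
base_ptr = base.__array_interface__["data"][0]
view_ptr = view.__array_interface__["data"][0]
print(view_ptr == base_ptr)

# ...but only the original array is flagged as owning the data.
print(base.flags.owndata, view.flags.owndata)

# The input array's shape and strides are unchanged.
print(base.shape, base.strides)
```

The input structure is never modified here; only a second structure is
created over the same buffer, which is the case I describe above.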

If Theano poses a problem here, I suggest that I fix Theano. But
currently I don't see the problem. So if this makes you change your mind
about this optimization, tell me. I don't want Theano to prevent
optimizations in NumPy.

Fred