[Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array

Nathaniel Smith njs at pobox.com
Tue Jul 16 07:10:30 EDT 2013


On 16 Jul 2013 11:35, "Arink Verma" <arinkverma at gmail.com> wrote:
>
> Hi,
>
> I am working on performance parity between numpy scalars/small arrays
> and Python arrays as a GSoC project mentored by Charles.
>
> Currently I am looking at PyArray_Return, which allocates separate
> memory just for the scalar return. Unlike Python, which allocates
> memory once when returning the result of a scalar operation, numpy
> calls malloc twice: once for the array object itself, and a second
> time for the array data.
>
> These memory allocations happen in PyArray_NewFromDescr and
> PyArray_Scalar. Stashing both within a single allocation would be more
> efficient. In PyArray_Scalar, a new struct (PyLongScalarObject) needs
> to be allocated in the case of scalar arrays. Instead, can we somehow
> convert/cast the PyArrayObject to a PyLongScalarObject?

I think there are more than 2 mallocs you're talking about here?

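Each ndarray does two mallocs: one for the object and one for the data
buffer. These could be combined into one: allocate the total size in a
single call, use pointer arithmetic to find the buffer inside that block,
then set OWNDATA to false so the buffer is never freed separately.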
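
To make the single-allocation idea concrete, here is a rough sketch in plain C. It is only an illustration under stated assumptions: toy_array, toy_array_new and toy_array_free are hypothetical names invented for this example, not NumPy's PyArrayObject or its real allocation path, and the owndata field merely stands in for the OWNDATA flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an array header; NOT NumPy's PyArrayObject. */
typedef struct {
    char  *data;     /* points into the same block, just past the header */
    size_t nbytes;   /* size of the element buffer in bytes */
    int    owndata;  /* 0: the buffer must never be freed on its own */
} toy_array;

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* One malloc for header + buffer instead of two separate calls. */
static toy_array *toy_array_new(size_t nbytes)
{
    /* Pad the header so the buffer keeps malloc's alignment guarantee. */
    size_t offset = round_up(sizeof(toy_array), _Alignof(max_align_t));
    toy_array *arr;

    if (nbytes > SIZE_MAX - offset) {
        return NULL;                    /* total size would overflow */
    }
    arr = malloc(offset + nbytes);
    if (arr == NULL) {
        return NULL;
    }
    arr->data = (char *)arr + offset;   /* the pointer arithmetic step */
    arr->nbytes = nbytes;
    arr->owndata = 0;                   /* buffer is part of the object block */
    memset(arr->data, 0, nbytes);
    return arr;
}

/* A single free releases both the header and the buffer. */
static void toy_array_free(toy_array *arr)
{
    free(arr);
}
```

The padding to max_align_t is a design choice of this sketch: it keeps the buffer as aligned as a separately malloc'ed one would be, at the cost of a few wasted bytes per object.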

Converting array to scalar does more allocations. I doubt there's a way to
avoid these, but can't say for sure (on my phone now). In any case the idea
of the project is to make scalars obsolete by making arrays competitive,
right? So no need to go optimizing the competition ;-). (And more
seriously, this slowdown *only* exists because of the array/scalar split,
so ignoring it is fair.)

In the bigger picture, these are pretty tiny optimizations, aren't they? In
the quick profiling I did a while ago, it looked like there was a lot of
much bigger low-hanging fruit, and fiddling around with one malloc versus
two isn't going to do much if we're still wasting an order of magnitude
more time in inefficient loop selection and unnecessary writes to the FP
control word?

-n