[Numpy-discussion] allocated memory cache for numpy

Nathaniel Smith njs at pobox.com
Mon Feb 17 18:56:31 EST 2014


On Mon, Feb 17, 2014 at 3:55 PM, Stefan Seefeld <stefan at seefeld.name> wrote:
> On 02/17/2014 03:42 PM, Nathaniel Smith wrote:
>> Another optimization we should consider that might help a lot in the
>> same situations where this would help: for code called from the
>> cpython eval loop, it's afaict possible to determine which inputs are
>> temporaries by checking their refcnt. In the second call to __add__ in
>> '(a + b) + c', the temporary will have refcnt 1, while the other
>> arrays will all have refcnt >1. In such cases (subject to various
>> sanity checks on shape, dtype, etc) we could elide temporaries by
>> reusing the input array for the output. The risk is that there may be
>> some code out there that calls these operations directly from C with
>> non-temp arrays that nonetheless have refcnt 1, but we should at least
>> investigate the feasibility. E.g. maybe we can do the optimization for
>> tp_add but not PyArray_Add.
>
> For element-wise operations such as the above, wouldn't it be even
> better to use loop fusion, by evaluating the entire compound expression
> per element, instead of each individual operation? That would require
> methods such as __add__ to return an operation object, rather than the
> result value. I believe a technique like that is used in the numexpr
> package (https://github.com/pydata/numexpr), which I saw announced here
> recently...
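
To make the temporary-elision idea in the quoted text concrete, here is a minimal pure-Python sketch of its effect. Everything in it (the `Vec` class, the `add` function and its `out` argument) is a hypothetical stand-in, not numpy code. The real optimization would decide in C, from the refcount, whether an input is a temporary; here the reuse is requested by hand so the saving is visible.

```python
class Vec:
    """Toy stand-in for an array (hypothetical); counts buffer allocations."""
    allocations = 0

    def __init__(self, data):
        Vec.allocations += 1
        self.data = list(data)


def add(x, y, out=None):
    """Element-wise x + y. If `out` is given, write into it instead of allocating."""
    if out is None:
        out = Vec([0] * len(x.data))
    for i, (xi, yi) in enumerate(zip(x.data, y.data)):
        out.data[i] = xi + yi
    return out


a, b, c = Vec([1, 2]), Vec([10, 20]), Vec([100, 200])

# Naive evaluation of (a + b) + c: one buffer for the temporary, one for the result.
Vec.allocations = 0
naive = add(add(a, b), c)
naive_allocs = Vec.allocations

# Elided evaluation: the temporary is referenced by nobody else, so its buffer
# can safely hold the final result.
Vec.allocations = 0
tmp = add(a, b)
elided = add(tmp, c, out=tmp)
elided_allocs = Vec.allocations
```

Both paths compute the same values, but the second allocates one buffer instead of two. Reusing a buffer is only safe when nothing else can observe the input, which is why the proposal hinges on reliably recognizing temporaries.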

Hi Stefan (long time no see!),

Sure, that would be an excellent thing, but adding a delayed
evaluation engine to numpy is a big piece of new code, and we'd want
to make it something you opt in to explicitly. (There are too many
weird potential issues, e.g. errors showing up far away from the
actual broken code because evaluation was delayed until that point.)
By contrast, the optimization suggested here is a tiny change we
could do now, and it would still be useful even in the hypothetical
future where we do have lazy evaluation, for anyone who doesn't use
it.
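
For readers who want to see the trade-off, here is a rough sketch of the operation-object approach, assuming a toy design of my own (the `Expr`, `Leaf`, `Add` classes and the `evaluate` function are hypothetical; this is not how numexpr or numpy work internally). `__add__` only builds a tree, and a single fused loop runs at the end. The last few lines show the drawback mentioned above: a length mismatch goes unnoticed where it is written and only raises when the expression is finally evaluated.

```python
class Expr:
    """Base class: `+` builds an operation object instead of computing anything."""

    def __add__(self, other):
        return Add(self, other)


class Leaf(Expr):
    """Wraps concrete data."""

    def __init__(self, data):
        self.data = list(data)

    def size(self):
        return len(self.data)

    def at(self, i):
        return self.data[i]


class Add(Expr):
    """Deferred element-wise addition; no checks and no arithmetic at construction."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def size(self):
        n, m = self.left.size(), self.right.size()
        if n != m:
            raise ValueError(f"length mismatch: {n} vs {m}")
        return n

    def at(self, i):
        return self.left.at(i) + self.right.at(i)


def evaluate(expr):
    """One fused loop over the whole expression, with no intermediate arrays."""
    return [expr.at(i) for i in range(expr.size())]


fused = evaluate(Leaf([1, 2]) + Leaf([10, 20]) + Leaf([100, 200]))

broken = Leaf([1, 2]) + Leaf([1, 2, 3])  # the bug is here, but nothing is raised yet
try:
    evaluate(broken)                     # the error only surfaces here
    delayed_error = None
except ValueError as exc:
    delayed_error = str(exc)
```

In a real program the call to `evaluate` could sit many frames away from the line that built `broken`, which is the debugging hazard that argues for making lazy evaluation explicitly opt-in.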

-n


