[Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes

Nathaniel Smith njs at pobox.com
Fri Jun 6 03:47:50 CEST 2014


On 5 Jun 2014 23:58, "Terry Reedy" <tjreedy at udel.edu> wrote:
>
> On 6/5/2014 4:51 PM, Nathaniel Smith wrote:
>
>> In fact, AFAICT it's 100% correct for libraries being called by
>> regular python code (which is why I'm able to quote benchmarks at you
>> :-)). The bytecode eval loop always holds a reference to all operands,
>> and then immediately DECREFs them after the operation completes. If
>> one of our arguments has no other references besides this one, then we
>> can be sure that it is a dead obj walking, and steal its corpse.
>>
>> But this has a fatal flaw: people are unreasonable creatures, and
>> sometimes they call Python libraries without going through ceval.c
>> :-(. It's legal for random C code to hold an array object with a
>> single reference count, and then call PyNumber_Add on it, and then
>> expect the original array object to still be valid. But who writes
>> code like that in practice? Well, Cython does. So, this is no-go.
>
>
> I understand that a lot of numpy/scipy code is compiled with Cython, so
> you really want the optimization to continue working when so compiled. Is
> there a simple change to Cython that would work, perhaps in coordination
> with a change to numpy? If so, you could get the result before 3.5 comes
> out.

Unfortunately we don't actually know whether Cython is the only culprit
(such code *could* be written by hand), and even if we fixed Cython it
would take some unknowable amount of time before all downstream users
upgraded their Cythons. (It's pretty common for projects to check in
Cython-generated .c files, and only regenerate when the Cython source
actually gets modified.) Pretty risky for an optimization.

> I realized that there are other compilers than Cython and non-numpy code
> that could benefit, so that a more generic solution would also be good. In
> particular
>
> > Here's the idea. Take an innocuous expression like:
> >
> >     result = (a + b + c) / c
> >
> > This gets evaluated as:
> >
> >     tmp1 = a + b
> >     tmp2 = tmp1 + c
> >     result = tmp2 / c
> ...
>
> > There's an obvious missed optimization in this code, though, which is
> > that it keeps allocating new temporaries and throwing away old ones.
> > It would be better to just allocate a temporary once and re-use it:
> >     tmp1 = a + b
> >     tmp1 += c
> >     tmp1 /= c
> >     result = tmp1
>
> Could this transformation be done in the ast? And would that help?

I don't think it could be done in the AST, because I don't think you can
work with anonymous temporaries there. But, now that you mention it, it
could be done on the fly in the implementation of the relevant opcodes.
I.e., BINARY_ADD could do

/* only the eval loop's own reference to left remains: treat it as a temporary */
if (Py_REFCNT(left) == 1)
    result = PyNumber_InPlaceAdd(left, right);
else
    result = PyNumber_Add(left, right);
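
To make the effect concrete, here is a rough Python-level sketch of that
dispatch. It is only an illustration under some assumptions: Vec is a
made-up toy class that counts allocations, binary_add is a hypothetical
stand-in for the opcode, and the left_is_temporary flag replaces the
Py_REFCNT(left) == 1 check, since pure Python code can't reliably observe
the eval loop's view of the refcount.

```python
import operator


class Vec:
    """Toy stand-in for an array type (hypothetical, not numpy)."""

    allocations = 0  # counts how many Vec buffers have been created

    def __init__(self, values):
        Vec.allocations += 1
        self.values = list(values)

    def __add__(self, other):
        # Out-of-place add: always allocates a fresh Vec.
        return Vec(x + y for x, y in zip(self.values, other.values))

    def __iadd__(self, other):
        # In-place add: reuses self's buffer, no new allocation.
        for i, y in enumerate(other.values):
            self.values[i] += y
        return self


def binary_add(left, right, left_is_temporary):
    """Hypothetical model of the opcode; the flag stands in for
    Py_REFCNT(left) == 1."""
    if left_is_temporary:
        return operator.iadd(left, right)
    return operator.add(left, right)


a, b, c = Vec([1, 2]), Vec([3, 4]), Vec([5, 6])

# Current behaviour: every + allocates a new temporary.
Vec.allocations = 0
plain = (a + b) + c
plain_allocs = Vec.allocations

# Sketched behaviour: the result of a + b is a temporary, so reuse it.
Vec.allocations = 0
tmp = binary_add(a, b, left_is_temporary=False)   # a is a live variable
elided = binary_add(tmp, c, left_is_temporary=True)
elided_allocs = Vec.allocations
```

Both versions compute the same values and leave a untouched, but the
second one allocates once instead of twice.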

Upside: all packages automagically benefit!

Potential downsides to consider:
- Subtle but real and user-visible change in Python semantics. I'd be a
little nervous about whether anyone has implemented, say, an __iadd__ with
side effects such that you can tell whether a copy was made, even if the
object being copied is immediately destroyed. Maybe this doesn't make sense
though.
- Only works when left operand is the temporary ("remember that a*b+c is
faster than c+a*b"), and only for arithmetic (no benefit for np.sin(a +
b)). Probably does cover the majority of cases though.
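
As a small sketch of the first downside: this isn't a third-party
__iadd__ with side effects, but the built-in list already treats + and +=
differently (+= accepts any iterable, + insists on another list), so the
switch would be visible there. The two helper functions are hypothetical
stand-ins for "what the opcode does today" and "what it would do if it
decided the left operand was a dead temporary".

```python
import operator


def add_today(left, right):
    # What the add opcode does now: always the out-of-place operation.
    return operator.add(left, right)


def add_with_elision(left, right):
    # Hypothetical: assume the interpreter saw that left is a dead
    # temporary and switched to the in-place operation.
    return operator.iadd(left, right)


# ([0] + [1]) + (2,) as evaluated today: list + tuple is rejected.
try:
    add_today([0] + [1], (2,))
    today = "ok"
except TypeError:
    today = "TypeError"

# The same expression with the in-place path: list += tuple is accepted.
proposed = add_with_elision([0] + [1], (2,))
```

So an expression that raises TypeError today would quietly start
succeeding, which is exactly the kind of user-visible difference I mean.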

> A prolonged discussion might be better on python-ideas. See what others
> say.

Yeah, I wasn't sure which list to use for this one, happy to move if it
would work better.

-n