
On Thu, Mar 12, 2020 at 2:32 PM Marco Sulla via Python-ideas <python-ideas@python.org> wrote:
On Thu, 12 Mar 2020 at 21:22, Chris Angelico <rosuav@gmail.com> wrote:
They actually ARE already discarded
0____O You're right. So *how* could juliantaylor have measured a 2x speedup for large ndarrays?
I think that currently you have roughly the following situation with abcd = a+b+c+d:

    temp_ab = a + b         # a and b have refcounts of 2 or more
    temp_abc = temp_ab + c  # temp_ab has a refcount of 1, c of 2 or more
    del temp_ab
    abcd = temp_abc + d     # temp_abc has a refcount of 1, d of 2 or more
    del temp_abc

The temp_ab+c and temp_abc+d computations can reuse the temporary arrays, but only if you walk the stack to verify that there are no hidden references, which is slow, non-portable, and "terrifying", to quote one comment on numpy bug #7997. If you can't reuse the arrays, then you have to allocate a third hunk of RAM for the result, which is slower.

If there were better information about "temporariness", then the temporaries could be reused without the expensive and complicated stack walk, which would allow this optimization to benefit smaller arrays. If there were del expressions, then (del a)+b+c+d would allow reuse in the a+b expression as well, but only if you walked the stack or had temporariness information. So I think that del expressions are orthogonal to the stack-tracing hack in terms of optimization potential.

C++11 introduced temporariness information through the type system with rvalue references. I don't know how you'd do that in Python. Of course, you can get the same effect in Python right now with:

    a += b; a += c; a += d; abcd = a; del a
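The refcount difference above can be observed from pure Python. A CPython-specific sketch (the class V and its bookkeeping are mine, not numpy's code): at the C level numpy sees a true temporary as Py_REFCNT == 1, while sys.getrefcount measured inside __add__ also counts call-frame references, so only the *difference* between a named and an unnamed operand is meaningful here.

```python
import sys

class V:
    """Toy operand that records its own refcount when used on the left of +."""
    last_rc = None

    def __add__(self, other):
        # How many references does the left operand have right now?
        V.last_rc = sys.getrefcount(self)
        return V()

x = V()
x + V()                 # left operand is bound to the name `x`
rc_named = V.last_rc

V() + V()               # left operand is a pure temporary
rc_temp = V.last_rc

# The only difference between the two measurements is the global name
# binding, so on CPython the temporary has exactly one fewer reference.
assert rc_named == rc_temp + 1
```

This is the property the stack-walking hack (and any hypothetical "temporariness" channel) would exploit: an operand that nothing else can see is safe to cannibalize for the result.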
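The in-place rewrite in the last line can be sketched with a toy Vec class standing in for ndarray (so it runs without NumPy installed; Vec and its methods are illustrative, not numpy's implementation):

```python
class Vec:
    """Minimal vector with NumPy-like + (allocating) and += (in-place)."""
    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # Out-of-place: allocates a fresh buffer, like a + b on ndarrays.
        return Vec(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # In-place: reuses self's existing buffer, like a += b on ndarrays.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self

a, b, c, d = Vec([1, 2]), Vec([3, 4]), Vec([5, 6]), Vec([7, 8])
naive = a + b + c + d   # builds two throwaway temporaries along the way

a += b                  # every step reuses a's buffer instead
a += c
a += d
abcd = a
del a                   # drop the old name; abcd now owns the buffer

assert abcd.data == naive.data == [16, 20]
```

The trade-off is that the in-place form clobbers a, which is exactly why it has to be spelled out explicitly today rather than applied automatically to a+b+c+d.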