Re: [Python-Dev] Speeding up CPython 5-10%

29 Jan 2016

On 2016-01-29 5:00 AM, Stefan Behnel wrote:
Yury Selivanov wrote on 27.01.2016 at 19:25:
[..]
LOAD_METHOD looks at the object on top of the stack, and checks if the name
resolves to a method or to a regular attribute.  If it's a method, then we
push the unbound method object and the object to the stack.  If it's an
attribute, we push the resolved attribute and NULL.
When CALL_METHOD looks at the stack it knows how to call the unbound method
properly (pushing the object as a first arg), or how to call a regular
callable.
This idea makes CPython around 2-4% faster, and it surely doesn't make
it slower.  I think it's a safe bet to at least implement this optimization
in CPython 3.6.
So far, the patch only optimizes positional-only method calls. It's
possible to optimize all kinds of calls, but this will necessitate 3 more
opcodes (explained in the issue).  We'll need to do some careful
benchmarking to see if it's really needed.

I implemented a similar but simpler optimisation in Cython a while back:
http://blog.behnel.de/posts/faster-python-calls-in-cython-021.html
Instead of avoiding the creation of method objects, as you proposed, it
just normally calls getattr and if that returns a bound method object, it
uses inlined calling code that avoids re-packing the argument tuple.
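
The control flow of that approach might look roughly like the following Python sketch. The helper name `call_attr` is hypothetical, and the real saving happens in generated C code, where the argument tuple is not re-packed; pure Python cannot reproduce that gain, so this only shows the branching.

```python
import types


def call_attr(obj, name, *args):
    """Look up `name` normally, then special-case bound methods."""
    attr = getattr(obj, name)  # ordinary attribute lookup
    if isinstance(attr, types.MethodType):
        # Bound method: call the underlying function directly,
        # passing the bound object as the first argument.
        return attr.__func__(attr.__self__, *args)
    # Anything else (builtin methods, plain callables) is called as-is.
    return attr(*args)
```

Unlike LOAD_METHOD/CALL_METHOD, the bound method object is still created here by `getattr`; only the call itself takes the shortcut.
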
Interestingly, I got speedups of 5-15% for some of the Python benchmarks,
but I don't quite remember which ones (at least raytrace and richards, I
think), nor do I recall the overall gain, which (I assume) is what you are
referring to with your 2-4% above. Might have been in the same order.

That's great!

I'm still working on the patch, but so far it looks like adding
just LOAD_METHOD/CALL_METHOD (that avoid instantiating BoundMethods)
gives us 10-15% faster method calls.

Combining them with my opcode cache makes them 30-35% faster.

Yury