[pypy-dev] Poor performance with custom bytecode

Antonio Cuni anto.cuni at gmail.com
Fri Feb 17 14:18:14 CET 2012

Hello Timothy,

On 02/17/2012 02:03 PM, Timothy Baldridge wrote:

> clojure.examples.factorial=>  (dis.dis *)
>    0           0 LOAD_FAST                0 (__argsv__)
>                3 LOAD_ATTR                0 (__len__)
>                6 CALL_FUNCTION            0

I didn't look in depth at the bytecode produced by your compiler, but this
LOAD_ATTR/CALL_FUNCTION sequence is far from optimal.
PyPy has custom opcodes for calling methods, LOOKUP_METHOD and CALL_METHOD,
which are much faster than LOAD_ATTR/CALL_FUNCTION. See e.g. how this piece
of code gets compiled:

>>>> def foo(x):
....     return foo.__len__()
>>>> import dis
>>>> dis.dis(foo)
   2           0 LOAD_GLOBAL              0 (foo)
               3 LOOKUP_METHOD            1 (__len__)
               6 CALL_METHOD              0
               9 RETURN_VALUE
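
To check quickly whether a given function object uses the method-call
opcodes, something like the following might help. It is only a sketch: it
assumes a Python 3 interpreter (dis.get_instructions is not available on
2.7), the helper names are made up for illustration, and the opcode names
differ between interpreters and versions, so the set below may need
adjusting.

```python
import dis

# Opcode names that indicate a dedicated method-call path. LOOKUP_METHOD and
# CALL_METHOD are the names shown in the PyPy disassembly above; LOAD_METHOD
# is the name some CPython 3 releases use for a similar opcode. Assumption:
# other versions may use different names or fold this into other opcodes.
METHOD_OPCODES = {"LOOKUP_METHOD", "CALL_METHOD", "LOAD_METHOD"}


def opcode_names(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


def uses_method_opcodes(names):
    """True if any opcode name in the list is a known method-call opcode."""
    return any(name in METHOD_OPCODES for name in names)


def sample(x):
    # Hypothetical function, used only to have something to disassemble.
    return x.__len__()


if __name__ == "__main__":
    names = opcode_names(sample)
    print(names)
    print("method-call opcodes:", uses_method_opcodes(names))
```

Running opcode_names() on a function produced by your compiler and on the
equivalent one compiled by the interpreter itself should make the difference
visible at a glance.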

In general, I suggest using the jitviewer to inspect the code the JIT
generates: it shows how many low-level operations are emitted for each
opcode, so you can compare against the same algorithm written in Python and
see what causes the most slowdown.
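
For the comparison, a plain Python baseline could look roughly like this.
It is a hypothetical iterative factorial, not the code from your example,
and the argument and repeat count are arbitrary; the repetition is only
there so the JIT has a chance to warm up before the timing means anything.

```python
import timeit


def factorial(n):
    """Iterative factorial; hypothetical baseline, not code from this thread."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


if __name__ == "__main__":
    # Call the function many times so that the loop becomes hot.
    elapsed = timeit.timeit(lambda: factorial(20), number=100000)
    print("100000 calls of factorial(20): %.3f s" % elapsed)
```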

