[Python-Dev] Speeding up CPython 5-10%

Yury Selivanov yselivanov.ml at gmail.com
Wed Jan 27 15:37:49 EST 2016



On 2016-01-27 3:10 PM, Damien George wrote:
> Hi Yury,
>
> I think these are great ideas to speed up CPython.  They are probably
> the simplest yet most effective ways to get performance improvements
> in the VM.

Thanks!

>
> MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
> by PyPy, and the main reason to have it is because you don't need to
> allocate on the heap when doing a simple method call).  The specific
> opcodes are:
>
> LOAD_METHOD # same behaviour as you propose
> CALL_METHOD # for calls with positional and/or keyword args
> CALL_METHOD_VAR_KW # for calls with one or both of */**
>
> We also have LOAD_ATTR, CALL_FUNCTION and CALL_FUNCTION_VAR_KW for
> non-method calls.

Yes, we'll need to add CALL_METHOD{_VAR|_KW|etc} opcodes to optimize all 
kinds of method calls.  However, I'm not sure how big the impact will be; 
I need to do more benchmarking.
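
To make the heap-allocation point concrete, here is a minimal Python-level
sketch of what a LOAD_METHOD/CALL_METHOD pair avoids.  The helpers
load_method and call_method are hypothetical names for illustration only;
they model the idea and are not the actual opcode implementation in
CPython, PyPy or MicroPython.

```python
import types


class Greeter:
    def hello(self, name):
        return "hello " + name


g = Greeter()

# Plain attribute access builds a fresh bound-method object every time.
bound_a = g.hello
bound_b = g.hello
distinct_objects = bound_a is not bound_b  # two separate allocations


def load_method(obj, name):
    """Hypothetical model of LOAD_METHOD: return (function, self).

    If `name` resolves to a plain function on the type and is not shadowed
    in the instance dict, hand back the raw function and the receiver so
    that no bound-method object is created.  Otherwise fall back to normal
    attribute lookup and return (attribute, None).
    """
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            shadowed = name in getattr(obj, "__dict__", {})
            if isinstance(attr, types.FunctionType) and not shadowed:
                return attr, obj
            break
    return getattr(obj, name), None


def call_method(func, receiver, *args):
    """Hypothetical model of CALL_METHOD: pass the receiver explicitly."""
    if receiver is None:
        return func(*args)
    return func(receiver, *args)


func, receiver = load_method(g, "hello")
result = call_method(func, receiver, "world")
```

Here func is the plain function stored on the class, so the call goes
through without an intermediate bound method; the fallback path covers
attributes that are not ordinary methods.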

BTW, how do you benchmark MicroPython?

>
> MicroPython also has dictionary lookup caching, but it's a bit
> different from your proposal.  We do something much simpler: each opcode
> that supports caching (e.g. LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR,
> etc.) includes a single byte in the opcode which is an offset guess
> into the dictionary to find the desired element.  E.g. for LOAD_GLOBAL
> we have (pseudocode):
>
> CASE(LOAD_GLOBAL):
> key = DECODE_KEY;
> offset_guess = DECODE_BYTE;
> if (global_dict[offset_guess].key == key) {
>      // found the element straight away
> } else {
>      // not found, do a full lookup and save the offset
>      offset_guess = dict_lookup(global_dict, key);
>      UPDATE_BYTECODE(offset_guess);
> }
> PUSH(global_dict[offset_guess].elem);
>
> We have found that such caching gives a massive performance increase,
> on the order of 20%.  The issue (for us) is that it increases bytecode
> size by a considerable amount, requires writeable bytecode, and can be
> non-deterministic in terms of lookup time.  Those things are important
> in the embedded world, but not so much on the desktop.

That's a neat idea!  You're right, it does require bytecode to become 
writeable.  I considered implementing a similar strategy, but this would 
be a big change for CPython.  So I decided to minimize the impact of the 
patch and leave the opcodes untouched.
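
For readers who want to experiment with the offset-guess scheme quoted
above, here is a small Python simulation.  It is only a sketch: SlotTable
and LoadGlobal are made-up names, the "dictionary" is modelled as a flat
list of (key, value) slots, and skipping the cache update when the index
does not fit in one byte is my assumption, not something the pseudocode
specifies.

```python
class SlotTable:
    """Toy stand-in for a dict's entry array: a list of (key, value) slots."""

    def __init__(self, items):
        self.slots = list(items)
        self.full_lookups = 0  # counts how often the slow path runs

    def lookup(self, key):
        """Slow path: search all slots and return the index of `key`."""
        self.full_lookups += 1
        for index, (slot_key, _value) in enumerate(self.slots):
            if slot_key == key:
                return index
        raise KeyError(key)


class LoadGlobal:
    """One LOAD_GLOBAL instruction: a key plus a one-byte offset guess."""

    def __init__(self, key):
        self.key = key
        self.offset_guess = 0  # lives in the (writeable) bytecode

    def execute(self, table):
        guess = self.offset_guess
        hit = guess < len(table.slots) and table.slots[guess][0] == self.key
        if not hit:
            # Guess was wrong: do a full lookup and remember the offset.
            guess = table.lookup(self.key)
            if guess < 256:  # assumption: only cache what fits in a byte
                self.offset_guess = guess
        return table.slots[guess][1]


table = SlotTable([("a", 1), ("b", 2), ("c", 3)])
op = LoadGlobal("c")
first = op.execute(table)   # miss: full lookup, then the guess is updated
second = op.execute(table)  # hit: the guessed slot already holds the key
```

After the first execution the instruction carries the right offset, so
table.full_lookups stays at 1 however many times op runs afterwards.  It
also shows why the bytecode has to be writeable: the guess is stored in
the instruction itself.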


Thanks!
Yury

