[Python-Dev] Speeding up CPython 5-10%
Yury Selivanov
yselivanov.ml at gmail.com
Wed Jan 27 15:37:49 EST 2016
On 2016-01-27 3:10 PM, Damien George wrote:
> Hi Yury,
>
> I think these are great ideas to speed up CPython. They are probably
> the simplest yet most effective ways to get performance improvements
> in the VM.
Thanks!
>
> MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
> by PyPy; the main reason to have them is that a simple method call
> then needs no allocation on the heap). The specific opcodes are:
>
> LOAD_METHOD # same behaviour as you propose
> CALL_METHOD # for calls with positional and/or keyword args
> CALL_METHOD_VAR_KW # for calls with one or both of */**
>
> We also have LOAD_ATTR, CALL_FUNCTION and CALL_FUNCTION_VAR_KW for
> non-method calls.
Yes, we'll need to add CALL_METHOD{_VAR|_KW|etc} opcodes to optimize all
kinds of method calls. However, I'm not sure how big the impact will be;
I need to do more benchmarking.
BTW, how do you benchmark MicroPython?
>
> MicroPython also has dictionary lookup caching, but it's a bit
> different to your proposal. We do something much simpler: each opcode
> that supports caching (e.g. LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR,
> etc) includes a single extra byte in the opcode, which is a guess at
> the offset of the desired element in the dictionary. E.g. for
> LOAD_GLOBAL we have (pseudo code):
>
> CASE(LOAD_GLOBAL):
>     key = DECODE_KEY;
>     offset_guess = DECODE_BYTE;
>     if (global_dict[offset_guess].key == key) {
>         // found the element straight away
>     } else {
>         // not found, do a full lookup and save the offset
>         offset_guess = dict_lookup(global_dict, key);
>         UPDATE_BYTECODE(offset_guess);
>     }
>     PUSH(global_dict[offset_guess].elem);
>
> We have found that such caching gives a massive performance increase,
> on the order of 20%. The issue (for us) is that it increases bytecode
> size by a considerable amount, requires writeable bytecode, and can be
> non-deterministic in terms of lookup time. Those things are important
> in the embedded world, but not so much on the desktop.
That's a neat idea! You're right, it does require bytecode to become
writeable. I considered implementing a similar strategy, but this would
be a big change for CPython. So I decided to minimize the impact of the
patch and leave the opcodes untouched.
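
For readers who want to experiment with the offset-guess scheme, the
pseudo code above can be modelled in a few lines of Python. This is only
a toy sketch under stated assumptions: Table and LoadGlobal are
hypothetical classes, the "dictionary" is a flat list of slots with a
linear full lookup, and the writable bytecode byte is modelled as an
attribute on the instruction object.

```python
class Table:
    """Toy stand-in for a dictionary whose entries sit in stable slots."""

    def __init__(self):
        self.entries = []       # list of [key, value] slots
        self.full_lookups = 0   # counts how often the slow path runs

    def store(self, key, value):
        for entry in self.entries:
            if entry[0] == key:
                entry[1] = value
                return
        self.entries.append([key, value])

    def lookup(self, key):
        """Full lookup: return the slot index of `key` (linear scan here)."""
        self.full_lookups += 1
        for index, entry in enumerate(self.entries):
            if entry[0] == key:
                return index
        raise KeyError(key)


class LoadGlobal:
    """One LOAD_GLOBAL instruction: a key plus a one-byte offset guess."""

    def __init__(self, key):
        self.key = key
        self.offset_guess = 0   # models the extra byte in the bytecode

    def execute(self, table):
        slot = self.offset_guess
        entries = table.entries
        if not (slot < len(entries) and entries[slot][0] == self.key):
            # Guess was wrong: do a full lookup and rewrite the "bytecode".
            slot = table.lookup(self.key)
            self.offset_guess = slot & 0xFF   # the guess must fit in a byte
        return entries[slot][1]


globals_table = Table()
for name, value in [("a", 1), ("b", 2), ("c", 3)]:
    globals_table.store(name, value)

instr = LoadGlobal("c")
first = instr.execute(globals_table)    # miss: full lookup, guess updated
second = instr.execute(globals_table)   # hit: no full lookup needed
```

The second execution finds the element through the cached guess alone,
which is where the speedup comes from; the cost is that the instruction
mutates itself, i.e. the bytecode has to be writable.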
Thanks!
Yury