[issue42115] Caching infrastructure for the evaluation loop: specialised opcodes

Thu Oct 22 00:38:39 EDT 2020

Yury Selivanov <yselivanov at gmail.com> added the comment:

Few thoughts in no particular order:

- I'd suggest implementing the cache for 2-3 more opcodes on top of the existing infrastructure to get more experience and then refactoring it to make it more generic.

- Generalizing LOAD_METHOD to work for methods with **kwargs, caching concrete operator implementations for opcodes like BINARY_ADD etc. are all possible on top of the current infra.

- Rewriting code objects in place is wrong, IMO: you always need to have a way to deoptimize the entire thing, so you need to keep the original one. It might be that you have well defined and static types for the first 10000 invocations and something entirely different on 10001. So IMO we need a SpecializedCode object with the necessary bailout guards. But that's not a simple thing to implement, so unless someone will be working on this fulltime for a long time I'd suggest working off what we have now. (That said I'd take a close look at what Dino is building).

- There are multiple different approaches we can choose for optimizing CPython, ranging from hidden classes to a full blown JIT. I hope someone will do them one day. But IMO the current simple "opcode cache" (I wish we had a better name) mechanism we have would allow us to squeeze up to 15-25% median improvement in our benchmarks with relatively limited dev time. Maybe that's good enough for 3.10.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42115>
_______________________________________