[Python-Dev] Python 3 optimizations...

stefan brunthaler stefan at brunthaler.net
Fri Jul 23 10:58:41 CEST 2010


> This sounds like wpython (a CPython derivative with a wider set of byte code
> commands) could benefit from it.
>
I am aware of Cesare di Mauro's wpython project. I, too, change the
instruction format from bytecode to wordcode, because it allows for
more efficient instruction decoding. Contrary to his approach,
however, I do not change the instruction encoding to pack in
additional optimizations. (I hope I have put that correctly; I saw
his slides about a year ago.)
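To give an idea of why decoding gets cheaper, here is a small
illustrative comparison (the opcode values and layout are made up for
the example, not taken from my code or from wpython): with classic
bytecode the decoder has to branch on every opcode just to find out
whether an argument follows, whereas with a fixed-size wordcode every
instruction is a single 16-bit load plus two shifts/masks.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Classic bytecode: 1-byte opcode, and opcodes >= HAVE_ARGUMENT
           carry a 2-byte argument, so the decoder must branch per
           instruction just to find the next one. */
        enum { HAVE_ARGUMENT = 90 };
        uint8_t bytecode[] = { 23, 100, 1, 0 };   /* no-arg op, then op with arg 1 */
        for (size_t i = 0; i < sizeof bytecode; ) {
            unsigned op = bytecode[i++];
            int arg = -1;
            if (op >= HAVE_ARGUMENT) {            /* extra branch per instruction */
                arg = bytecode[i] | (bytecode[i + 1] << 8);
                i += 2;
            }
            printf("bytecode: op=%u arg=%d\n", op, arg);
        }

        /* Wordcode: every instruction is one fixed-size 16-bit word, so
           decoding is branch-free: one load, one shift, one mask. */
        uint16_t wordcode[] = { (23 << 8) | 0, (100 << 8) | 1 };
        for (size_t i = 0; i < sizeof wordcode / sizeof *wordcode; i++) {
            unsigned op  = wordcode[i] >> 8;
            unsigned arg = wordcode[i] & 0xff;
            printf("wordcode: op=%u arg=%u\n", op, arg);
        }
        return 0;
    }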


> Do I understand correctly that you modify the byte code of modules/functions
> at runtime?
>
Yes. Quickening is a runtime-only optimization technique that rewrites
generic instructions into optimized derivatives (it was originally
developed for the Java virtual machine). It is completely hidden from
the compiler and has no dependencies other than the interpreter
dispatch routine itself.
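To make that concrete, here is a minimal, self-contained sketch of the
idea in a toy interpreter (the opcode names, the switch-based dispatch
and the type check are purely illustrative, not the actual code): the
generic ADD checks its operand types once, rewrites itself in the
instruction stream, and every later execution dispatches straight to
the specialized derivative without re-checking.

    #include <stdio.h>

    enum opcode { OP_ADD, OP_ADD_LONG, OP_ADD_DOUBLE, OP_HALT };

    int main(void)
    {
        enum opcode code[] = { OP_ADD, OP_HALT };
        long a = 3, b = 4;            /* pretend these were popped off the stack */
        int operands_are_long = 1;    /* pretend this is the runtime type check */

        for (int run = 0; run < 2; run++) {
            printf("-- execution %d --\n", run + 1);
            for (int pc = 0; ; pc++) {
                switch (code[pc]) {
                case OP_ADD:          /* generic instruction: check types once */
                    printf("generic ADD, quickening in place\n");
                    code[pc] = operands_are_long ? OP_ADD_LONG : OP_ADD_DOUBLE;
                    pc--;             /* immediately re-dispatch the new opcode */
                    break;
                case OP_ADD_LONG:     /* optimized derivative, no type checks */
                    printf("ADD_LONG -> %ld\n", a + b);
                    break;
                case OP_ADD_DOUBLE:
                    printf("ADD_DOUBLE -> %f\n", (double)a + (double)b);
                    break;
                case OP_HALT:
                    goto next_run;
                }
            }
        next_run:;
        }
        return 0;
    }

On the second pass over the same code, only the derivative executes;
that is the entire trick.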



> Ah, yes, that makes good sense. So you basically add an intermediate step to
> calls that provides faster dispatch for known C functions.
>
Exactly. I also contemplated providing optimized derivatives for all
builtin functions, but never implemented that (lack of time). Based on
a quantitative analysis of usage frequency, one could very well decide
to, e.g., provide an optimized CALL_FUNCTION derivative for the "len"
function.
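Just to sketch the idea (the opcode name CALL_LEN and everything around
it is invented for illustration; this is not the actual implementation),
such a derivative would bypass the generic call machinery and invoke
the known C function directly:

    #include <stdio.h>
    #include <string.h>

    typedef long (*c_fn)(const char *);

    static long builtin_len(const char *s) { return (long)strlen(s); }

    enum opcode { CALL_FUNCTION, CALL_LEN };

    /* execute one call instruction; quickens itself on the first run */
    static long execute(enum opcode *op, c_fn callee, const char *arg)
    {
        switch (*op) {
        case CALL_FUNCTION:
            /* generic path: inspect the callee; if it is the well-known
               builtin, rewrite the instruction for later executions */
            if (callee == builtin_len)
                *op = CALL_LEN;
            return callee(arg);            /* plus generic call overhead */
        case CALL_LEN:
            /* derivative: skip the generic call machinery entirely */
            return builtin_len(arg);
        }
        return -1;
    }

    int main(void)
    {
        enum opcode op = CALL_FUNCTION;
        printf("%ld\n", execute(&op, builtin_len, "hello"));   /* quickens */
        printf("%ld\n", execute(&op, builtin_len, "hello!"));  /* fast path */
        return 0;
    }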
Another benefit of my technique is that a compiler could decide to
inline the function calls inside the optimized derivatives (e.g., the
float_add function call inside my FLOAT_ADD interpreter instruction).
Unfortunately, however, gcc currently does not allow cross-module
inlining (AFAIR). (Preliminary tests in which I manually raised the
default inlining size for ceval.c resulted in speedups of up to 1.3x on
my machine, so I think inlining the function bodies of the optimized
derivatives would boost performance noticeably.)
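(By "raising the default inlining size" I mean passing one of gcc's
inlining parameters when compiling ceval.c, along the lines of

    gcc -O3 -finline-limit=10000 -c ceval.c

though the concrete parameter and value here are only an example of the
kind of knob involved, not the exact setting I measured with.)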


> I'm interested in the code that determines what can be optimised in what
> way. I read that Jython recently received a contribution that provides type
> information for lots of modules and builtins, but having something like that
> for CPython would be cool.
>
Ok. For this year's PPPJ I wanted to submit a paper realizing my
optimization in Jython. Using bytecode-rewriting tools, the
interpreter could decide at runtime which optimized derivatives to
generate and could add the rewriting code that supports the changing
instruction set. Either way (static pre-compilation or dynamic
bytecode rewriting, that is), I think that Jython and IronPython would
greatly benefit from applying this optimization technique, because
their JIT compilers would very likely inline the function calls.


> Such an approach would also be very useful for Cython. Think of a profiler
> that runs a program in CPython and tells you exactly what static type
> annotations to put where in your Python code to make it compile to a fast
> binary with Cython. Or, even better, it could just spit out a .pxd file that
> you drop next to your .py file and that provides the static type information
> for you.
>
Hm, I think you could quite easily save the optimized bytecode
sequence for function calls, which would allow you to do exactly that
(e.g., you could save something similar to:
   LOAD_FAST
   LOAD_CONST
   LONG_ADD
or
   LOAD_GLOBAL
   CALL_C_ZERO
)


Cheers,
--stefan

