[Python-Dev] opcode dispatch optimization
Antoine Pitrou
solipsis at pitrou.net
Wed Dec 31 14:47:30 CET 2008
Hello,
I would like to mention that I've written a patch which enables "threaded
interpretation" on the ceval loop with gcc (*). On my computer (an Athlon X2
3600+), it is good for a 15-20% speedup of the interpreter on pystone and
pybench. I also had the opportunity to test it on a Core2-derived CPU, where it
doesn't make a difference (I conjecture it's because Core2 CPUs have
hardware-based indirect branch optimizations). It will make no difference if the
interpreter is compiled with something other than gcc (I tested on Windows).
The additional complexity is very small. There's a separate script which is run
to build the dispatch table (only if needed, that is, if dis.py has been
modified). In ceval.c, there are a couple of macros and some #ifdef's. That's
all. It breaks no test in the regression suite.
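
For readers unfamiliar with the technique, here is a minimal sketch of threaded
dispatch guarded by an #ifdef. It is not the patch itself: the opcode names, the
TARGET/DISPATCH macros, the run() function and the demo program are all
hypothetical, and the table is written by hand rather than generated by a
script. It assumes a compiler with gcc's "labels as values" extension and falls
back to a plain switch otherwise.

```c
#include <assert.h>

/* Hypothetical toy opcodes for illustration; not CPython's opcode set. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 16

#if defined(__GNUC__)
/* gcc "labels as values": each handler ends with its own indirect jump,
   so the CPU sees one indirect branch per opcode handler. */
#define TARGET(op) L_##op: case op
#define DISPATCH() goto *dispatch_table[*pc++]
#else
/* Portable fallback: every handler returns to the single shared switch. */
#define TARGET(op) case op
#define DISPATCH() continue
#endif

/* Bytecode is assumed to be valid; this toy loop does not check it. */
static int run(const unsigned char *pc)
{
    int stack[STACK_MAX];
    int sp = 0;
#if defined(__GNUC__)
    /* Hand-written here; entries must be in the same order as the enum. */
    static void *const dispatch_table[] = {
        &&L_OP_PUSH, &&L_OP_ADD, &&L_OP_MUL, &&L_OP_HALT
    };
#endif
    for (;;) {
        switch (*pc++) {
        TARGET(OP_PUSH):
            assert(sp < STACK_MAX);
            stack[sp++] = *pc++;        /* operand is the next byte */
            DISPATCH();
        TARGET(OP_ADD):
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            DISPATCH();
        TARGET(OP_MUL):
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            DISPATCH();
        TARGET(OP_HALT):
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;                  /* unknown opcode (switch path only) */
        }
    }
}

/* (2 + 3) * 4 */
static const unsigned char demo_program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};
```

Calling run(demo_program) evaluates (2 + 3) * 4. The handler bodies are
identical in both builds; only the two macros change, which is why the extra
complexity in the interpreter loop stays small.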
Could other people test and report their results here? (The patch is for py3k,
btw.) Also, what are your thoughts for/against integrating this patch into the
standard interpreter?
Regards
Antoine.
(*) please note: it has nothing to do with multithreading.