opcode dispatch optimization
Hello,

I would like to mention that I've written a patch which enables "threaded interpretation" in the ceval loop with gcc (*). On my computer (an Athlon X2 3600+), it is good for a 15-20% speedup of the interpreter on pystone and pybench. I also had the opportunity to test it on a Core2-derived CPU, where it doesn't make a difference (I conjecture it's because Core2 CPUs have hardware-based indirect branch optimizations). It will make no difference if the interpreter is compiled with something other than gcc (I tested on Windows).

The additional complexity is very small. There's a separate script which is run to build the dispatch table (only if needed, that is, if dis.py has been modified). In ceval.c, there are a couple of macros and some #ifdef's. That's all. It breaks no test in the regression suite.

Could other people test and report their results here? (The patch is for py3k, btw.) Also, what are your thoughts for/against integrating this patch in the standard interpreter?

Regards

Antoine.

(*) Please note: it has nothing to do with multithreading.
Antoine Pitrou <solipsis <at> pitrou.net> writes:
I would like to mention that I've written a patch which enables "threaded interpretation"
... and I forgot to give the URL: http://bugs.python.org/issue4753 Regards Antoine.
Antoine Pitrou wrote:
I would like to mention that I've written a patch which enables "threaded interpretation" in the ceval loop with gcc (*). On my computer (an Athlon X2 3600+), it is good for a 15-20% speedup of the interpreter on pystone and pybench. I also had the opportunity to test it on a Core2-derived CPU, where it doesn't make a difference (I conjecture it's because Core2 CPUs have hardware-based indirect branch optimizations). It will make no difference if the interpreter is compiled with something other than gcc (I tested on Windows).
The patch makes use of a GCC feature where labels can be used as values: http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html . I didn't know about the feature and got confused by the unary && operator.

A happy new year to you all!

Christian
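For readers who, like Christian, have not met the unary && operator before, here is a minimal sketch of the "labels as values" extension on its own. It is not taken from the patch; the function name classify and the label names are made up for illustration, and it only compiles with a compiler that supports the GNU C extension (GCC, and also Clang).

```c
/* Minimal illustration of GNU C "labels as values".
 * classify() and its labels are hypothetical names for this sketch;
 * nothing here comes from the ceval.c patch. */
int classify(int n)
{
    /* Unary && takes the address of a label; the result is a void *. */
    static void *targets[] = { &&is_zero, &&is_one, &&is_other };

    int index = (n == 0) ? 0 : (n == 1) ? 1 : 2;

    /* "Computed goto": jump to the address stored in the table. */
    goto *targets[index];

is_zero:
    return 100;
is_one:
    return 200;
is_other:
    return 300;
}
```

Calling classify(0), classify(1) and classify(7) returns 100, 200 and 300 respectively. The table of label addresses plays the role that the dispatch table plays in an interpreter loop.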
On Wed, Dec 31, 2008 at 11:44 AM, Christian Heimes <lists@cheimes.de> wrote:
The patch makes use of a GCC feature where labels can be used as values: http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html . I didn't know about the feature and got confused by the unary && operator.
Right. SpiderMonkey (Mozilla's JavaScript interpreter) does this, and it was good for a similar win on platforms that use GCC. (It took me a while to figure out why it was so much faster, so I think this patch would be better with a few very specific comments!) SpiderMonkey calls this optimization "threaded code" too, but this isn't the standard meaning of that term. See: http://en.wikipedia.org/wiki/Threaded_code -j
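As a rough idea of what "a couple of macros and some #ifdef's" with explanatory comments could look like, here is a toy stack interpreter. It is only a sketch under stated assumptions: the opcodes, the macro names TARGET, DISPATCH and USE_COMPUTED_GOTOS, and the function run are all invented for this example and are not copied from the patch. With GCC it dispatches through a table of label addresses; with any other compiler it falls back to an ordinary switch.

```c
/* Toy bytecode interpreter; every name here is hypothetical. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#if defined(__GNUC__)
#define USE_COMPUTED_GOTOS 1
#else
#define USE_COMPUTED_GOTOS 0
#endif

#if USE_COMPUTED_GOTOS
/* Each handler is a label and ends with its own indirect jump.
 * A commonly cited reason for the speedup is that the CPU can then
 * predict the successor of each opcode separately, instead of
 * predicting one shared jump at the top of a switch. */
#define TARGET(op)  target_##op
#define DISPATCH()  goto *dispatch_table[*pc++]
#else
/* Portable fallback: a plain switch inside an endless loop. */
#define TARGET(op)  case op
#define DISPATCH()  continue
#endif

/* Runs a program and returns the top of the stack at OP_HALT.
 * No bounds or opcode validation in the computed-goto path:
 * this is a sketch, not production code. */
int run(const unsigned char *code)
{
    int stack[16];
    int sp = 0;
    const unsigned char *pc = code;

#if USE_COMPUTED_GOTOS
    /* Table order must match the opcode numbering above. */
    static void *dispatch_table[] = {
        &&target_OP_PUSH, &&target_OP_ADD, &&target_OP_MUL, &&target_OP_HALT
    };
    DISPATCH();
#else
    for (;;) {
        switch (*pc++) {
        default:
            return -1;
#endif

    TARGET(OP_PUSH):
        stack[sp++] = *pc++;        /* operand follows the opcode */
        DISPATCH();

    TARGET(OP_ADD):
        sp--;
        stack[sp - 1] += stack[sp];
        DISPATCH();

    TARGET(OP_MUL):
        sp--;
        stack[sp - 1] *= stack[sp];
        DISPATCH();

    TARGET(OP_HALT):
        return sp > 0 ? stack[sp - 1] : 0;

#if !USE_COMPUTED_GOTOS
        }
    }
#endif
}
```

Running the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } computes (2 + 3) * 4 and returns 20 with either dispatch strategy. Keeping the handlers identical and hiding the difference behind two macros is what keeps the added complexity small.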
On Wed, 2008-12-31 at 12:51 -0600, Jason Orendorff wrote:
SpiderMonkey calls this optimization "threaded code" too, but this isn't the standard meaning of that term. See: http://en.wikipedia.org/wiki/Threaded_code
FWIW, it's also explained pretty well in the first pages of [1]. WebKit's SquirrelFish is direct-threaded as well [2]. Nicolas [1] http://citeseer.ist.psu.edu/cache/papers/cs/32018/http:zSzzSzwww.jilp.orgzSz... [2] http://webkit.org/blog/189/announcing-squirrelfish/
participants (4)
- Antoine Pitrou
- Christian Heimes
- Jason Orendorff
- Nicolas Trangez