[issue4753] Faster opcode dispatch on gcc

Sun Jan 11 18:37:23 CET 2009

Jeffrey Yasskin <jyasskin at gmail.com> added the comment:

Here's a port of threadedceval5.patch to trunk. It passes the tests. I
haven't benchmarked this exact patch, but on one Intel Core2, a similar
patch got an 11%-14% speedup (on 2to3 and pybench).
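
For readers unfamiliar with the technique, here is a minimal sketch of
what switch-based versus threaded dispatch might look like. It is not
code from either patch: the toy opcodes, the fixed 16-slot stack, and
the names run_switch and run_threaded are hypothetical, made up for this
illustration. The threaded variant assumes gcc's "labels as values"
extension and falls back to the switch loop on other compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes for illustration; these are not CPython's. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT, OP_COUNT };

/* Baseline: a switch inside a loop. Every opcode returns to the same
 * dispatch point, so all opcodes share a single indirect branch. */
static long run_switch(const int *code)
{
    long stack[16];
    size_t sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = *code++;      /* operand follows the opcode */
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;                  /* unknown opcode */
        }
    }
}

#if defined(__GNUC__)
/* Threaded variant: each handler ends with its own indirect jump to the
 * next handler, looked up in a table of label addresses. */
static long run_threaded(const int *code)
{
    static void *const targets[OP_COUNT] = {
        [OP_PUSH] = &&do_push,
        [OP_ADD]  = &&do_add,
        [OP_MUL]  = &&do_mul,
        [OP_HALT] = &&do_halt,
    };
    long stack[16];
    size_t sp = 0;

/* Check the opcode, fetch it, and jump straight to its handler. */
#define DISPATCH() \
    do { assert(*code >= 0 && *code < OP_COUNT); goto *targets[*code++]; } while (0)

    DISPATCH();
do_push:
    assert(sp < 16);
    stack[sp++] = *code++;
    DISPATCH();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    DISPATCH();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef DISPATCH
}
#else
/* Without the gcc extension, fall back to the portable switch loop. */
static long run_threaded(const int *code)
{
    return run_switch(code);
}
#endif
```

Both functions take a pointer to an int array of opcodes and operands
ending in OP_HALT and return the top of the stack. Giving each handler
its own indirect jump is generally thought to help the branch predictor,
but this toy says nothing about how much real interpreters gain.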

I've also cleaned up Jakob Sievers' vmgen patch (representing
Forth-style dispatch) a bit so that it passes all the tests, and on the
same machine it got a 13%-17% speedup. The vmgen patch is not quite at
feature parity (it throws out support for LLTRACE and a couple of other
#defines), and there are fairly good arguments against committing it to
Python at all (it requires installing and modifying vmgen to build), but
I'll post it after I've ported it to trunk.

Re skip and paolo: JITting and machine-specific assembly will probably
be important to speeding up Python in the long run, but they'll also
take a long while to get right, so we shouldn't let them distract us
from committing the dispatch optimization.

Added file: http://bugs.python.org/file12687/pitrou_dispatch_2.7.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4753>
_______________________________________