[pypy-dev] Ann: Revival of the BytecodeHacks
tismer at stackless.com
Fri Jul 30 03:08:49 CEST 2004
I could not resist making this announcement, although
this project belongs to Michael Hudson.
Mike, please forgive me. I owe you a beer.
In a single day-and-night session, I hacked on bytecodehacks
to upgrade it from Python 1.5.2 to Python 2.3.
This was quite some work, although not so very much, due to
the excellent basic layout which Michael created.
What are the Bytecodehacks?
The bytecodehacks allow you to do certain modifications to
compiled Python code objects. There are lots of applications
included, like macro expansions and function inlining,
things which Python does not provide and also is not supposed
to provide. This kind of madness exists for people who ask
for it, without worrying those who don't care.
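
To make "modifications to compiled code objects" concrete, here is a minimal sketch. It does not use the bytecodehacks API. It assumes a current CPython (3.8 or later), where code objects have a replace() method, and the names greeting and replace_constant are made up for illustration only.

```python
import types


def greeting():
    return "hello"


def replace_constant(func, old, new):
    """Return a copy of func whose code object uses `new` where it used `old`.

    Hypothetical helper: it only swaps entries in co_consts and leaves the
    bytecode itself untouched.
    """
    code = func.__code__
    new_consts = tuple(
        new if (type(c) is type(old) and c == old) else c
        for c in code.co_consts
    )
    # code.replace() builds a new code object; the original is unchanged.
    new_code = code.replace(co_consts=new_consts)
    return types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )


farewell = replace_constant(greeting, "hello", "goodbye")
```

Calling farewell() now returns the swapped constant, while greeting() behaves as before. Macro expansion and inlining go further and rewrite the instruction stream itself, but the principle is the same: take a code object apart and build a changed one.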
Again, this great package was written by Michael Hudson in early
2000, and to my knowledge, it was never ported to the more
recent Python versions. Michael told me that this is a package
he is no longer too fond of, since it was written in his early days.
He also told me that he is not keen on supporting it so much,
because he would be tempted to give it a whole rewrite.
Now, I'm thinking differently. I got the package to work, after
about 12 hours of hacking, and it simply works. Since I didn't
write the initial version, I have a different relationship to it.
In other words: It is easier to maintain a foreign package than
your own, since you are not married to it.
Why do I dig into foreign areas?
Well, I have enough work with my Stackless package.
Stackless is almost ready. (Almost, in the way your toy
railway is almost ready: it never really will be.)
I just built a minimum Psyco support into it, because
I'm basically always after as much speed as I can get.
But there are limits with the regular Python interpreter.
So my idea is to use these crazy other projects to get more
performance, and to support them directly.
My first idea to accelerate Psyco using Stackless was
to provide Stackless with extra hardware stacks, which can
be switched at light-speed. I still have this idea in mind,
but the implementation is not so trivial.
Comparatively, replacing generators (yield calls) with
a couple of save/restores of tuples *is* almost trivial,
as I'm probably going to show tomorrow.
In Python, these "fake-generators" would be reasonably
But the fact that these are then Psyco-enabled makes
them really, really fast, and also completely inlineable.
I am thinking of naming the module "renegate". :-)
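
The idea of trading a yield for a save/restore of a tuple can be sketched by hand. This is only an illustration of the technique and not the planned module: countdown_step and drain are hypothetical names, and the real thing would do this rewrite on the bytecode instead of in source.

```python
def countdown(n):
    """The ordinary generator version."""
    while n > 0:
        yield n
        n -= 1


def countdown_step(state):
    """Generator-free version: all live locals travel in the `state` tuple.

    Returns (value, new_state), or None when the sequence is exhausted.
    """
    (n,) = state              # restore the saved locals
    if n <= 0:
        return None
    return n, (n - 1,)        # the value plus the saved locals for next time


def drain(step, state):
    """Drive a step function until it signals exhaustion."""
    out = []
    while True:
        result = step(state)
        if result is None:
            break
        value, state = result
        out.append(value)
    return out
```

drain(countdown_step, (3,)) yields the same values as list(countdown(3)), but it consists only of plain function calls and tuples.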
Why do I want to revive this package?
Well, I am a pragmatic guy, and I have a really good reason why I
need the bytecodehacks. I am writing a sophisticated package
which involves parsing of PDF files, and I want to do it all in
Python. In order to get this PDF processor to almost C speed,
I used Armin Rigo's wonderful Psyco package.
Unfortunately, Psyco has a few limitations, which act as a
- generators are not supported. That means, whenever I use
a generator, Psyco will not accelerate it, but will act as a
- Psyco is great at optimizing simple structures like lists, tuples,
numbers and strings. It is less able to enhance things like
object properties. Using self frequently disables almost all
of Psyco's capabilities.
- Psyco has difficulties with inlining. Simple functions *are*
inlined, but when they contain a conditional branch or they
exceed some limit, inlining is disabled. This *could* be changed,
but with a lot of effort by changing C code. This is not going to
happen, because all of this stuff will be enhanced and
re-implemented during the PyPy project.
Now, by combining the re-animated bytecodehacks project with
Psyco, I am almost sure that I can remove certain restrictions
from Psyco, by turning problematic Python structures into simple
ones, which the current Psyco can handle natively.
Poor man's PyPy
Although I am a member of the PyPy project, and I do belong to the
people who initiated the PyPy project, I am impatient, and I want
to get a few of the expected PyPy results right now. Psyco is
fantastic but not perfect, and it needs some help to gather maximum
By adding Bytecodehacks in the right manner, I think I can fill
this gap. With BCH, I can replace generators by ordinary methods
of a class (plus a few bytecode instructions which have no real
Python equivalent, like goto). By inspecting the data flow of a
self.attribute, I can prove that it is invisible outside and
replace it by a simple local variable in many cases. By using
Bytecodehacks for proper inlining of functions, I can relieve
Psyco of this difficult task.
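
Two of these rewrites, done by hand on a toy class, look roughly like this. It is a sketch under assumptions: Summer, SummerTransformed and _acc are invented names, and the transformation is written out in source, whereas the plan is to have it done automatically on the bytecode.

```python
class Summer:
    """Before: attribute access and a helper call inside the loop."""

    def _scale(self, x):
        # Contains a conditional branch, the kind of helper that
        # reportedly does not get inlined.
        return x * 2 if x > 0 else 0

    def run(self, values):
        self._acc = 0                     # only ever used inside run()
        for v in values:
            self._acc += self._scale(v)
        return self._acc


class SummerTransformed:
    """After: _acc has become a local, and _scale is inlined by hand."""

    def run(self, values):
        acc = 0                           # plain local replaces self._acc
        for v in values:
            acc += v * 2 if v > 0 else 0  # body of _scale, inlined
        return acc
```

Both run() methods return the same result for the same input. The second one touches only locals and numbers inside the loop, which is the kind of code the list above says Psyco handles well.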
The expected result
By consistently applying the methods I sketched above,
I expect that I can make almost every existing application
reasonably faster. I will provide this as a service for
customers and charge them for relevant acceleration.
The software will stay open-sourced. These are just a few
add-ons to Psyco and Bytecodehacks, and I'm not the author
of these. I just found out how nicely they can fit together.
My guess is an overall acceleration of at least a factor of
five for almost any native Python application. There is no proof
yet; this is all voodoo from my stomach. But this stomach tends
to be quite reliable.
Mike, please forgive me for this announcement. You should have
written it, but I was so very inspired.
Getting the bytecodehacks for Python 2.3
The current source code is available at
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/bytecodehacks login
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/bytecodehacks
cheers -- chris
Christian Tismer :^) <mailto:tismer at stackless.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/