Ann: Revival of the BytecodeHacks

Christian Tismer tismer at stackless.com
Fri Jul 30 03:08:49 CEST 2004


Dear community,

I could not resist to do this announcement, although
this project belongs to Michael Hudson.
Mike, please forgive me. I owe you a beer.

In a single day-and-night session, I hacked against bytecodehacks,
to upgrade it from Python 1.5.2 to Python 2.3 .
This was quite some work, although not so very much, due to
the excellent basic layout which Michael created.


What are the Bytecodehacks?
---------------------------

The bytecodehacks allow you to do certain modifications to
compiled Python code objects. There are lots of applications
included, like macro expansions and function inlining,
things which Python does not provide and also is not supposed
to provide. This kind of madness exists for people who ask
for it, without worring those who don't care.

Again, this great package was written by Michael Hudson in early
2000, and to my knowledge, it was never ported to the more
recent Python versions. Michael told me that this is a package
he is no longer too fond of, since it was written in his early days.
He also told me that he is not keen on supporting it so much,
because he would be tempted to give it a whole rewrite.

Now, I'm thinking differently. I got the package to work, after
about 12 hours of hacking, and it simply works. Since I didn't
write the initial version, I have a different relationship to it.
In other words: It is easier to maintain a foreign package than
your own, since you are not married with it.


Why do I dig into foreign areas?
--------------------------------

Well, I have enough work with my Stackless package.
Stackless is almost ready. (Almost, like your toy
railway gets really ready; it really never will.)
I just built a minimum Psyco support into it, because
I'm basically always after as much speed as I can get.
But there are limits with the regular Python interpreter.

So my idea is to use these crazy other projects to get more
performance, and to support them directly.
My first idea to accelerate Psyco using Stackless was
to provide Stackless with extra hardware stacks, which can
be switched at light-speed. I still have this idea in mind,
but the implementation is not so trivial.

Comparatively, replacing generators (yield calls) with
a couple of save/restores of tuples *is* almost trivial,
as I'm probably going to show tomorrow.
In Python, these "fake-generators" would be reasonably
slower.
But, by the fact that these are then Psyco-enabled, makes
them really, really fast, and also completely inlineable.
I think to name the module "renegate". :-)


Why do I want to revive this package
------------------------------------
Well, I am a pragmatic guy, and I have a really good reason why I
need the bytecodehacks. I am writing a sophisticated package
which involves parsing of PDF files, and I want to do it all in
Python. In order to get this PDF processor to almost C speed,
I used Armin Rigo's wonderful Psyco package.
Unfortunately, Psyco has a few limitations, which act as a
show-stopper:

- generators are not supported. That means, whenever I use
   a generator, Psyco will not accelerate it, but will act as a
   small slow-down.

- Psyco is great at optimizing simple structures like lists, tuples,
   numbers and strings. It is less able to enhance things like
   object properties. Using self frequently disables almost all
   of Psyco's capabilities.

- Psyco has difficulties with inlining. Simple functions *are*
   inlined, but when they contain a conditional branch or they
   exceed some limit, inlining is disabled. This *could* be changed,
   but with a lot of effort by changing C code. This is not going to
   happen, because all of this stuff will be enhanced and
   re-implemented during the PyPy project.

Now, by combining the re-animated bytecodehacks project with
Psyco, I am almost sure that I can remove certain restrictions
from Psyco, by turning problematic Python structures into simple
ones, which the current Psyco can handle natively.


Poor man's PyPy
---------------

Although I am a member of the PyPy project, and I do belong to the
people who initiated the PyPy project, I am impatient, and I want
to get a few of the expected PyPy results right now. Psyco is
phantastic but not perfect, and it needs some help to gather maximum 
performance.
By adding Bytecodehacks in the right manner, I think I can fill
this gap. With BCH, I can replace generators by ordinary methods
of a class (plus a few bytecode instructions which have no real
Python equivalent, like goto). By inspecting the data flow of a
self.attribute, I can prove that it is invisible outside and
replace it by a simple local variable in many cases. By using
Bytecodehacks for proper inlining of functions, I can deliver
Psyco from this difficult task.


The expected result
-------------------

By consequently applying the methods I sketched above,
I expect that I can make almost every existing application
reasonably faster. I will provide this as a service for
customers and charge them for relevant acceleration.
The software will stay open-sourced. This is just a few
add-ons to Psyco and Bytecodehacks, and I'm not the author
of these. I just found out how nicely they can fit together.

My guess is an overall acceleration of at least a factor of
five for almost any native Python application. There is no proof
yet, this is all Vodoo from my stomach. But this stomach tends
to be quite reliable

Mike, please forgive me this announcement. You should have
written it, but I was so very inspired.


Getting the bytecodehacks for Python 2.3
----------------------------------------

The current source code is available at

cvs -d:pserver:anonymous at cvs.sourceforge.net:/cvsroot/bytecodehacks login

cvs -z3 -d:pserver:anonymous at cvs.sourceforge.net:/cvsroot/bytecodehacks 
co bytecodehacks

cheers -- chris
-- 
Christian Tismer             :^)   <mailto:tismer at stackless.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




More information about the Python-announce-list mailing list