I could not resist making this announcement, although this project belongs to Michael Hudson. Mike, please forgive me. I owe you a beer.
In a single day-and-night session, I hacked on bytecodehacks to upgrade it from Python 1.5.2 to Python 2.3. This was quite some work, although less than it might have been, thanks to the excellent basic layout which Michael created.
What are the Bytecodehacks?
---------------------------
The bytecodehacks allow you to make certain modifications to compiled Python code objects. There are lots of applications included, like macro expansion and function inlining, things which Python does not provide and also is not supposed to provide. This kind of madness exists for people who ask for it, without worrying those who don't care.
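To give a feel for what "modifying a compiled code object" means, here is a minimal sketch. It does not use the bytecodehacks API; it assumes a current CPython (3.8 or later) and relies only on the standard `types` module and the `replace()` method of code objects. The names `greet` and `patched` are made up for illustration.

```python
import types


def greet():
    return "hello"


# Take the compiled code object and build a copy with one constant swapped.
# code.replace() is a CPython 3.8+ feature, not part of bytecodehacks.
old_code = greet.__code__
new_consts = tuple(
    "goodbye" if const == "hello" else const
    for const in old_code.co_consts
)
new_code = old_code.replace(co_consts=new_consts)

# Wrap the patched code object in a new function; the original is untouched.
patched = types.FunctionType(new_code, greet.__globals__, "patched")

print(greet())
print(patched())
```

The bytecode itself is unchanged here; only the constant table it refers to is different. Rewriting the instruction stream (as inlining or macro expansion must) is the harder part that a package like bytecodehacks takes care of.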
Again, this great package was written by Michael Hudson in early 2000, and to my knowledge, it was never ported to more recent Python versions. Michael told me that this is a package he is no longer too fond of, since it was written in his early days. He also told me that he is not very keen on supporting it, because he would be tempted to give it a whole rewrite.
Now, I think differently. I got the package to work after about 12 hours of hacking, and it simply works. Since I didn't write the initial version, I have a different relationship to it. In other words: it is easier to maintain a foreign package than your own, since you are not married to it.
Why do I dig into foreign areas?
--------------------------------
Well, I have enough work with my Stackless package. Stackless is almost ready. (Almost, the way your toy railway is almost ready; it never really will be.) I just built minimal Psyco support into it, because I'm basically always after as much speed as I can get. But there are limits with the regular Python interpreter.
So my idea is to use these crazy other projects to get more performance, and to support them directly. My first idea to accelerate Psyco using Stackless was to provide Stackless with extra hardware stacks, which can be switched at light-speed. I still have this idea in mind, but the implementation is not so trivial.
By comparison, replacing generators (yield statements) with a couple of save/restores of tuples *is* almost trivial, as I'm probably going to show tomorrow. In plain Python, these "fake-generators" would be noticeably slower. But because they are then Psyco-enabled, they become really, really fast, and also completely inlineable. I am thinking of naming the module "renegate". :-)
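The save/restore idea can be sketched by hand at the source level. This is only an illustration under assumptions: the class `PairsFake`, its `next` method and the helper `drain` are hypothetical names, and an automated tool would do the rewrite on bytecode rather than on source. The state tuple holds a resume point plus the saved locals, which stands in for the "goto" that plain Python lacks.

```python
def pairs_gen(items):
    """Ordinary generator with two yield points per loop pass."""
    for item in items:
        yield item
        yield item * 2


class PairsFake:
    """Hand-written stand-in for pairs_gen that avoids yield."""

    def __init__(self, items):
        # (resume point, loop index, items): the saved "locals".
        self.state = (0, 0, list(items))

    def next(self):
        point, index, items = self.state           # restore
        if point == 0:
            if index >= len(items):
                raise StopIteration
            self.state = (1, index, items)         # save, resume at point 1
            return items[index]
        self.state = (0, index + 1, items)         # save, resume at point 0
        return items[index] * 2


def drain(fake):
    """Collect everything a fake generator produces."""
    results = []
    while True:
        try:
            results.append(fake.next())
        except StopIteration:
            return results


print(drain(PairsFake([1, 2, 3])) == list(pairs_gen([1, 2, 3])))
```

Each call to `next` touches only a tuple and a few locals, which is the kind of simple structure Psyco is described as handling well.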
Why do I want to revive this package
------------------------------------
Well, I am a pragmatic guy, and I have a really good reason why I need the bytecodehacks. I am writing a sophisticated package which involves parsing PDF files, and I want to do it all in Python. In order to get this PDF processor to almost C speed, I used Armin Rigo's wonderful Psyco package. Unfortunately, Psyco has a few limitations, which act as show-stoppers:
- Generators are not supported. That means that whenever I use a generator, Psyco will not accelerate it and will even cause a small slow-down.
- Psyco is great at optimizing simple structures like lists, tuples, numbers and strings. It is less able to enhance things like object properties. Frequent use of self disables almost all of Psyco's capabilities.
- Psyco has difficulties with inlining. Simple functions *are* inlined, but when they contain a conditional branch or exceed some size limit, inlining is disabled. This *could* be changed, but only with a lot of effort in the C code. This is not going to happen, because all of this stuff will be enhanced and re-implemented during the PyPy project.
Now, by combining the re-animated bytecodehacks project with Psyco, I am almost sure that I can remove certain restrictions from Psyco, by turning problematic Python structures into simple ones, which the current Psyco can handle natively.
Poor man's PyPy
---------------
Although I am a member of the PyPy project, and I am one of the people who initiated it, I am impatient, and I want to get a few of the expected PyPy results right now. Psyco is fantastic but not perfect, and it needs some help to reach maximum performance. By adding Bytecodehacks in the right manner, I think I can fill this gap. With BCH, I can replace generators with ordinary methods of a class (plus a few bytecode instructions which have no real Python equivalent, like goto). In many cases, by inspecting the data flow of a self.attribute, I can prove that it is invisible from outside and replace it with a simple local variable. By using Bytecodehacks for proper inlining of functions, I can relieve Psyco of this difficult task.
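The attribute and inlining rewrites can be illustrated with a hand-made before/after pair. This is a source-level sketch only; the real transformation would operate on bytecode. The names `scale`, `Before` and `After` are hypothetical, and the sketch assumes that no code outside `total` ever reads `_acc`, which is exactly what the data-flow inspection would have to prove.

```python
def scale(value):
    """Small helper that the rewritten version inlines."""
    return value * 3 + 1


class Before:
    def total(self, values):
        # Accumulator lives on self and every step calls scale().
        self._acc = 0
        for value in values:
            self._acc += scale(value)
        return self._acc


class After:
    def total(self, values):
        # Assumption: _acc is never read from outside, so a local suffices.
        acc = 0
        for value in values:
            acc += value * 3 + 1   # body of scale() inlined by hand
        return acc


print(Before().total([1, 2, 3]) == After().total([1, 2, 3]))
```

The rewritten loop works only on locals and numbers, with no attribute access and no call, which matches the simple structures listed above as Psyco's strong side.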
The expected result
-------------------
By consistently applying the methods I sketched above, I expect that I can make almost every existing application noticeably faster. I will provide this as a service for customers and charge them for relevant acceleration. The software will stay open source. These are just a few add-ons to Psyco and Bytecodehacks, and I'm not the author of either. I just found out how nicely they can fit together.
My guess is an overall acceleration of at least a factor of five for almost any native Python application. There is no proof yet; this is all voodoo from my gut. But my gut tends to be quite reliable.
Mike, please forgive me for this announcement. You should have written it, but I was so very inspired.
Getting the bytecodehacks for Python 2.3
----------------------------------------
The current source code is available at
cvs -d:pserver:email@example.com:/cvsroot/bytecodehacks login
cvs -z3 -d:pserver:firstname.lastname@example.org:/cvsroot/bytecodehacks co bytecodehacks
cheers -- chris