Bytecode optimisation
Michael Hudson
mwh21 at cam.ac.uk
Tue May 18 04:13:53 EDT 1999
cwebster at math.tamu.edu (Corran Webster) writes:
> I've been mucking around with the bytecodehacks modules of Michael Hudson
> this past weekend, and have used it to hack up an ad hoc bytecode
> optimiser. I'm posting this to stimulate a bit of interest from others,
> since I have zero experience in building optimising compilers - my knowledge
> is all gathered from reading the dragon book.
I'm impressed (and slightly scared - people are actually *using* my
code...).
> This is proof-of-concept stuff at present, I intend a complete
> re-write, and this code almost certainly is incorrect - it has not been
> seriously tested. Use it at your own risk.
Well, that's what I thought the status of bytecodehacks was...
> However, it can:
> * precalculate constants: 8 + 4 gets replaced by 12.
> * remove redundant code: strip SET_LINENO bytecodes and unreachable code.
> * simplify jumps-to-jumps and jumps-to-breaks.
Have you looked at Skip Montanaro's optimizer:
http://www.automatrix.com/~skip/python/spam7/optimizer.html
This does some similar work. Unfortunately is distributed as a patch
against Python 1.4 sources, so I had to hack at it with sed to get the
bits I was interested in out.
> The morals of this hack are:
>
> * Michael's bytecode hacks _enormously_ simplify this sort of thing
> because they track jump destinations, and allow easy editing of code
> objects.
>
> * There's considerable room for peephole type optimization.
>
> * Optimisation is probably only worthwhile when speed is essential.
Isn't this a tautology? Repest after me "Premature optimization is
..."
> * More complex optimisation is certainly possible, although a lot more
> work would be required. I've made a first stab at block-level analysis
> and it seems feasible. In particular it would be nice to detect and
> shift loop invariant code.
The problem is you can't, at least not without potentially changing
the meaning of the code.
You might think
while c < 10:
c = c + a + b
is equivalent to:
d = a + b
while c < 10:
c = c + d
but there's not the least guarantee that a + b return sthe same value
each time.
I was thinking that maybe you could `tag' variables as numeric or
commutative or whatever like so:
def func(x):
numeric(x)
c = 0
while c < 1:
c = c + (x*x - 2) # could be hoisted
Then a bytecodehacks optimizer would strip the tags out but could use
the information so gleaned to perform more sophisticated
optimizations.
> * If type information could be guessed, then more might be able to be
> squeezed. At the moment I can't simplify something like 0 * x,
> because x could be anything, and could call an arbitrary function to
> perform the operation. If it was known that x was an integer, then we
> could safely replace it with 0. Assert statements could be used for
> this sort of thing.
This sort of consideration is the fundamental problem with optimizing
Python.
> * If you really need to make your code run faster, you're still better
> off squeezing the Python source; if that doesn't work, then you
> probably want to re-write in C. However, there is hope for us lazy
> coders in the future.
>
> I must warn again that this code is ugly-mindbending-probably-wrong-pre-
> -alpha-use-it-at-your-own-risk-has-to-be-rewritten-evil-nasty stuff.
>
> But-it-is-fun-to-play-with-ly yours,
> Corran
>
if-you-get-better-than-twenty-percent-speed-up-I'll-be-impressed-ly y'rs
Michael
More information about the Python-list
mailing list