Bytecode optimisation

William Tanksley wtanksle at dolphin.openprojects.net
Mon May 24 20:14:03 EDT 1999


On Sun, 23 May 1999 13:30:36 GMT, Christian Tismer wrote:
>William Tanksley wrote:

>> But _why_?  If I become really interested in speed, I don't want to slave
>> away making my beautiful Python code look like C.  I would rather write
>> in C.

>> And I would rather have _less_ incentive to write in C, and zero incentive
>> to write baroque Python (for speed).  An automated optimizer certainly
>> could handle this.

>I agree. But obvious optimizations are not my goal. A lot of
>things can of course be done on the opcode level.
>My point is that this whole area is bound to an upper
>limit which I know.

Okay.  I'm not harping on you for not wanting to _do_ these things (far
from it; I'm not doing those things).  I'm merely trying to reiterate a
truth about open source projects I myself just recently realized: if
someone else suggests a stupid but cool idea, and then someone else
implements it and it works, then it was _never_ stupid.

Because of this, I believe that rationing other people's work by saying
"that's too hard" is not a good idea, and will backfire.  I'm running a
minor open-source project according to this rule, and so far I'm very happy
with the results -- I've had more code contributed than I myself have
written.  I've also had dumb ideas suggested, and following my own advice
I've placed them on my webpage.

In fact, I've even had one dumb idea get suggested, implemented -- and now
I'm going to add it officially because it turns out to be a good idea
(admittedly, I became convinced that it was good before someone
implemented it, but still...).

>> >The loop opcode will anyway continue to call a sequence_getitem,
>> >whether it knows the type or not.

>> Ah, another place for optimization, eh?

>The real key. This is where the real game starts.

_Certainly_ a lot of fun -- especially since such an optimization has a
potential for being easier to implement, in the sense that there are fewer
cases where making the optimization will make the system behave incorrectly.
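To make the idea concrete, here is a toy sketch (not CPython's actual loop opcode, and the function names are my own invention) of the difference between going through the generic sequence-getitem protocol on every access versus taking a fast path once the type is known:

```python
# Hypothetical sketch: generic getitem dispatch vs. a type-specialized
# fast path, illustrating why a loop opcode that knows the sequence type
# can skip the generic protocol entirely.
import operator

def generic_sum(seq):
    # Generic path: every access goes through the full getitem protocol,
    # re-checking the type on each iteration.
    total = 0
    for i in range(len(seq)):
        total += operator.getitem(seq, i)
    return total

def specialized_sum(seq):
    # Specialized path: check the type once, then iterate directly.
    if type(seq) is list:
        total = 0
        for item in seq:  # direct list iteration, no generic dispatch
            total += item
        return total
    return generic_sum(seq)  # fall back for anything else
```

Both functions compute the same result; the specialized one simply hoists the type decision out of the loop, which is exactly the kind of transformation that is safe as long as the type check is right.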

>> >Guys, you are at the wrong end. This is another complete waste
>> >of time. The whole interpreter overhead is bound to an average
>> >of about 30 percent, when everything is C-code. I'd wonder if
>> >bytecode optimization would give more than 15 percent on
>> >average, when semantics are not changed.

>> 15% is a KILLER speedup for optimization without algorithm changes.
>> Personally, I would consider that a vindication -- but anything less is
>> not obviously a loss.

>I disagree. From Python 1.4 to 1.5, there was a 50 percent
>speed increase. This was worth it.

>15% is a nice result. But, I doubt it would cause
>too many users to upgrade their installation.
>Even after I had tweaked eval_code to run 11% faster
>under Windows, do you think I use it?
>Not worth becoming incompatible with the
>mainstream.

This last sentence contains ALL of your meaning.  "Not worth
incompatibility."  Let's just put it this way: if an optimization causes
incompatibility, there was a bug involved.  Later versions of Python (such
as 2.0) will start improving performance by removing slow features; this
will break compatibility, but the excuse is that the old versions were
misdesigned.

C++ contains an optimization which can break compatibility (non-virtual
functions).

If you've tweaked eval_code to be 11% faster, and it truly is faster, then
we need to consider whether making that tweak general is worth it.  Will
making it portable remove the speed gain?  Will it reduce compatibility in
some way?  If it does, might it be worth putting in 2.0 anyhow?  How big
is it?

>> This is one reason why I brought up token-threading.  Think of it as
>> 16-bit "byte"codes.  16K instruction space...  or maybe even 4G.  Enough
>> to assign a code to every user function in a program (for use only when
>> we're certain, of course).

>I'm not going to build static, huge monsters.
>Optimization doesn't mean speed alone. There must
>be a weighted harmony between speed and size.
>You can't achieve this statically.

You seem to believe that you're saying something connected to what I just
said.  You're not.  Token threading doesn't take up appreciably more space
in the general case than bytecoding.  In some common cases it can take less.
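A toy dispatch loop makes the point; this is my own illustrative sketch, not anyone's proposed implementation, and the opcode names are invented.  With a 16-bit token space, user functions can be bound directly to tokens, so calling one is a single table lookup rather than a name lookup:

```python
# Hypothetical sketch of token threading: every operation -- including
# user functions -- gets a 16-bit token indexing a dispatch table.

handlers = [None] * (1 << 16)  # full 16-bit token space, mostly unused

PUSH, ADD = 0, 1               # a couple of primitive tokens
CALL_USER_BASE = 256           # tokens from here up name user functions

def run(code):
    stack = []
    pc = 0
    while pc < len(code):
        token = code[pc]
        pc += 1
        if token == PUSH:
            stack.append(code[pc])  # inline operand follows the token
            pc += 1
        elif token == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            handlers[token](stack)  # user function: one table lookup
    return stack

# Bind a user function to its own token -- no run-time name resolution.
handlers[CALL_USER_BASE] = lambda stack: stack.append(stack.pop() * 2)
```

Note the space cost: the program itself is still one token per operation, so it is no bigger than bytecode in the common case; only the dispatch table grows, and only as far as it is actually populated.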

I'm not at all sure what you mean by talking about "static".

>> And Anton's finally published his stack-code optimizer at
>> http://www.complang.tuwien.ac.at/projects/rafts.html.  It's not
>> complete, and he hasn't published the source yet (but it's based on a
>> GPLed product), but already it compiles Forth to as much as 80% of C speed
>> (this is a special case, though).

>Interesting. What is the special case? Forth, or the program
>which he can make so fast?

Yes, both.  gForth (GNU's Forth) is the program he can make "so fast"
(he's actually discouraged that it's only 80% of gcc).

The reason he's not happy is that gForth, even when compiled, results in a
substantial amount of threaded code, so neither the C compiler nor the
Forth optimizer can approach their maximum speed.  Well, at least it's
known to produce *some* optimization :).

In order to produce some better figures, he'd have to rewrite a major Forth
program in C, or vice versa.  Not highly interesting :-).  He's occupied
finding better optimizations, such as inlining (Forth typically uses very
small blocks, which makes almost all block-specific optimizations
useless).

Of course, Forth and Python are opposites.  Forth procedure calls are very
fast, almost no overhead; Python procedure calls are enormously slow.
Forth programmers go out of their way to factor; Python programmers are
driven out of their minds inlining ;-).
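The call-overhead point is easy to measure yourself; here is a small timing sketch (my own example, with invented function names) comparing the same arithmetic once factored into a function and once inlined at the call site:

```python
# A small benchmark of Python's function-call overhead: the same work,
# factored into a call vs. inlined in the loop body.
import timeit

def add_one(x):
    return x + 1

def factored(n):
    total = 0
    for _ in range(n):
        total = add_one(total)  # pays full call overhead each iteration
    return total

def inlined(n):
    total = 0
    for _ in range(n):
        total = total + 1       # same work, no call
    return total

t_factored = timeit.timeit(lambda: factored(10_000), number=100)
t_inlined = timeit.timeit(lambda: inlined(10_000), number=100)
# On CPython, the factored version is typically noticeably slower,
# which is exactly the pressure that drives Python programmers to inline.
```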

>> >no-longer-optimizing-for-less-than-50-percent-ly y'rs - chris

>> That's not a wrong attitude.  You'll accept no lesser result than
>> innovation.  However, incremental gains are also valuable, and you
>> shouldn't reject or even discourage those who attempt to make them.

>This was not my intent. I just wanted to point out that I see
>limited chances of gains there, and I'm more looking into
>what has to be done to get special cases *very* fast.
>I'm happy to leave the 15 percent area to others, since
>it will happen anyway, with or without me. 

Ah, that makes sense.  I had misunderstood.

>Just making sure that you will
>not be disappointed, you cannot expect much more than 
>what I estimated.

Good advice.

>good luck - chris

And to you.

-- 
-William "Billy" Tanksley
"But you shall not escape my iambics."
           -- Gaius Valerius Catullus
