[Python-Dev] Bytecode analysis

damien morton dmorton@bitfurnace.com
Tue, 25 Feb 2003 18:17:22 -0500


> From: Skip Montanaro [mailto:skip@pobox.com] 
> 
>     Damien> Ive done a static analysis of the bytecodes from 
> compiling the
>     Damien> python standard library:
> 
>     ...
> 
> Nice input, but I think you should deal with dynamic opcode 
> frequencies. See the #define's in the ceval.c source for how 
> to enable that and the recent thread in c.l.py.  In short, it 
> doesn't really matter how many LOAD_CONST instructions appear 
> in the bytecode stream, only how many of them are executed.  
> While LOAD_CONST is a relatively frequently executed opcode, 
> if I recall correctly, LOAD_FAST is much more frequently 
> executed, and LOAD_CONST is already pretty fast, so adding 
> special opcodes probably won't buy you much and runs the risk 
> of disturbing what locality of reference exists.

I will indeed take a look at dynamic opcode frequencies. This is my
first stab at it, though.

I had originaly taken a look at this from a code compression
perspective. The proposals I made should result in a non-trival
reduction in the size of the compiled bytecodes. This, in turn should
result in _some_ speedup.

As you say, LOAD_FAST is a very frequently occuring instruction, both
statically and dynamically. Reducing it from a 3 byte instruction to a 1
byte instruction in 97% of (static) cases should be an overall good.

Most of the opcodes I proposed could be added without disturbing
locality of reference.

e.g.

switch (op = *p++) {
   ...
   case LOAD_FAST:
      index = (*p++) + (*p++)<<8
      goto LOAD_FAST_MAIN;
	break;
   case LOAD_FAST_0:
   case LOAD_FAST_1:
   case LOAD_FAST_15:
      index = op - LOAD_FAST_0
      LOAD_FAST_MAIN:
	...
      break;

       
}

> The LOAD_IF_FALSE issue was discussed recently (here I 
> think).  The problem is that chained operations like
> 
>     if a < b < c:
>         ...
> 
> require the value be retained.  You could get rid of the 
> separate POP_TOP instruction but it would require some work 
> in the compiler.

Those kind of chained operations are by far the minority (20% of static
cases). They can be handled by a DUP instruction before the JUMP opcode,
or a separate non-consuming JUMP opcode can be created.

> as-always-patches-are-welcome-ly, y'rs,
> 
> Skip
> 

not-quite-ready-to-push-my-hands-in-up-to-the-elbows-ly, yours

Damien