[Python-Dev] use co_flags to identify instruction set

Skip Montanaro <skip@pobox.com>
Tue, 7 Aug 2001 23:36:06 -0500


All this talk of common backends for Python, Perl, and Ruby goaded me into
revisiting the Rattlesnake stuff I laid aside several years ago.  Assuming I
ever get anywhere with it, it would be nice if code objects could
distinguish instruction sets based on a flag in the PyCodeObject struct.
The co_flags field seems to have some room and be more-or-less the right
place for this stuff.  I propose that it be changed from signed to unsigned
int and that three bits be reserved to identify an instruction set.  Eight
possible instruction sets might seem a bit much, but I'd rather have a
little room for growth.  If we count the current instruction set, the
Rattlesnake (register) stuff I've been playing with, and Armin Rigo's Psyco
VM as distinct instruction sets, we've already used three of the possible
eight.  I'm still fiddling with a 1.5.2 code base and am currently only
using one bit in co_flags to distinguish the instruction set, but I do use
it to indirect through a two-element array of function pointers and call the
appropriate variant of eval_code2 (now eval_frame).
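
To make the dispatch idea concrete, here is a minimal Python sketch of a
three-bit instruction-set field in co_flags and a table of evaluators
indexed by it.  The bit position (ISET_SHIFT), the ISET_* constants and the
eval_* function names are all hypothetical, chosen for illustration; the
real code is C and today uses a single bit and a two-element array.

```python
# Hypothetical layout: three bits of co_flags select the instruction set.
# The bit position is an assumption made for this sketch, not CPython's.
ISET_SHIFT = 29
ISET_MASK = 0x7 << ISET_SHIFT

ISET_STACK = 0        # the current stack-based instruction set
ISET_RATTLESNAKE = 1  # register-based instruction set
ISET_PSYCO = 2        # reserved here only to show a third value


def instruction_set(co_flags):
    """Extract the three-bit instruction-set number from co_flags."""
    return (co_flags & ISET_MASK) >> ISET_SHIFT


def with_instruction_set(co_flags, iset):
    """Return co_flags with the instruction-set field set to iset."""
    if not 0 <= iset <= 7:
        raise ValueError("instruction set must fit in three bits")
    return (co_flags & ~ISET_MASK) | (iset << ISET_SHIFT)


def eval_stack(code):
    return "stack"      # stand-in for the existing evaluator


def eval_register(code):
    return "register"   # stand-in for the register-based evaluator


# Indexed by instruction-set number, like the array of function pointers.
EVALUATORS = [eval_stack, eval_register]


def eval_code(co_flags, code=None):
    """Call the evaluator that matches the code object's instruction set."""
    iset = instruction_set(co_flags)
    if iset >= len(EVALUATORS):
        raise NotImplementedError("no evaluator for instruction set %d" % iset)
    return EVALUATORS[iset](code)
```

Existing flag bits pass through untouched, so code objects that never set
the field keep selecting the stack evaluator.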

Just to whet people's appetites a bit...

I'm struggling to get conditional opcodes working at the moment, but have
had pretty good success eliding unnecessary loads and stores in
straight-line blocks of code.  Given this trivial function:

    def f(a):
      b = a + 4
      c = a + b
      return c

The Rattlesnake optimizer can convert it from

    >>    0 LOAD_FAST           0 (a)
          3 LOAD_CONST          1 (4)
          6 BINARY_ADD     
          7 STORE_FAST          1 (b)
         10 LOAD_FAST           0 (a)
         13 LOAD_FAST           1 (b)
         16 BINARY_ADD     
         17 STORE_FAST          2 (c)
         20 LOAD_FAST           2 (c)
         23 RETURN_VALUE   

to

    >>    0 (0336) LOAD_CONST_REG             %r4, 4
          3 (0077) BINARY_ADD_REG          b, a, %r4
          7 (0077) BINARY_ADD_REG            c, a, b
         11 (0075) RETURN_VALUE_REG                c

Needless to say, I expect it to run a bit faster than the original code.
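
One way such a conversion could work for straight-line code is to walk the
stack instructions while tracking, symbolically, which register holds each
stack slot.  The Python sketch below is not the Rattlesnake optimizer; it
is an illustration that handles only the opcodes in the example, and it
ignores hazards such as a local being overwritten while its old value is
still on the stack.  The function name, the tuple encoding and the MOVE_REG
opcode are hypothetical, and the first temporary is numbered 4 only to
match the listing above.

```python
import itertools


def to_register_code(stack_code, first_temp=4):
    """Translate straight-line stack code into three-address register code."""
    out = []      # emitted register instructions (lists, so they can be patched)
    stack = []    # symbolic stack: the register name holding each value
    temps = itertools.count(first_temp)

    for op, arg in stack_code:
        if op == "LOAD_FAST":
            # No code needed: the local already lives in a register.
            stack.append(arg)
        elif op == "LOAD_CONST":
            tmp = "%%r%d" % next(temps)
            out.append(["LOAD_CONST_REG", tmp, arg])
            stack.append(tmp)
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            tmp = "%%r%d" % next(temps)
            out.append(["BINARY_ADD_REG", tmp, left, right])
            stack.append(tmp)
        elif op == "STORE_FAST":
            src = stack.pop()
            if out and src.startswith("%r") and out[-1][1] == src:
                # The value was just computed into a temporary: retarget
                # that instruction at the local and skip the store.
                out[-1][1] = arg
            else:
                out.append(["MOVE_REG", arg, src])  # hypothetical opcode
        elif op == "RETURN_VALUE":
            out.append(["RETURN_VALUE_REG", stack.pop()])
        else:
            raise ValueError("unsupported opcode: %s" % op)
    return [tuple(instr) for instr in out]


# The stack code for f(a) from the first listing.
STACK_CODE = [
    ("LOAD_FAST", "a"), ("LOAD_CONST", 4), ("BINARY_ADD", None),
    ("STORE_FAST", "b"), ("LOAD_FAST", "a"), ("LOAD_FAST", "b"),
    ("BINARY_ADD", None), ("STORE_FAST", "c"), ("LOAD_FAST", "c"),
    ("RETURN_VALUE", None),
]

for instr in to_register_code(STACK_CODE):
    print(instr)
```

The ten stack instructions collapse to four because every LOAD_FAST
disappears and each STORE_FAST is folded into the instruction that
produced the value.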

Rattlesnake takes advantage of a property of frame objects that Tim pointed
out to me a long time ago, namely that the frame's locals and its temporary
stack space are contiguous and can just be treated as a single register
file.  In fact, the code above can just as easily be written as

    >>    0 (0336) LOAD_CONST_REG                 %r3, 4
          3 (0077) BINARY_ADD_REG          %r1, %r0, %r3
          7 (0077) BINARY_ADD_REG          %r2, %r0, %r1
         11 (0075) RETURN_VALUE_REG                  %r2
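
The single-register-file view can be mimicked in a few lines of Python: one
list holds the locals followed by the temporary slots, and every operand is
just an index into it.  This is only a sketch of the idea, not the C
evaluator; run_register_code, the tuple encoding and the nlocals/stacksize
parameters are assumptions made for the example.

```python
def run_register_code(code, args, nlocals, stacksize):
    """Interpret register code over one contiguous register file."""
    # Locals come first, then the slots that would otherwise be the
    # evaluation stack; together they form registers 0..n-1.
    regs = list(args) + [None] * (nlocals - len(args) + stacksize)
    for instr in code:
        op = instr[0]
        if op == "LOAD_CONST_REG":
            _, dst, const = instr
            regs[dst] = const
        elif op == "BINARY_ADD_REG":
            _, dst, left, right = instr
            regs[dst] = regs[left] + regs[right]
        elif op == "RETURN_VALUE_REG":
            return regs[instr[1]]
        else:
            raise ValueError("unsupported opcode: %s" % op)
    return None


# f(a) from the listing above: a=%r0, b=%r1, c=%r2, temporary=%r3.
F_CODE = [
    ("LOAD_CONST_REG", 3, 4),
    ("BINARY_ADD_REG", 1, 0, 3),
    ("BINARY_ADD_REG", 2, 0, 1),
    ("RETURN_VALUE_REG", 2),
]

print(run_register_code(F_CODE, [10], nlocals=3, stacksize=2))  # prints 24
```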

Skip