[Python-Dev] Why is Bytecode the way it is?

Thu Jul 8 15:29:16 CEST 2004

Michael Hudson wrote:

>...
> 
> Well, you'd have to have RETURN_VAR *as well* as RETURN_VALUE, or in
> code like 
> 
>     return a + b
> 
> you'd have to create a temporary variable and store to it.
> 
> We have a limit of 256 opcodes...

First, I don't see anything wrong with temporary variables. You need to 
keep track of how many you use, I suppose, so the compiler needs to be a 
little bit smarter.
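
To make the cost of a temporary concrete, here is a minimal sketch that uses the standard `dis` module to list the opcodes for both forms. The function names `direct` and `with_temp` and the helper `opnames` are made up for illustration, and the exact opcode names printed depend on the interpreter version, so treat the listing as indicative only.

```python
import dis


def direct(a, b):
    # The sum stays on the evaluation stack until it is returned.
    return a + b


def with_temp(a, b):
    # The sum is stored to a local slot first, then loaded again.
    t = a + b
    return t


def opnames(func):
    """Return the opcode names of func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


direct_ops = opnames(direct)
temp_ops = opnames(with_temp)

print(direct_ops)
print(temp_ops)
```

On current CPython the second listing is longer because of the store and load for `t`, which is the overhead a RETURN_VAR-only design would add to `return a + b`.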

Second, some opcodes seem wasted to my naive eyes, e.g. PRINT_NEWLINE or 
all of the in-place variants of the mathematical operators.

Third, you yourself came up with a hack that would allow the same opcode 
to work on variables or the stack using "-1" as the variable index.

>>Similarly, what if BINARY_ADD could work directly on constants and
>>variables? I see the virtue of using the stack for objects that do not
>>otherwise have a name. But if a value is in a constant or variable, why
>>not refer to it by its position in co_consts or co_varnames?
> 
> How would you implement this?  Give BINARY_ADD two arguments
> (something no bytecode has now, btw) and treat '-1' as 'pop from the
> stack'?  This sounds obfuscatory.

If there is anywhere in the Python implementation where you trade some 
"readability" for some performance (and surely there are!) then wouldn't 
the bytecode be the place? I mean, bytecodes are *byte* *codes*. They 
aren't tuples of pointers to nice pretty objects. They are a list of 
bytes that are Python's equivalent to assembly language.
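
The two-operand BINARY_ADD with -1 meaning "pop from the stack" can be sketched as a toy interpreter. Everything here is hypothetical: the opcode numbering, the `fetch` and `run` helpers, and the use of tuples for instructions (chosen for readability, not because real bytecode looks like this). It only shows that one opcode can serve both the stack and the variable case.

```python
FROM_STACK = -1  # sentinel operand: take the value from the stack

# Hypothetical opcodes for this sketch only; not CPython's numbering.
LOAD_FAST, BINARY_ADD, RETURN_VALUE = range(3)


def fetch(operand, local_vars, stack):
    """Read a local by index, or pop the stack when operand is -1."""
    if operand == FROM_STACK:
        return stack.pop()
    return local_vars[operand]


def run(code, local_vars):
    """Execute a list of (opcode, arg1, arg2) tuples and return the result."""
    stack = []
    for op, arg1, arg2 in code:
        if op == LOAD_FAST:
            stack.append(local_vars[arg1])
        elif op == BINARY_ADD:
            # Fetch the right operand first so that two stack operands
            # come off in the usual order.
            right = fetch(arg2, local_vars, stack)
            left = fetch(arg1, local_vars, stack)
            stack.append(left + right)
        elif op == RETURN_VALUE:
            return stack.pop()
        else:
            raise ValueError("unknown opcode %r" % (op,))
    raise RuntimeError("code ended without RETURN_VALUE")


# return a + b, pure stack style: four instructions.
stack_only = [
    (LOAD_FAST, 0, None),
    (LOAD_FAST, 1, None),
    (BINARY_ADD, FROM_STACK, FROM_STACK),
    (RETURN_VALUE, None, None),
]

# return a + b, operands named directly: two instructions.
hybrid = [
    (BINARY_ADD, 0, 1),
    (RETURN_VALUE, None, None),
]

# return a + b + c, mixing both: the intermediate sum has no name,
# so it travels on the stack while c is read by index.
mixed = [
    (BINARY_ADD, 0, 1),
    (BINARY_ADD, FROM_STACK, 2),
    (RETURN_VALUE, None, None),
]

print(run(stack_only, [2, 3]), run(hybrid, [2, 3]), run(mixed, [2, 3, 4]))
```

The hybrid form dispatches fewer instructions, at the price of a branch inside `fetch` on every operand; whether that is a net win in a real VM is exactly the open question.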

>>And as long as we are talking about referring to things more directly,
>>wouldn't it be possible to refer to constants by pointer rather than
>>indirecting through the index? You'd have to fix up pointers when you
>>first loaded the code but only once per function.
> 
> Could do.  Opcode arguments are only 16 bits though, unless you use
> the EXTENDED_ARG thingy, and then they're only 32 bits: what about 64
> bit platforms?

You would have to extend the bytecode format.
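
A rough sketch of why, assuming the layout described above: one opcode byte followed by a 16-bit little-endian argument, with EXTENDED_ARG supplying the high 16 bits. The opcode numbers, the `encode` helper and the sample address are illustrative assumptions, not taken from the interpreter source.

```python
import struct

# Illustrative opcode numbers for this sketch; treat them as assumptions.
LOAD_CONST = 100
EXTENDED_ARG = 143


def encode(opcode, arg):
    """Encode one instruction: opcode byte plus 16-bit little-endian arg.

    Arguments wider than 16 bits get an EXTENDED_ARG prefix carrying the
    high 16 bits; anything wider than 32 bits does not fit at all.
    """
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument does not fit in 32 bits")
    low = struct.pack("<BH", opcode, arg & 0xFFFF)
    if arg <= 0xFFFF:
        return low
    return struct.pack("<BH", EXTENDED_ARG, arg >> 16) + low


index_form = encode(LOAD_CONST, 5)        # small index into co_consts
wide_form = encode(LOAD_CONST, 0x12345)   # needs the EXTENDED_ARG prefix
pointer = 0x7F00DEADBEEF                  # made-up address on a 64-bit box

print(len(index_form), len(wide_form))

try:
    encode(LOAD_CONST, pointer)
except ValueError as exc:
    print("pointer rejected:", exc)
```

An index costs three bytes, a 32-bit value six, and a 64-bit pointer cannot be expressed without a wider argument field or a second level of prefixing.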

> Python's VM is currently a stack machine.  There are arguments for
> making it a register machine, but if we want to do that, let's go the
> whole hog and not have some kind of half-assed hybrid.

I'm not really talking about a register machine either. I don't 
understand why you would want to copy values from a heap in "main 
memory" into a register *still in main memory*, have the bytecodes 
operate on them and store to a register, and then copy the result back 
to main memory.

Perhaps we take the CPU analogy too far. Or perhaps there is something 
deep I misunderstand.

  Paul Prescod