Three-operand bytecode

Sat Sep 16 12:20:57 EDT 2000

Has anyone ever considered implementing a non-stack bytecode version
of Python?  Not to be confused with "stackless python", this would be
a system where Python bytecode workes more like a modern processor by
accessing objects through registers rather than via the Python stack.
A few recent posts on comp.compilers have implied that the elimintation
of stack operations in the bytecode interpreter can provide a substantial
benefit in execution speed.

Is there any reason to believe that this would or would not be true of Python?

The scheme would have 32-bit bytecodes, 16-bit indices into the
global/local/attribute arrays, and 256 "registers":

8		8	16
LOAD_xxx	DREG	N	# xxx = FAST, GLOBAL, ...
STORE_xxx	SREG	N	

8		8	8	8
UNARY_xxx	DREG	SREG	0

8		8	8	8
BINARY_XXX	DREG	SREG1	SREG2

The obvious problems I've seen are with the opcodes such as BUILD_TUPLE,
CALL_FUNCTION, and others whose argument is actually a number of stack items
to be consumed by the operation.  An approach which adds a new STORE_xxx
came to mind---call it STORE_STAGING:
	LOAD_IMM	r1,	1
	STORE_STAGING	r1,	0
	LOAD_IMM	r1,	2
	STORE_STAGING	r1,	1
	BUILD_TUPLE	r1,	2
The above code would be equivalent to the python code '(1,2)'

However, this won't wouldn't work for nested tuples such as '(1,(2,3))'
with a naive translation approach.

I also just realized that LOAD_ATTR and STORE_ATTR don't quite fit into
this scheme, since they have two and three 16-bit inputs respectively.
The sequence
	LOAD_FAST	obj
	LOAD_ATTR	"attr"
could be
	LOAD_FAST	r1, obj
	LOAD_ATTR	r1, "attr"   # r1 is used as a source and a result
but STORE_ATTR can't be rewritten in a similar way...