[Python-Dev] Re: native code compiler? (or, OCaml vs. Python)

Guido van Rossum guido@python.org
Mon, 03 Feb 2003 15:37:04 -0500


> As Guido mentioned, inline builtins.  I have a really simple patch
> which does this, i.e., LOAD_GLOBAL(name) -> LOAD_CONST(builtin_function).
> However, there is a problem with the patch: the resulting code
> can't be marshalled properly.  I haven't tried to fix that yet.

That's because you can't marshal references to built-in function
objects.
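
A minimal sketch of the limitation, using only the standard marshal
module.  The helper name can_marshal is made up for this example; the
point is that a constants tuple holding a built-in function object is
rejected, while ordinary constants are fine.

```python
import marshal


def can_marshal(obj):
    """Return True if marshal can serialize obj, False otherwise."""
    try:
        marshal.dumps(obj)
    except ValueError:
        # marshal raises ValueError for objects it cannot serialize
        return False
    return True


# Stand-ins for a code object's constants (hypothetical examples).
plain_constants = (1, "abc", None)
with_builtin = (1, len)  # what LOAD_CONST(builtin_function) would need

print(can_marshal(plain_constants))
print(can_marshal(with_builtin))
```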

I was thinking of adding appropriate new opcodes for a few builtins
that are called a lot, like len.  This would be implemented using
something like this:

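		case BUILTIN_LEN:
		    {
			long n;
			v = POP();	/* the object whose length is wanted */
			n = PyObject_Size(v);
			Py_DECREF(v);
			if (n >= 0) {
				x = PyInt_FromLong(n);
				if (x != NULL) {
					PUSH(x);
					continue;	/* fast path: skip the error checks */
				}
				/* x == NULL: fall out to the error handling */
			}
			else {
				err = n;	/* PyObject_Size failed; exception is set */
			}
			break;
		    }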
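
To make the saving concrete, here is a rough sketch in Python of a toy
stack machine (not the real eval loop).  The run function, the opcode
spellings, and the two instruction lists are all assumptions made for
this example.  It only shows that a dedicated opcode avoids the global
lookup and the generic call.

```python
def run(code, env):
    """Run (opcode, argument) pairs; return (result, instructions executed)."""
    stack = []
    for op, arg in code:
        if op == "LOAD_GLOBAL":
            stack.append(env[arg])          # name lookup at run time
        elif op == "LOAD_CONST":
            stack.append(arg)
        elif op == "CALL_FUNCTION":
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))       # generic call machinery
        elif op == "BUILTIN_LEN":
            stack.append(len(stack.pop()))  # no lookup, no generic call
        else:
            raise ValueError(op)
    return stack.pop(), len(code)


generic = [("LOAD_GLOBAL", "len"), ("LOAD_CONST", [1, 2, 3]), ("CALL_FUNCTION", 1)]
special = [("LOAD_CONST", [1, 2, 3]), ("BUILTIN_LEN", None)]

print(run(generic, {"len": len}))
print(run(special, {}))
```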

> Right now JUMP_IF_(TRUE, FALSE) keep their computed value on the
> stack.  They are always followed by POP_TOP.  If the JUMP_IF_* removed
> the value, it would be one less trip through the eval_frame loop (no
> POP_TOP).  I've got a patch for this which fixes 5 of the 8 cases
> where JUMP_IF_* are generated.  The problem with the remaining 3 cases
> is that a jump to a jump occurs.  By removing the jump to a jump, that
> should also help performance.

Yes.
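
A small sketch of the difference, again on a toy interpreter written in
Python.  The opcode name POP_JUMP_IF_FALSE, the LOAD_VALUE and
RETURN_CONST opcodes, and the two instruction lists are assumptions for
this example; they stand for "if value: return 'yes' else: return 'no'".
The step counter plays the role of trips through the eval loop.

```python
def run(code, value):
    """Execute toy bytecode; return (result, number of dispatch steps)."""
    stack = []
    pc = 0
    steps = 0
    while True:
        op, arg = code[pc]
        pc += 1
        steps += 1
        if op == "LOAD_VALUE":
            stack.append(value)
        elif op == "JUMP_IF_FALSE":
            if not stack[-1]:      # value stays on the stack
                pc = arg
        elif op == "POP_JUMP_IF_FALSE":
            if not stack.pop():    # value is removed by the jump itself
                pc = arg
        elif op == "POP_TOP":
            stack.pop()
        elif op == "RETURN_CONST":
            return arg, steps
        else:
            raise ValueError(op)


# Current scheme: every JUMP_IF_FALSE needs a POP_TOP on both paths.
OLD = [
    ("LOAD_VALUE", None),
    ("JUMP_IF_FALSE", 4),
    ("POP_TOP", None),
    ("RETURN_CONST", "yes"),
    ("POP_TOP", None),
    ("RETURN_CONST", "no"),
]

# Proposed scheme: the jump pops the value, so no POP_TOP is needed.
NEW = [
    ("LOAD_VALUE", None),
    ("POP_JUMP_IF_FALSE", 3),
    ("RETURN_CONST", "yes"),
    ("RETURN_CONST", "no"),
]

for flag in (True, False):
    print(flag, run(OLD, flag), run(NEW, flag))
```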

I think there's also an inefficiency in the bytecode generated for
'not': instead of generating a UNARY_NOT opcode, it could switch the
sense of the test (changing JUMP_IF_FALSE into JUMP_IF_TRUE and vice
versa).
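
Roughly, as a peephole pass over a toy instruction list (a sketch in
Python, not the compiler's actual code).  The function name fold_not
and the symbolic jump labels are assumptions for this example, and it
assumes the tested value is discarded right after the jump, so only its
truth matters.

```python
INVERTED = {
    "JUMP_IF_FALSE": "JUMP_IF_TRUE",
    "JUMP_IF_TRUE": "JUMP_IF_FALSE",
}


def fold_not(code):
    """Replace UNARY_NOT followed by a conditional jump with the inverted jump."""
    out = []
    i = 0
    while i < len(code):
        op, arg = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if op == "UNARY_NOT" and nxt is not None and nxt[0] in INVERTED:
            # Drop the UNARY_NOT and flip the sense of the test instead.
            out.append((INVERTED[nxt[0]], nxt[1]))
            i += 2
        else:
            out.append((op, arg))
            i += 1
    return out


# Toy code for the test in "if not x: ..."; "else_branch" is a symbolic label.
before = [
    ("LOAD_NAME", "x"),
    ("UNARY_NOT", None),
    ("JUMP_IF_FALSE", "else_branch"),
    ("POP_TOP", None),
]

for instruction in fold_not(before):
    print(instruction)
```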

> I have a crazy idea that removing the switch and making our own
> jump table in the eval_frame loop could improve performance.
> But I've never tried this because it's a lot of work.  And it
> could hurt, not help performance.

The only way to find out is to try it. ;-)
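
For what it's worth, the shape of the idea can be sketched in Python as
a table of handlers indexed by opcode, replacing a chain of
comparisons.  This is only an analogy for a C jump table: the opcode
numbers, handler names, and run function are made up for this example,
and it says nothing about whether the C version would be faster.

```python
OP_PUSH, OP_ADD, OP_MUL = range(3)


def op_push(stack, arg):
    stack.append(arg)


def op_add(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def op_mul(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


# The "jump table": position in the list is the opcode number.
HANDLERS = [op_push, op_add, op_mul]


def run(code):
    """Dispatch each instruction through the table instead of if/elif."""
    stack = []
    for op, arg in code:
        HANDLERS[op](stack, arg)
    return stack.pop()


program = [(OP_PUSH, 2), (OP_PUSH, 3), (OP_ADD, None), (OP_PUSH, 4), (OP_MUL, None)]
print(run(program))
```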

--Guido van Rossum (home page: http://www.python.org/~guido/)