[Python-Dev] optimizing non-local object access

Jeff Epler jepler@inetnebr.com
Thu, 9 Aug 2001 15:05:51 -0500

On Thu, Aug 09, 2001 at 01:05:28PM -0500, Skip Montanaro wrote:
> Yeah, but all you've really done is extract the current inline test for int
> arguments into a few more PyVM instructions.  You've sacrificed a fairly
> fast special case test for more passes around the interpreter loop.  I'm not
> sure how this can be a win in the current virtual machine.

The real gain comes when you can hoist the type test out of the loop
(though that means proving the type condition is an invariant of the

For instance, in the following, the type of x is invariant over the
loop, and the test can be moved entirely outside the loop.  Now, the
loop itself can operate on ints, and hopefully benefit from it (in the
most extreme case, the loop can become about 2 CPU (not VM) instructions
per line, given even a fairly rudimentary JIT.  Add a bit more to catch
the overflow in the 'x*3+1' expression (or even the 'r+1' expression),
whether to raise ValueError or silently become a long integer and bail
back to the slow version.

def f(x):
    """ The famous "hailstone" problem, assuming my memory is right ---
        nobody seems to know how to predict #iterations or highest
	intermediate value of x from the initial value, or has even
	proven that the algorithm completes for all values of x """

    r = 0
    while x > 1:
        r = r + 1
        if x%2 == 0:
            x=x/2   # Truncating integer division
    return r        

def f(x):
    if isinstance(x, int):
        """run fully native code with x and r known to be platform ints"""
        """run python bytecode"""

The "hard" parts, as I see them, are detecting/deducing these type
invariants, even given hints from the programmer about the types of
parameters (a la 'f = optimize(f, (int,))'), and having a way to 
"bail" to the original bytecode version.  You could have
	if isinstance(x, int):
			""" fully native version """
		except ValueError: pass
	""" run python bytecode if there was no optimized version,
	    or this x happened to overflow platform int """
but that would not work if the code makes calls that have side-effects.
(mutate objects, produce output, ...)  Transmeta has patented their
methods to do this, in their dynamic binary translating CPUs (not that
their approaches are likely to have much relevance for the Python VM).