[pypy-dev] Object model

VanL vlindberg at verio.net
Thu Feb 13 18:48:46 CET 2003

I am trying to make sure I understand this correctly...  Please bear 
with me if I am being slow.  I come from more of a hardware than 
software background, so I am just trying to apply the concepts to what I 
know.

I think you are envisioning a model interpreter that works (more or 
less) like the CPU works in a computer.  

CPUs, especially RISCy ones, usually work with essentially three things:

1. Operators (like iadd, fadd, imul, fdiv, or, xor, etc)
2. Registers
3. Immediate values

In normal operation, the CPU has no idea what the values in two 
registers refer to; all it knows is that it has been told to add the 
contents of register A to register B and store the result in register C.
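The register analogy can be sketched in a few lines of Python.  This is purely illustrative (none of these names come from PyPy): a dispatch loop that only routes opaque boxes into "functional units", never looking inside them itself.

```python
# Hypothetical sketch: a main loop that treats operands as opaque
# "black boxes", like a CPU moving register contents into functional
# units.  All names here are invented for illustration.

class Box:
    """Opaque wrapper; the dispatch loop never inspects .value."""
    def __init__(self, value):
        self.value = value

# "Functional units": the only code allowed to open the boxes.
def unit_add(a, b):
    return Box(a.value + b.value)

def unit_mul(a, b):
    return Box(a.value * b.value)

UNITS = {"add": unit_add, "mul": unit_mul}

def dispatch(program, registers):
    # program: list of (opcode, dest, src1, src2) tuples.
    # The loop only moves boxes between registers and units.
    for op, dest, src1, src2 in program:
        registers[dest] = UNITS[op](registers[src1], registers[src2])
    return registers

regs = {"A": Box(2), "B": Box(3), "C": None}
regs = dispatch([("add", "C", "A", "B")], regs)
# regs["C"].value == 5
```

The point of the sketch is that `dispatch` would be identical no matter what the boxes contain; only the functional units would change.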

I see this as analogous to your "black box" concept:  the 
registers/pyobjects are the black boxes, and all the main loop has to do 
is stuff the black boxes into the correct functional units as fast as it 
can.

The immediate values of a CPU would be analogous to your unwrapped 
objects -- if I understand correctly, you would wrap them in a black box 
before stuffing them into the functional units, for the purpose of 
keeping the functional units specialized.
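The wrap-before-dispatch step for immediates might look like this (again a hypothetical sketch, with invented names): an "add immediate" becomes a plain add once the immediate value is boxed.

```python
# Hypothetical sketch: wrapping an immediate (unwrapped) value into
# a black box so the functional unit only ever sees boxes.

class Box:
    def __init__(self, value):
        self.value = value

def wrap(immediate):
    """Put a raw immediate value into a black box before it
    reaches a functional unit."""
    return Box(immediate)

def unit_add(a, b):
    # The unit never distinguishes register contents from wrapped
    # immediates; both arrive as boxes.
    return Box(a.value + b.value)

reg_a = Box(40)
result = unit_add(reg_a, wrap(2))   # "add immediate": wrap first
# result.value == 42
```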

If this interpretation is correct, it suggests that a lot of the 
traditional "hardware" optimization techniques might be usable. 
However, the compiler would also need to be pretty smart to order 
instructions correctly.

It also might suggest a bootstrapping procedure:  

1. Start with the bytecode, rather than the compiler.  Treat the 
bytecode as "machine code" that we execute, and CPython as the compiler.
2.  Put together the load and dispatch logic in Python.  There may need 
to be a function decomposition step, where "big" bytecode instructions 
are split apart into simpler, single-step functions. Create the 
"functional units" so that initially they use CPython to actually 
perform the operation.
3. Port over the functional units so that they create and execute 
machine code, rather than relying on CPython to do the work.  When all 
the functional units are ported over, PyPy v 1.0 is complete.
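Step 2 of the bootstrap above might be sketched like this (hypothetical names throughout; the real opcode set and stack discipline would come from CPython's bytecode): functional units that initially just delegate to CPython, here via the `operator` module, so the dispatch machinery can be tested before any unit is ported to emit machine code.

```python
# Hypothetical sketch of bootstrap step 2: "functional units" that
# delegate to CPython to do the real work.  In step 3, each entry
# would be replaced by a unit that creates and executes machine code.
import operator

FUNCTIONAL_UNITS = {
    "BINARY_ADD": operator.add,        # CPython does the work for now
    "BINARY_MULTIPLY": operator.mul,
    "BINARY_SUBTRACT": operator.sub,
}

def execute(bytecode, stack):
    # Treat the bytecode as the "machine code" we execute directly.
    for instr in bytecode:
        unit = FUNCTIONAL_UNITS[instr]
        rhs = stack.pop()
        lhs = stack.pop()
        stack.append(unit(lhs, rhs))
    return stack

stack = [2, 3, 4]
execute(["BINARY_MULTIPLY", "BINARY_ADD"], stack)
# stack == [14], i.e. 2 + 3*4
```

Porting would then mean swapping entries in `FUNCTIONAL_UNITS` one at a time, without touching `execute`.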

This would also allow things like trace caches and conditional 
execution....  Using a register-based Python VM might also speed the way 
to a native-code compiler.  In any case, I think that this could be made 
quite fast.

Once again, pardon me if this is a wild misinterpretation of what you 
were saying; as I said, I am a hardware guy who is just starting to get 
into compiler theory.


Armin Rigo wrote:

>Hello everybody,
>I have been experimenting a bit with that distinction I tried to draw between
>"application-level" and "interpreter-level" objects.  I still think the idea
>is essential, but I have "reorganized" my thoughts a little bit.
>Now I believe in following the idea that CPython's main loop does not assume
>anything at all about the PyObjects it manipulates, but does every operation
>via calls to library functions like PyNumber_Add().  All it does is moving
>PyObjects around, mainly between the stack and the locals.  Similarly, our
>own main loop should not assume anything about the objects it stores in the
>stack and the locals, not even that they have some well-known methods.  
>Instead, we would provide it with the equivalent of PyNumber_Add() & co.
>Now the fun part of it is that there is no reason to use only one fixed set of
>functions to do the operations.  There are several things we want to be able
>to do with our interpreter, and these will probably correspond to different
>ways of "seeing" what an object is.  We might try several ways to implement
>the objects, for example trying different implementations for what the
>application will always see as a list.  And even, we might want to have
>several concurrent "object spaces" existing at the same time, e.g. to
>implement the idea of multiple VMs in the same process.
>Thus it seems natural to define a class ObjectSpace, that defines methods like
>PyNumber_Add() and PySequence_GetItem().  Each frame would have an
>''objectspace'' attribute that holds the space in which it runs.  (Maybe
>shorter, more Pythonic names are better, e.g. add() and getitem().)
>Thus an object space is:
> * a way to implement objects;
> * from the interpreter's point of view, it is the "library" needed to do
>any operation on the objects;
> * from the application's point of view, it is essentially invisible.
>An object space would provide methods such as:
> * add(x,y), getitem(x,y), etc. for the operations, that take two "objects" 
>(black boxes for the interpreter) and return an "object" (black box too).
> * type(x), taking a black box object and returning a black box object as
>well.
> * wrap(a), taking a *normal* object and putting it into a black box.  This is
>what the interpreter does for a LOAD_CONST, for example: it has a real Python
>object and wants to send it to the object space.  For example, if the object
>space has a custom implementation "class MyList" for lists, then wrap([1,2,3])
>should create a MyList instance.
> * unwrap(x) is the inverse operation.  Used by the interpreter in the rare
>cases where it really needs to observe the object.  For example, for a
>conditional jump, after it obtained the truth-value with (say) a call to the
>truth() method, it must pull the result out of the object space to know
>whether it is really False or True.
>For example, the straightforward object space that just uses real Python 
>objects to implement themselves would be defined like that:
>class BorrowingObjectSpace(ObjectSpace):
>    def wrap(self, x):
>        return x
>    def unwrap(self, x):
>        return x
>    def add(self, x, y):
>        return x+y
>    def type(self, x):
>        return type(x)
>    ...
>Cool things can be imagined with other kinds of object spaces.  We may even
>think about a distributed interpreter, running on one machine with object
>spaces that actually reside on other machines!  With this perspective, wrap(x)
>sends x to the remote machine and returns a handle to it, and unwrap(x)
>downloads from the remote machine the object whose handle is x.
>We may even have several concurrently-running object spaces that could 
>communicate, e.g. running a frame in a different object space and marshalling 
>arguments and the return value!
>Looks like a cool point of view, doesn't it?
>A bientôt,
>pypy-dev at codespeak.net

More information about the Pypy-dev mailing list