[pypy-dev] Object model

Thu Feb 13 22:26:37 CET 2003

From: "Armin Rigo" <arigo at tunes.org>
> Hello everybody,
>
> I have been experimenting a bit about that distinction I tried to draw
> between "application-level" and "interpreter-level" objects.  I still think
> the idea
> to be essential but I "reorganized" my thoughts a little bit.
>
> Now I believe in following the idea that CPython's main loop does not assume
> anything at all about the PyObjects it manipulates, but does every operation
> via calls to library functions like PyNumber_Add().  All it does is moving
> PyObjects around, mainly between the stack and the locals.  Similarly, our
> own main loop should not assume anything about the objects it stores in the
> stack and the locals, not even that they have some well-known methods.
> Instead, we would provide it with the equivalent of PyNumber_Add() & co.
>
> Now the fun part of it is that there is no reason to use only one fixed set
> of functions to do the operations.  There are several things we want to be
> able
> to do with our interpreter, and these will probably correspond to different
> ways of "seeing" what an object is.  We might try several ways to implement
> the objects, for example trying different implementations for what the
> application will always see as a list.  And even, we might want to have
> several concurrent "object spaces" existing at the same time, e.g. to
> implement the idea of multiple VMs in the same process.
>
> Thus it seems natural to define a class ObjectSpace, that defines methods
> like PyNumber_Add() and PySequence_GetItem().  Each frame would have an
> ''objectspace'' attribute that holds the space in which it runs.  (Maybe
> shorter, more Pythonic names are better, e.g. add() and getitem().)
>
> Thus an object space is:
>
>  * a way to implement objects;
>
>  * from the interpreter's point of view, it is the "library" needed to do any
> operation;
>
>  * from the application's point of view, it is essentially invisible.
>
> Methods:
>
>  * add(x,y), getitem(x,y), etc. for the operations, that take two "objects"
> (black boxes for the interpreter) and return an "object" (black box too).
>
>  * type(x), taking a black box object and returning a black box object as
> well.
>
>  * wrap(a), taking a *normal* object and putting it into a black box.  It's
> what the interpreter does for a LOAD_CONST, for example: it has a real Python
> object and wants to send it to the object space.  For example, if the object
> space has a custom implementation "class MyList" for lists, then
> wrap([1,2,3]) should create a MyList instance.
>
>  * unwrap(x) is the inverse operation.  Used by the interpreter in the rare
> cases where it really needs to observe the object.  For example, for a
> conditional jump, after it obtained the truth-value with (say) a call to the
> truth() method, it must pull the result out of the object space to know
> whether it is really False or True.
>
> For example, the straightforward object space that just uses real Python
> objects to implement themselves would be defined like that:
>
> class BorrowingObjectSpace(ObjectSpace):
>
>     def wrap(self, x):
>         return x
>
>     def unwrap(self, x):
>         return x
>
>     def add(self, x, y):
>         return x+y
>
>     def type(self, x):
>         return type(x)
>     ...
>
> Cool things can be imagined with other kinds of object spaces.  We may even
> think about a distributed interpreter, running on one machine with object
> spaces that actually reside on other machines!  With this perspective,
> wrap(x) sends x to the remote machine and returns a handle to it, and
> unwrap(x) downloads from the remote machine the object whose handle is x.
>
> We may even have several concurrently-running object spaces that could
> communicate, e.g. running a frame in a different object space and marshalling
> arguments and the return value!
>
>
> Looks like a cool point of view, doesn't it ?
>
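
The handle idea in the quoted text can be tried out without any networking.
What follows is only a minimal sketch: HandleObjectSpace is a hypothetical
name, and a plain dictionary stands in for the "remote machine".  The point is
that the interpreter side only ever holds opaque handles.

```python
import itertools


class HandleObjectSpace:
    """Hypothetical space whose "objects" are opaque integer handles.

    The dictionary below plays the role of the remote machine; no real
    networking or marshalling is done here.
    """

    def __init__(self):
        self._store = {}                 # handle -> real object ("remote" side)
        self._next = itertools.count(1)  # source of fresh handle numbers

    def wrap(self, obj):
        # "Send" obj to the other side and hand back a handle to it.
        handle = next(self._next)
        self._store[handle] = obj
        return handle

    def unwrap(self, handle):
        # "Download" the object the handle refers to.
        return self._store[handle]

    def add(self, x, y):
        # The addition happens on the "remote" side; only a handle comes back.
        return self.wrap(self._store[x] + self._store[y])

    def type(self, x):
        return self.wrap(type(self._store[x]))

    def truth(self, x):
        return self.wrap(bool(self._store[x]))


space = HandleObjectSpace()
w_sum = space.add(space.wrap(2), space.wrap(3))
result = space.unwrap(w_sum)   # the interpreter observes the value only here
```

Nothing in add(), type() or truth() leaks a real object to the caller, which
is what makes the black-box reading of "object" workable.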

I think it is even a necessary abstraction in some form (in a mail I suggested
something like this, but in the form of a set of functions, not an object),
because the bytecode / main loop is an artifact of _some_ possible Python
implementations, not all.  Implementing the abstraction as a policy pattern
now increases flexibility.
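
To make the policy-pattern reading concrete, here is a small sketch of a main
loop that never touches its objects directly and takes the object space as a
parameter.  The (opname, arg) instruction format, the simplified opcodes (this
JUMP_IF_FALSE pops its operand), run() and TracingObjectSpace are assumptions
made for illustration; only wrap/unwrap/add/truth come from the quoted
proposal.

```python
class BorrowingObjectSpace:
    """Real Python objects implement themselves (as in the quoted mail)."""

    def wrap(self, x):
        return x

    def unwrap(self, x):
        return x

    def add(self, x, y):
        return x + y

    def truth(self, x):
        return bool(x)


class TracingObjectSpace(BorrowingObjectSpace):
    """A second policy: same results, but every add() is recorded."""

    def __init__(self):
        self.log = []

    def add(self, x, y):
        self.log.append(("add", x, y))
        return x + y


def run(code, space):
    """Execute a list of (opname, arg) pairs using only space operations."""
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(space.wrap(arg))
        elif op == "BINARY_ADD":
            w_right = stack.pop()
            w_left = stack.pop()
            stack.append(space.add(w_left, w_right))
        elif op == "JUMP_IF_FALSE":
            # The one place the loop must observe a value: unwrap the truth.
            w_cond = stack.pop()
            if not space.unwrap(space.truth(w_cond)):
                pc = arg
        elif op == "RETURN_VALUE":
            return space.unwrap(stack.pop())
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    raise RuntimeError("fell off the end of the code")


adder = [("LOAD_CONST", 1), ("LOAD_CONST", 2),
         ("BINARY_ADD", None), ("RETURN_VALUE", None)]
branchy = [("LOAD_CONST", 0), ("JUMP_IF_FALSE", 4),
           ("LOAD_CONST", "then"), ("RETURN_VALUE", None),
           ("LOAD_CONST", "else"), ("RETURN_VALUE", None)]

tracing = TracingObjectSpace()
plain_result = run(adder, BorrowingObjectSpace())
traced_result = run(adder, tracing)
branch_result = run(branchy, BorrowingObjectSpace())
```

The same run() works unchanged with either space; swapping the policy is just
a different argument.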

OTOH it would be a pity if the result were to avoid tackling the problem of
describing the Python semantics relative to objects in Python in an __execution
substrate (at least partially) independent__ way.

IOW if the result is that we get different ObjectSpace definitions for each
possible target (C, OCaml, ...) without any implementation sharing among them,
we get maximal freedom but also maximal duplication.
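
One possible shape for such sharing, as a rough sketch only: write the
Python-level rule once in a base class, in terms of a few primitives, and let
each target supply just those primitives.  All names below
(SharedSemanticsSpace, lookup, call, is_notimplemented, HostPythonSpace) are
hypothetical, and the add() rule is a deliberately simplified version of
Python's binary-operator dispatch (it ignores, e.g., the priority given to
subclasses of the left operand's type).

```python
class SharedSemanticsSpace:
    """Semantics written once; targets provide three primitives."""

    # --- primitives each target must implement -----------------------------
    def lookup(self, w_obj, name):
        """Return the special method `name` for w_obj, or None."""
        raise NotImplementedError

    def call(self, w_func, *w_args):
        raise NotImplementedError

    def is_notimplemented(self, w_obj):
        raise NotImplementedError

    # --- shared, target-independent semantics ------------------------------
    def add(self, w_x, w_y):
        # Simplified rule: try x.__add__(y), then fall back to y.__radd__(x).
        attempts = ((w_x, w_y, "__add__"), (w_y, w_x, "__radd__"))
        for w_self, w_other, name in attempts:
            w_method = self.lookup(w_self, name)
            if w_method is None:
                continue
            w_result = self.call(w_method, w_self, w_other)
            if not self.is_notimplemented(w_result):
                return w_result
        raise TypeError("unsupported operand types for +")


class HostPythonSpace(SharedSemanticsSpace):
    """One 'target': primitives borrowed from the host Python."""

    def lookup(self, w_obj, name):
        # Special methods are looked up on the type, not the instance.
        return getattr(type(w_obj), name, None)

    def call(self, w_func, *w_args):
        return w_func(*w_args)

    def is_notimplemented(self, w_obj):
        return w_obj is NotImplemented


space = HostPythonSpace()
int_sum = space.add(1, 2)          # handled by int.__add__
mixed_sum = space.add(1, 2.5)      # int.__add__ declines, float.__radd__ wins
list_sum = space.add([1], [2])
```

Another target would subclass SharedSemanticsSpace and reimplement only the
three primitives, inheriting add() as is.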

regards.