[pypy-dev] Object model

Armin Rigo arigo at tunes.org
Fri Feb 14 18:05:37 CET 2003


Hello Rocco,

On Thu, Feb 13, 2003 at 09:50:39PM -0500, Rocco Moretti wrote:
> I was originally going to say I thought the general-dispatch-function and 
> the standard-object-method-interface options were practically equivalent, 
> but I changed my mind.
> 
> The reason lies along the concept of multi-methods.

That was the point.  A more minor point is that the methods add() etc. now
automatically have a hidden "self" argument, which can be used to determine
the context in which we are working (more specifically, the object space, but
then from the object space we might come back to the currently executing
frame).

> I wonder whether wrap() and unwrap() are really necessary - they seem to 
> blur the distinction between interpreter-level and application-object-
> implementation-level.

I think that the generality of these methods might be very useful in the
future.  If instead we tried to identify all the spots in the interpreter that
require a specific treatment that could be done in generality with wrap() or
unwrap(), then we would be more dependent on the current Python VM and
bytecodes.  I admit there are also drawbacks, but I would say that they are
mainly optimization drawbacks.  We can always later add a
"hint" optional parameter to wrap() and unwrap() to let the ObjectSpace know
for which particular reason the method is called.

The wrap()/unwrap() symmetry is also nice because it allows us to define a
"default" implementation for all ObjectSpace methods:

   def add(self, x, y):
       x0, y0 = self.unwrap(x), self.unwrap(y)
       z0 = x0 + y0
       return self.wrap(z0)

If you think about unwrap()/wrap() as respectively downloading/uploading the
object through a network, it's not the fastest implementation, but it works!
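
To make the default implementation concrete, here is a minimal runnable
sketch.  The names Box and TrivialObjectSpace are invented for illustration
and are not part of any existing code; the only assumption is that a wrapped
object is an opaque box around an interpreter-level value.

```python
class Box:
    """Opaque application-level object; the interpreter never looks inside."""

    def __init__(self, value):
        self._value = value


class TrivialObjectSpace:
    """Hypothetical object space where every operation goes through
    unwrap() / wrap(), as in the default add() above."""

    def wrap(self, value):
        # "Upload" an interpreter-level value into the object space.
        return Box(value)

    def unwrap(self, w_obj):
        # "Download" the value back to interpreter level.
        return w_obj._value

    def add(self, w_x, w_y):
        x0, y0 = self.unwrap(w_x), self.unwrap(w_y)
        z0 = x0 + y0
        return self.wrap(z0)


space = TrivialObjectSpace()
w_result = space.add(space.wrap(2), space.wrap(3))
result = space.unwrap(w_result)
```

Every other operation could be given the same three-line default, which is
slow but lets a new object space start out by implementing only wrap() and
unwrap().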


> It seems to me that the LOAD_CONST issue would be better dealt with by 
> having the compiler/function loader create the black box objects 
> themselves, instead of there being a run-time translation between 
> interpreter-level objects and application-level objects.

I don't think it's so clear.  LOAD_CONST is not the only place where we would
need wrap().  The bytecode interpreter may catch a RuntimeError or
MemoryError, for example, and then it would want to send it to the
application.  As for LOAD_CONST, it's not clear where (interpreter- or
object-space) the constants are best stored.  Think about implementing Pyrex,
the Python-to-C translator: it could be done with a PyrexObjectSpace whose
add() method would only emit the following line of C code:

    v3 = PyNumber_Add(v1, v2);

and the "black box" objects would only store the name of the C variable, not
an actual value.  If you have an explicit wrap() operation, then whenever a
LOAD_CONST is seen, PyrexObjectSpace emits C code like:

    v5 = PyInt_FromLong(123);

If on the other hand the LOAD_CONST is invisible to the object space, then
PyrexObjectSpace must pre-build all the constants and put them into global C
variables, whose names are transmitted to the pypy main loop.  It might be a
good optimization to do so, but then it might not always be the case (e.g. if
we are targeting machine code instead of C and don't have so many registers
available).  With an explicit wrap() you can choose to "cache" all or some
constants in global variables, or not.  Without the wrap() you are forced to
"cache" them all.
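
A rough sketch of the idea, under stated assumptions: SketchPyrexObjectSpace,
CVar and the cache_constants flag are hypothetical names, only integers are
handled, and "caching" is simplified to reusing the C variable that already
holds the constant rather than emitting a real global.

```python
class CVar:
    """Black-box object: holds only the name of a C variable, never a value."""

    def __init__(self, name):
        self.name = name


class SketchPyrexObjectSpace:
    """Hypothetical object space that emits C source instead of computing."""

    def __init__(self, cache_constants=False):
        self.lines = []        # emitted C statements, in order
        self.counter = 0       # used to generate fresh variable names
        self.cache_constants = cache_constants
        self.cache = {}        # constant value -> CVar already holding it

    def _new_var(self):
        self.counter += 1
        return CVar("v%d" % self.counter)

    def wrap(self, value):
        # Only plain integers are handled in this sketch.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("cannot wrap %r" % (value,))
        if self.cache_constants and value in self.cache:
            return self.cache[value]
        var = self._new_var()
        self.lines.append("%s = PyInt_FromLong(%d);" % (var.name, value))
        if self.cache_constants:
            self.cache[value] = var
        return var

    def add(self, w_x, w_y):
        var = self._new_var()
        self.lines.append(
            "%s = PyNumber_Add(%s, %s);" % (var.name, w_x.name, w_y.name))
        return var


def emit_example(cache_constants):
    """Simulate two LOAD_CONST 123 instructions followed by an addition."""
    space = SketchPyrexObjectSpace(cache_constants)
    w_a = space.wrap(123)
    w_b = space.wrap(123)
    space.add(w_a, w_b)
    return space.lines
```

Because wrap() is an explicit call, the choice between emitting the constant
each time and reusing it stays inside the object space; the main loop is the
same in both cases.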

> > * unwrap(x) is the inverse operation.
> 
> But how would we handle it if we wanted to redefine what was 
> considered "true"?  A generic unwrap couldn't tell what the unwrapped 
> object is being used for.

Yes, that's right.  It's actually the most difficult operation; for example,
in PyrexObjectSpace, you cannot actually unwrap any object because it will
later be in some variable of your emitted C code, but not now.  I still think
that we should try to provide a generic unwrap(), because it is essential for
"network" object spaces where it represents downloading, or for Psyco where it
represents suspending compilation, running the code, waiting for execution to
reach that point, and "pulling" the actual value out of the
registers.

ObjectSpaces may only partially implement wrap()/unwrap(), i.e. it would be
acceptable for them to fail to wrap or unwrap a value.  For example, it can be
expected that PyrexObjectSpace can only wrap "literal constants", like
integers, strings, and floats (by emitting a call to PyXxx_FromYyy()).  
Conversely, it can only unwrap objects returned by its own truth() method,
where it knows that it must be either False or True.  (It still doesn't know
whether it is False or True, but it can *duplicate* the caller frame and
return False once and True once, so that both paths will be tried.)
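
The "try both paths" behaviour can be approximated as follows.  This is only
a sketch: BranchingSpace, Unknown, UnwrapError and explore() are made-up
names, and instead of really duplicating a frame it re-runs the frame
function once per combination of truth outcomes, replaying earlier answers.

```python
class UnwrapError(Exception):
    """Raised when the space cannot unwrap an object."""


class Unknown:
    """Black box whose value is not available at this level."""


class TruthResult:
    """Result of truth(): known to be a bool, but not which one."""


class BranchingSpace:
    def __init__(self, decisions):
        self.decisions = list(decisions)  # answers to replay, in order
        self.position = 0

    def truth(self, w_obj):
        return TruthResult()

    def unwrap(self, w_obj):
        # Partial implementation: only results of truth() can be unwrapped.
        if not isinstance(w_obj, TruthResult):
            raise UnwrapError("value not available yet")
        if self.position == len(self.decisions):
            self.decisions.append(False)  # new branch point: try False first
        answer = self.decisions[self.position]
        self.position += 1
        return answer


def explore(frame):
    """Run frame(space) once for every reachable path of truth outcomes."""
    results = []
    pending = [[]]
    while pending:
        prefix = pending.pop()
        space = BranchingSpace(prefix)
        results.append((tuple(space.decisions), frame(space))
                       if False else None)
        results.pop()
        value = frame(space)
        results.append((tuple(space.decisions), value))
        # Each branch point first met in this run also needs its True side.
        for i in range(len(prefix), len(space.decisions)):
            pending.append(space.decisions[:i] + [True])
    return results


def frame(space):
    w_x = Unknown()
    if space.unwrap(space.truth(w_x)):
        return "then-branch"
    return "else-branch"


outcomes = sorted(explore(frame))
```

Here unwrap() fails for an arbitrary Unknown but succeeds on what truth()
returned, and explore() sees the frame take the else-branch once and the
then-branch once.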

The point of this lengthy explanation is to show that the difficulty does not
lie directly in "unwrap() vs. specialized-versions-of-it()", because both have
the same problem in some ObjectSpaces.  I expect the same problem to show up
in any specialized version of unwrap().  The rest is optimization hints.


A bientôt,

Armin.


