[pypy-dev] Objects and types in the stdobjspace

Tue Jun 10 00:02:55 CEST 2003

Hi Armin,

[Armin Rigo Sat, Jun 07, 2003 at 03:00:50PM +0200]
> Hello Holger,
> 
> On Sat, Jun 07, 2003 at 08:51:51AM +0200, holger krekel wrote:
> > Btw, could someone from the Sweden sprint crew post some
> > notes about how the current pypy-source tree can be used? What is
> > expected to work (or if that list is shorter: what does not work :-).
> 
> I would like to put this information in the Wiki, but I begin to feel confused
> with the page names. There are quite a lot of pages with similar intent but
> different names or vice-versa, and the name "sprint" generally refers to the
> goals or results of one of the sprints, without any logic to know which one.  
> Just browse RecentChanges to see what I mean.
> 
> We should really name the sprints, e.g. HildesheimSprint and GothenburgSprint.
> Can we rename or delete pages ? (Maybe via the svn repository containing the
> wiki instance ?)

I think that's a very good idea (HildesheimSprint etc.). We badly need to
get the wiki into better shape. In July I plan to spend more time
on advancing the current wiki code and content. A main goal is
to make it easy to edit via a subversion checkout using your favorite
editor.

> pypy/testwice.py is a hack around the previous one to run all the tests in 
> both object spaces.
> 
> The individual .../test/test_xxx.py files are still runnable. There is a 
> testsupport.py file in all test directories for glueing purposes; running it 
> directly should execute all the tests in that directory.

Btw, I don't like the duplicated testsupport.py files. I think Rocco already
suggested putting one version in a fixed place, which I agree with.
pypy.test.support seems like a canonical place. Filename completion
on "te<tab>" should result in "test_" :-)

> > - a recap of the current StdObjsSpace registration/multimethod
> >   mechanisms.
> 
> The W_XxxObject classes of the standard object space are, precisely,
> *implementations* of objects. This is not the same as the *type* of the
> object. It is possible to provide several implementations for the same type
> (the user should not see the difference, though). Typical uses of this would
> be to hide the int/long distinction from the user altogether, or more
> interestingly to provide more efficient versions of the data structures like
> string, list or dict when they become large. 

If we restricted ourselves to just one possible implementation of a
type, then it wouldn't be necessary to have a W_XxxObject for each type,
and the implementation methods could live directly on the Type object?

> For example, a complex string implementation can allow constant-time
> concatenation. (This would allow algorithms that build a long string to use +=
> instead of the less readable trick of storing small strings in a list and
> calling ''.join(list) at the end.)  But the simple strings of CPython are much
> better for the typical short strings, hence we really need both.

I am wondering whether this design comes from your Psyco experience. IIRC
Psyco improves the common "string += something" cases by doing the
list/append dance internally. Which model does Psyco use for this,
compared to the PyPy approach?

> The register() method of a multimethod registers a function that accepts
> arguments with a particular *implementation*, and not a particular *type*.  
> This is clear in your type_repr() :
> 
> >     def type_repr(space, w_obj):
> >         return space.wrap("<type '%s'>" % w_obj.typename)
> 
> This is the repr() for W_TypeObject's, i.e. for types that are implemented as
> a W_TypeObject instance. The W_TypeObject class defines the 'typename'
> attribute, so you can read it there. If we had another different
> implementation for types the above type_repr() would not apply to it.

Any reason not to name it "__name__" instead of "typename"?

> There is no way to dispatch a multimethod based on the type of an argument,

But this could be emulated by registering all implementations of a type,
couldn't it? 

And if you register the signature 

    (W_StringObject, W_ANY)

and the signature 

    (W_ANY, W_StringObject)

for a "MultiMethod", which one gets selected?
IOW, what are the exact resolution rules?
Left-to-right, more-specific wins?
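
To show what I mean by "left-to-right, more-specific wins", here is a toy
dispatcher. The rule it implements is my guess, not the actual StdObjSpace
behaviour, and ToyMultiMethod and W_IntObject as written here are made up
for the illustration.

```python
class W_Object(object):
    pass

W_ANY = W_Object  # synonym: matches anything


class W_StringObject(W_Object):
    pass


class W_IntObject(W_Object):
    pass


class ToyMultiMethod:
    """Hypothetical multimethod with an assumed resolution rule."""

    def __init__(self):
        self.table = []  # list of (signature, function)

    def register(self, function, *signature):
        self.table.append((signature, function))

    def candidates(self, *args):
        matches = [
            (sig, fn) for sig, fn in self.table
            if len(sig) == len(args)
            and all(isinstance(a, cls) for a, cls in zip(args, sig))
        ]

        def specificity(entry):
            # Distance of each declared class from the argument's own
            # class; tuples compare left to right, so the leftmost
            # argument decides first.
            sig = entry[0]
            return tuple(type(a).__mro__.index(cls)
                         for a, cls in zip(args, sig))

        return [fn for sig, fn in sorted(matches, key=specificity)]

    def __call__(self, *args):
        found = self.candidates(*args)
        if not found:
            raise TypeError("no implementation registered")
        return found[0](*args)


add = ToyMultiMethod()
add.register(lambda a, b: "string on the left", W_StringObject, W_ANY)
add.register(lambda a, b: "string on the right", W_ANY, W_StringObject)

left = add(W_StringObject(), W_StringObject())   # both signatures match
right = add(W_IntObject(), W_StringObject())     # only the second matches
```

With this rule the ambiguous call picks (W_StringObject, W_ANY). Other rules
(raising an error on ambiguity, for instance) would be just as easy to write,
hence the question.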

> Let's come back to multimethods. All operators are multimethods, but there are
> also type-specific multimethods (like list.append()) defined in the W_XxxType
> class. This allows the type to provide several implementations of the method
> (typically, if there are several list implementations, each one must define
> its own implementation of append()).
> 
> As an exception to what I just said, the type of an argument of a multimethod
> is actually used for the dispatch in the case where the user refers to this
> multimethod as a (bound or unbound) Python method. For example, when we write
> 'int.__add__(x, y)', the first argument must be an int object (by Python's
> unbound methods rule) 

I'd have thought that the "unbound methods rule" is more a restriction
rooted in the dependence of a Python method on the "fixed" data structure of
the object. If so, wouldn't it be more Pythonic to avoid LBYL (look
before you leap) style restrictions?

> In the arguments to register(), the name W_ANY (a synonym for W_Object) means
> that anything at this position is fine. 

(trying hard to avoid getting picky about names but failing ...) why
not use plain W_Object? W_ANY sounds like CORBA to me :-)

Anyway, this all sounds good and makes sense to me even if I am asking
questions all over the place.

> This can also be used to write default
> implementations that will be called if the multimethod cannot find a more
> specific implementation (or if the more specific implementation raises a
> FailedToImplement exception). There are examples of that in the file
> default.py. Another example would be to provide default type-specific method
> implementations that would work for any implementation of the type, which
> would be used unless a given implementation provides a more efficient version.
> For example :
> 
> def any_list_extend(space, w_list, w_otherlist):
>     return space.inplace_add(w_list, w_otherlist)
> 
> W_ListType.list_extend.register(any_list_extend, W_ANY, W_ANY)
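
Here is how I read the fallback behaviour described above, as a small
sketch. The dispatch() helper, fast_list_extend() and W_TupleObject are
assumptions for the illustration, the 'space' argument is left out for
brevity, and the real lookup code certainly looks different.

```python
class FailedToImplement(Exception):
    """Raised by an implementation that gives up on its arguments."""


class W_Object(object):
    pass

W_ANY = W_Object


class W_ListObject(W_Object):
    def __init__(self, items):
        self.items = list(items)


class W_TupleObject(W_Object):
    def __init__(self, items):
        self.items = tuple(items)


def dispatch(implementations, *args):
    # 'implementations' is ordered from most to least specific.
    for signature, function in implementations:
        if all(isinstance(a, cls) for a, cls in zip(args, signature)):
            try:
                return function(*args)
            except FailedToImplement:
                continue  # fall through to a more general version
    raise TypeError("no implementation accepted the arguments")


def fast_list_extend(w_list, w_other):
    if not isinstance(w_other, W_ListObject):
        raise FailedToImplement
    w_list.items.extend(w_other.items)
    return "fast"


def any_list_extend(w_list, w_other):
    # Default version: works for anything that carries 'items'.
    for item in w_other.items:
        w_list.items.append(item)
    return "default"


list_extend = [
    ((W_ListObject, W_ANY), fast_list_extend),
    ((W_ANY, W_ANY), any_list_extend),
]

w_list = W_ListObject([1])
first = dispatch(list_extend, w_list, W_ListObject([2]))
second = dispatch(list_extend, w_list, W_TupleObject((3,)))
```

The second call reaches any_list_extend() only because the more specific
version raised FailedToImplement.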

The register() calls scattered all over the source make the code
harder to read IMHO. Is there any way to introduce a sufficiently strict
naming scheme and some magic for autoregistering the methods by
introspection/analysis of the names at the end of a file?
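
Something along these lines is what I have in mind. The naming scheme
"<method>__<Impl>_<Impl>", the autoregister() helper and both lookup tables
are pure assumptions; nothing like this exists in the tree as far as I know.

```python
class W_Object(object):
    pass


class W_StringObject(W_Object):
    pass


class ToyMultiMethod:
    def __init__(self):
        self.table = []

    def register(self, function, *signature):
        self.table.append((signature, function))


# Short names allowed inside function names -> implementation classes.
IMPLEMENTATIONS = {"String": W_StringObject, "ANY": W_Object}
MULTIMETHODS = {"add": ToyMultiMethod(), "repr": ToyMultiMethod()}


def add__String_ANY(space, w_left, w_right):
    return "add for strings"


def repr__String(space, w_str):
    return "repr for strings"


def helper(x):
    return x  # no "__" in the name, so it is ignored


def autoregister(namespace, multimethods, implementations):
    """Register every function whose name follows the naming scheme."""
    registered = []
    for name, obj in sorted(namespace.items()):
        if not callable(obj) or name.startswith("_") or "__" not in name:
            continue
        method_name, _, arg_names = name.partition("__")
        if method_name not in multimethods:
            continue
        signature = tuple(implementations[n] for n in arg_names.split("_"))
        multimethods[method_name].register(obj, *signature)
        registered.append(name)
    return registered


# At the end of a real module this would be autoregister(globals(), ...).
namespace = {f.__name__: f for f in (add__String_ANY, repr__String, helper)}
registered = autoregister(namespace, MULTIMETHODS, IMPLEMENTATIONS)
```

The price is that a misspelled function name is silently skipped, so the
scheme would need to be strict enough to catch that.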

> ...
> > I also noticed that string representations aka 
> > 
> >     "with'mixed'quotes" 
> > 
> > always result in 
> > 
> >     'with'mixed'quotes'
> > 
> > but i wasn't sure how to fix this. Any hints? 
> 
> That's in stringobject.py. str_repr() does no quoting at all, really. Here too
> Michael already replaced it with another equally bad implementation which
> however has the advantage of producing the correct result. The correct thing
> to do is to rewrite the quoting algorithm from stringobject.c.

which requires some more string methods. Lots of ways to go and improve
stuff :-)
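
For reference, the quoting rule for 8-bit strings in CPython's
stringobject.c, as I understand it, fits in a few lines of plain Python.
This is a sketch at application level, not the object-space version, and it
assumes every character has an ordinal below 256.

```python
def str_repr(s):
    # Prefer single quotes; switch to double quotes only when the string
    # contains a single quote and no double quote.
    quote = "'"
    if "'" in s and '"' not in s:
        quote = '"'
    out = [quote]
    for ch in s:
        if ch == quote or ch == "\\":
            out.append("\\" + ch)
        elif ch == "\t":
            out.append("\\t")
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch < " " or ord(ch) >= 0x7f:
            # Remaining control and non-ASCII bytes as hex escapes.
            out.append("\\x%02x" % ord(ch))
        else:
            out.append(ch)
    out.append(quote)
    return "".join(out)


mixed = str_repr("with'mixed'quotes")
```

It only needs iteration, comparison, ord() and "".join(), which gives an
idea of the string methods the object space would have to provide first.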

greetings,

    holger