[Python-3000] Fixing super anyone?

Fri Apr 20 22:06:56 CEST 2007

On 4/20/07, Guido van Rossum <guido at python.org> wrote:
>
> On 4/20/07, Thomas Wouters <thomas at python.org> wrote:
> > I've had this near the top of my (Python) TODO list for a while, but I
> > haven't been able to find the time. I've considered it while doing the
> > dishes and such, though.
>
> (Not in the shower? :-)

Yes, there too, when I shower alone; I was going to spare everyone the
visual though :)

> I can think of two ways of doing the underlying
> > magic "properly", and one way that's too ugly to think about:
> >
> >  1) Add a new type of descriptor-hook, for associating an object with
> the
> > class it is an attribute of. After class creation (in the metaclass
> > __init__), any object with this __associate__ (or whatever) hook gets it
> > called. It's only called for objects that are attributes of *that*
> class,
> > not of any superclasses (since they will already be associated with the
> > superclass.) I'm sure there are other uses for this hook, just not
> methods
> > that want to use super() -- like zope interfaces ;)
>
> Nice and elegant. Though what would the hook *do*? I.e. where does it
> store the class reference for super to find? Perhaps this is where my
> idea of a designated cell comes in handy (although you'd end up with a
> cell per method instead of all methods sharing one cell).
>
> Also, what arguments are passed to the hook? For this purpose the
> class object is the only thing needed -- but is that always
> sufficient?

The hook would 'bind' the function to the class -- in some unspecified way.
Unspecified because I haven't thought that far ahead yet, and I don't forsee
any particular difficulties. Either the function gets wrapped in an
'associated function' type that knows which class it was defined in (so
somewhat different from the current 'unbound method' type, but not much) or
we just store the associated-class in the existing function object (probably
the better option.). In either case we need some new magic to pass this
class-object to the actual code object, or give the code object a way to
refer to the function object (sometimes also useful for other reasons; it's
not an uncommon request, in newbie areas.)

As for the arguments to this hook -- I guess passing the attribute name as
well as the class object makes sense. It's not much more overhead, and there
isn't much more we could sensibly pass that isn't easily available given the
class object and attribute name.

Phillip mentioned class decorators -- I'd forgotten about those. I don't see
much use for them myself, as metaclasses make much more sense to me, in
particular because they are naturally inherited, whereas class decorators
are (I assume) not inherited. I'm not sure how they should -- or could --
work together with super. It would only be a problem if the class decorator
returns a wrapper class, rather than mutate the class. I guess we'd need a
way to tell associated-functions "instead of ever visiting this class, visit
this wrapper class instead. But don't mess with MRO"... I think. I'm not
entirely sure how class decorators would affect MRO, really.

>  2) Add a new argument to __get__ (or a new method that acts like __get__
> > but with the extra argument, and is called instead of __get__ if it is
> > defined) where the extra argument is the class that the retrieved
> descriptor
> > is actually an attribute of. It currently (correctly) gets the class the
> > attribute was retrieved through, which could be a subclass.
>
> I'm not so keen on this; calling an existing API with an extra
> argument is fraught with transitional problems (2to3 notwithstanding).
> And checking for two different special methods slows things down. I'd
> propose to pass the class where it's found as the 3rd arg always,
> since (almost) nobody uses this argument, but I'm not sure of the
> consequences for class methods.

The consequence for classmethods would be bad. They would always get the
class they were defined on as first argument, making them completely
pointless. So, let's not do that :)

(Of course, there's the issue what to do if we dynamically define a
> new function, and poke it into a class as a method, and that function
> needs to use super. IMO it's fine to require such a function to use
> the underlying super() machinery -- this must be an exceedignly rare
> case!)

It's easy to handle such cases if we flag super-syntax-using-functions
specially (which is also easy.) type's __setattr__ can complain if such a
function gets assigned to an attribute, or it could do the binding magic
right then. Functions that don't use the super syntax will continue to work
like usual. And super-syntax-using-functions can scream and shout when they
get 'associated' with more than one class, of course.

> In both these cases, the (un)bound method object would end up with
> knowledge
> > of the class it is defined in, and create a local variable with a
> > super-object for the 'super' keyword to use.
>
> Ah, so you propose to store it in a local variable. But where is it
> stored before the function is called? I guess on the "bound method"
> object. But how is the value passed on as part of the call? I like a
> cell a lot better because we have an established way to pass cells
> into calls.

As I mentioned above, I didn't really think about this part. By 'local
variable' I just mean that we don't use a global registry or some such ugly
hack -- it's a reference, stored wherever, specific to that call.

> There still are some grey areas
> > to solve -- what if classes share functions, what if functions share
> > code-objects, etc. #2 is probably a better way to go about it than #1,
> in
> > that regard.
>
> Sharing functions is very rare and IMO it's okay if the mechanism
> doesn't work in that case -- as long as it fails loudly. Sharing code
> objects shouldn't be a problem -- code objects are immutable anyway,
> cells are associated with the function object, not with the code
> object.

Agreed.

> There's also the question of what the super keyword itself should look
> like,
> > e.g.
> >
> >   # declaration-style, no dot
> >   res = super currentmethod(arg, arg)
> >   # treat super like self
> >   res = super.currentmethod (arg, arg)
> >   # treat super like self.currentmethod
> >   res = super(arg, arg)
> >
> > I think super.currentmethod(arg, arg) makes the most sense, but that may
> be
> > because it most closely resembles the current practice. However, it may
> call
> > the "wrong" supermethod when the class does, for instance, '__repr__ =
> > __str__'.
>
> Depends on how you define "wrong". :-)

That's why it's in "quotes" :) No matter what we do, it will always be wrong
for someone somewhere, I'm sure.

I am strongly in favor of super.currentmethod(...) if only because
> that's closest to how we've always done it, and there may even be a
> use case for callong some *other* super method.

I agree. The big downside of thinking of these things off-line (like in the
shower) is that it's hard to visualize the actual code. The moment I wrote
down the three alternatives I'd imagined, I realized 'super.currentmethod()'
is the only sane spelling ;)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a signature-virus-eating-signature-virus-resistant .signature virus!
copy me into your .signature file to help me spread!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070420/cb041827/attachment-0001.html