[Python-Dev] subclassing builtin data structures

Isaac Schwabacher ischwabacher at wisc.edu
Sat Feb 14 01:55:11 CET 2015


On 15-02-13, Neil Girdhar  wrote:
> Unlike a regular method, you would never need to call super since you should know everyone that could be calling you. Typically, when you call super, you have something like this:
> 
> A < B, C
> 
> 
> B < D
> 
> 
> so you end up with 
> 
> 
> mro: A, B, C, D
> 
> 
> And then when A calls super and B calls super it gets C which it doesn't know about.

But C calls super and gets D. The scenario I'm concerned with is that A knows how to mimic B's constructor and B knows how to mimic D's, but A doesn't know about D. So D asks A if it knows how to mimic D's constructor, and it says no. Via super, B gets a shot, and it does know, so it translates the arguments to D's constructor into arguments to B's constructor, and again asks A if it knows how to handle them. Then A says yes, translates the args, and constructs an A. If C ever gets consulted, it responds "I don't know a thing" and calls super.

> But in the case of make_me, it's someone like C who is calling make_me. If it gets a method in B, then that's a straight-up bug. make_me needs to be reimplemented in A as well, and A would never delegate up since other classes in the mro chain (like B) might not know about C.

This scheme (as I've written it) depends strongly on all the classes in the MRO having __make_me__ methods with this very precisely defined structure: test base against yourself, then any superclasses you care to mimic, then call super. Any antisocial superclass ruins everyone's party.

> Best,
> Neil
> 
> 
> On Fri, Feb 13, 2015 at 7:00 PM, Isaac Schwabacher <alexander.belopolsky at gmail.com <ischwabacher at wisc.edu')" target="1">ischwabacher at wisc.edu> wrote:
> 
> > On 15-02-13, Neil Girdhar wrote:
> > > I personally don't think this is a big enough issue to warrant any changes, but I think Serhiy's solution would be the ideal best with one additional parameter: the caller's type. Something like
> > >
> > > def __make_me__(self, cls, *args, **kwargs)
> > >
> > >
> > > and the idea is that any time you want to construct a type, instead of
> > >
> > >
> > > self.__class__(assumed arguments…)
> > >
> > >
> > > where you are not sure that the derived class' constructor knows the right argument types, you do
> > >
> > >
> > > def SomeCls:
> > > def some_method(self, ...):
> > > return self.__make_me__(SomeCls, assumed arguments…)
> > >
> > >
> > > Now the derived class knows who is asking for a copy. In the case of defaultdict, for example, he can implement __make_me__ as follows:
> > >
> > >
> > > def __make_me__(self, cls, *args, **kwargs):
> > > if cls is dict: return default_dict(self.default_factory, *args, **kwargs)
> > > return default_dict(*args, **kwargs)
> > >
> > >
> > > essentially the caller is identifying himself so that the receiver knows how to interpret the arguments.
> > >
> > >
> > > Best,
> > >
> > >
> > > Neil
> > 
> > Such a method necessarily involves explicit switching on classes... ew.
> > Also, to make this work, a class needs to have a relationship with its superclass's superclasses. So in order for DefaultDict's subclasses not to need to know about dict, it would need to look like this:
> > 
> > class DefaultDict(dict):
> > .... at classmethod # instance method doesn't make sense here
> > ....def __make_me__(cls, base, *args, **kwargs): # make something like base(*args, **kwargs)
> > ........# when we get here, nothing in cls.__mro__ above DefaultDict knows how to construct an equivalent to base(*args, **kwargs) using its own constructor
> > ........if base is DefaultDict:
> > ............return DefaultDict(*args, **kwargs) # if DefaultDict is the best we can do, do it
> > ........elif base is dict:
> > ............return cls.__make_me__(DefaultDict, None, *args, **kwargs) # subclasses that know about DefaultDict but not dict will intercept this
> > ........else:
> > ............super(DefaultDict, cls).__make_me__(base, *args, **kwargs) # we don't know how to make an equivalent to base.__new__(*args, **kwargs), so keep looking
> > 
> > I don't even think this is guaranteed to construct an object of class cls corresponding to a base(*args, **kwargs) even if it were possible, since multiple inheritance can screw things up. You might need to have an explicit list of "these are the superclasses whose constructors I can imitate", and have the interpreter find an optimal path for you.
> > 
> > > On Fri, Feb 13, 2015 at 5:55 PM, Alexander Belopolsky <http://stackoverflow.com/questions/5490824/should-constructors-comply-with-the-liskov-substitution-principle(javascript:main.compose('new', 't=alexander.belopolsky at gmail.com>(java_script:main.compose()> wrote:
> > >
> > > >
> > > > On Fri, Feb 13, 2015 at 4:44 PM, Neil Girdhar <mistersheik at gmail.com <mistersheik at gmail.com>(java_script:main.compose()> wrote:
> > > >
> > > > > Interesting: > > Not every language allows you to call self.__class__(). In the languages that don't you can get away with incompatible constructor signatures.
> > > >
> > > >
> > > > However, let me try to focus the discussion on a specific issue before we go deep into OOP theory.
> > > >
> > > >
> > > > With python's standard datetime.date we have:
> > > >
> > > >
> > > > >>> from datetime import *
> > > > >>> class Date(date):
> > > > ... pass
> > > > ...
> > > > >>> Date.today()
> > > > Date(2015, 2, 13)
> > > > >>> Date.fromordinal(1)
> > > > Date(1, 1, 1)
> > > >
> > > >
> > > > Both .today() and .fromordinal(1) will break in a subclass that redefines __new__ as follows:
> > > >
> > > >
> > > > >>> class Date2(date):
> > > > ... def __new__(cls, ymd):
> > > > ... return date.__new__(cls, *ymd)
> > > > ...
> > > > >>> Date2.today()
> > > > Traceback (most recent call last):
> > > > File "<stdin>", line 1, in <module>
> > > > TypeError: __new__() takes 2 positional arguments but 4 were given
> > > > >>> Date2.fromordinal(1)
> > > > Traceback (most recent call last):
> > > > File "<stdin>", line 1, in <module>
> > > > TypeError: __new__() takes 2 positional arguments but 4 were given
> > > >
> > > >
> > > >
> > > >
> > > > Why is this acceptable, but we have to sacrifice the convenience of having Date + timedelta
> > > > return Date to make it work with Date2:
> > > >
> > > >
> > > > >>> Date2((1,1,1)) + timedelta(1)
> > > > datetime.date(1, 1, 2)
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > 
> > 
> >


More information about the Python-Dev mailing list