
Re: [Python-Dev] subclassing builtin data structures


Isaac Schwabacher

13 Feb 2015

2:37 p.m.

On 15-02-13, Guido van Rossum wrote:

...

Are you willing to wait 10 days for an answer? I'm out of round tuits for a while.

IIUC, the argument is that the Liskov Substitution Principle is a statement about how objects of a subtype behave relative to objects of a supertype, and it doesn't apply to constructors because they aren't behaviors of existing objects. So other overriding methods *should* be able to handle the same inputs that the respective overridden methods do, but constructors don't need to. Even though __init__ is written as an instance method, it seems like it's "morally" a part of the class method __new__ that's only split off for convenience.

If this message is unclear, it's because I don't really understand this myself and I'm trying to articulate my best understanding of what's been said on this thread and those it links to.

ijs

...

On Fri, Feb 13, 2015 at 10:22 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

...
On Fri, Feb 13, 2015 at 1:19 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

...
...
FWIW you're wrong when you claim that "a constructor is no different from any other method". Someone else should probably explain this (it's an old argument that's been thoroughly settled).

Well, the best answer I've got in the past [1] was "ask on python-dev since Guido called the operator overriding expectation." :-)

And let me repost this bit of history [1]:

Here is the annotated pre-r82065 code:

39876 gvanrossum     def __add__(self, other):
39876 gvanrossum         if isinstance(other, timedelta):
39928 gvanrossum             return self.__class__(self.__days + other.__days,
39876 gvanrossum                                   self.__seconds + other.__seconds,
39876 gvanrossum                                   self.__microseconds + other.__microseconds)
40207    tim_one         return NotImplemented
39876 gvanrossum

[1] http://bugs.python.org/issue2267#msg125979

-- 
--Guido van Rossum (http://python.org/~guido)
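
To make the annotated snippet above concrete, here is a minimal sketch of the same pattern outside the standard library. It is not the real datetime code: `Delta`, `Hours` and `TaggedDelta` are hypothetical names invented for illustration. The base class builds its result with `self.__class__(...)`, so a subclass that keeps the constructor signature gets its own type back from `+`, while a subclass that changes the signature breaks the inherited `__add__`.

```python
class Delta:
    """Toy stand-in for a timedelta-like value (hypothetical)."""

    def __init__(self, days, seconds):
        self.days = days
        self.seconds = seconds

    def __add__(self, other):
        if isinstance(other, Delta):
            # Same pattern as the annotated code: call the subclass
            # constructor with the base class's argument list.
            return self.__class__(self.days + other.days,
                                  self.seconds + other.seconds)
        return NotImplemented


class Hours(Delta):
    """Keeps the constructor signature, so the inherited __add__ works."""


class TaggedDelta(Delta):
    """Changes the constructor signature: a label is now required first."""

    def __init__(self, label, days, seconds):
        super().__init__(days, seconds)
        self.label = label


same_sig = Hours(1, 0) + Hours(2, 0)

add_error = None
try:
    TaggedDelta("a", 1, 0) + TaggedDelta("b", 2, 0)
except TypeError as exc:
    # The inherited __add__ passed two arguments where three are required.
    add_error = exc
```

This is the trade-off the thread keeps returning to: instance methods of `Hours` stay substitutable for those of `Delta`, but the constructor is only compatible by convention.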


Neil Girdhar

13 Feb

3:44 p.m.

New subject: subclassing builtin data structures

Interesting: http://stackoverflow.com/questions/5490824/should-constructors-comply-with-t...

Alexander Belopolsky

4:55 p.m.

New subject: subclassing builtin data structures

On Fri, Feb 13, 2015 at 4:44 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

...

Interesting: http://stackoverflow.com/questions/5490824/should-constructors-comply-with-t...

Let me humbly conjecture that the people who wrote the top answers have background in less capable languages than Python.

Not every language allows you to call self.__class__(). In the languages that don't you can get away with incompatible constructor signatures.

However, let me try to focus the discussion on a specific issue before we go deep into OOP theory.

With python's standard datetime.date we have:

>>> from datetime import *
>>> class Date(date):
...     pass
...
>>> Date.today()
Date(2015, 2, 13)
>>> Date.fromordinal(1)
Date(1, 1, 1)

Both .today() and .fromordinal(1) will break in a subclass that redefines __new__ as follows:

>>> class Date2(date):
...     def __new__(cls, ymd):
...         return date.__new__(cls, *ymd)
...
>>> Date2.today()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() takes 2 positional arguments but 4 were given
>>> Date2.fromordinal(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() takes 2 positional arguments but 4 were given

Why is this acceptable, but we have to sacrifice the convenience of having Date + timedelta return Date to make it work with Date2:

>>> Date2((1,1,1)) + timedelta(1)
datetime.date(1, 1, 2)
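
The sessions above can be replayed as a small script. This is only a sketch of the class-method half of the argument: it checks that the alternative constructors build instances of `cls`, and that they fail once `__new__` takes a single tuple. The result type of `Date2 + timedelta` is left out on purpose, because it may differ between Python versions.

```python
from datetime import date


class Date(date):
    """Subclass that leaves the constructor alone."""


class Date2(date):
    """Subclass that takes a single (year, month, day) tuple."""

    def __new__(cls, ymd):
        return date.__new__(cls, *ymd)


# Alternative constructors are class methods, so they build instances of cls.
today_type = type(Date.today())
ordinal_type = type(Date.fromordinal(1))

# With the changed signature, the same class methods can no longer
# call cls(year, month, day).
errors = []
for make in (Date2.today, lambda: Date2.fromordinal(1)):
    try:
        make()
    except TypeError as exc:
        errors.append(type(exc).__name__)
```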

Neil Girdhar

5:03 p.m.

New subject: subclassing builtin data structures

I personally don't think this is a big enough issue to warrant any changes, but I think Serhiy's solution would be ideal with one additional parameter: the caller's type. Something like

def __make_me__(self, cls, *args, **kwargs)

and the idea is that any time you want to construct a type, instead of

self.__class__(assumed arguments…)

where you are not sure that the derived class' constructor knows the right argument types, you do

class SomeCls:
    def some_method(self, ...):
        return self.__make_me__(SomeCls, assumed arguments…)

Now the derived class knows who is asking for a copy. In the case of defaultdict, for example, he can implement __make_me__ as follows:

def __make_me__(self, cls, *args, **kwargs):
    if cls is dict:
        return defaultdict(self.default_factory, *args, **kwargs)
    return defaultdict(*args, **kwargs)

Essentially, the caller is identifying himself so that the receiver knows how to interpret the arguments.

Best,
Neil
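
A runnable sketch of this caller-identifying hook might look as follows. `__make_me__` is only a proposal from this thread, not an existing Python protocol, and `Base`, `Tagged`, `keep` and `tag` are hypothetical names chosen for the example. The base class announces itself as the caller, and the subclass uses that to decide how to interpret the arguments.

```python
class Base:
    def __init__(self, items=()):
        self.items = list(items)

    def keep(self, predicate):
        # Base identifies itself as the caller, so the arguments that
        # follow are known to use Base's constructor signature.
        kept = [x for x in self.items if predicate(x)]
        return self.__make_me__(Base, kept)

    def __make_me__(self, caller, *args, **kwargs):
        return type(self)(*args, **kwargs)


class Tagged(Base):
    """Subclass whose constructor needs an extra leading argument."""

    def __init__(self, tag, items=()):
        super().__init__(items)
        self.tag = tag

    def __make_me__(self, caller, *args, **kwargs):
        if caller is Base:
            # Arguments follow Base's signature: supply the missing tag.
            return type(self)(self.tag, *args, **kwargs)
        return type(self)(*args, **kwargs)


tagged = Tagged("x", [1, 2, 3])
kept = tagged.keep(lambda v: v > 1)
plain_kept = Base([1, 2]).keep(lambda v: v > 1)
```

The cost of the design is that every subclass with a different signature must know the signatures of all the callers it wants to support.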

Serhiy Storchaka

14 Feb

12:22 a.m.

New subject: subclassing builtin data structures

On 14.02.15 01:03, Neil Girdhar wrote:

...

Now the derived class knows who is asking for a copy. In the case of defaultdict, for example, he can implement __make_me__ as follows:

def __make_me__(self, cls, *args, **kwargs):
    if cls is dict:
        return defaultdict(self.default_factory, *args, **kwargs)
    return defaultdict(*args, **kwargs)

essentially the caller is identifying himself so that the receiver knows how to interpret the arguments.

No, my idea was that __make_me__ has the same signature in all subclasses. It takes exactly one argument and creates an instance of the concrete class, so it never fails. If you want to create an instance of a different class in the derived class, you should explicitly override __make_me__.
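
One way to read this variant, as a sketch under assumptions: the hook takes only the new value, the base implementation always builds the base class, and a subclass that wants its own type back overrides the hook itself. `Number`, `Plain`, `Money` and `currency` are hypothetical names, and this is one interpretation of the proposal rather than a confirmed design.

```python
class Number:
    def __init__(self, value):
        self.value = value

    def __make_me__(self, value):
        # Fixed one-argument signature; the base always builds the
        # concrete base class, so this call cannot fail in a subclass.
        return Number(value)

    def __add__(self, other):
        if isinstance(other, Number):
            return self.__make_me__(self.value + other.value)
        return NotImplemented


class Plain(Number):
    """No override: results fall back to Number."""


class Money(Number):
    """Different constructor signature, with an explicit override."""

    def __init__(self, value, currency):
        super().__init__(value)
        self.currency = currency

    def __make_me__(self, value):
        return Money(value, self.currency)


plain_sum = Plain(1) + Plain(2)
money_sum = Money(5, "EUR") + Money(7, "EUR")
```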

Steven D'Aprano

6:23 a.m.

New subject: subclassing builtin data structures

On Fri, Feb 13, 2015 at 06:03:35PM -0500, Neil Girdhar wrote:

...

I personally don't think this is a big enough issue to warrant any changes, but I think Serhiy's solution would be ideal with one additional parameter: the caller's type. Something like

def __make_me__(self, cls, *args, **kwargs)

and the idea is that any time you want to construct a type, instead of

self.__class__(assumed arguments…)

where you are not sure that the derived class' constructor knows the right argument types, you do

class SomeCls:
    def some_method(self, ...):
        return self.__make_me__(SomeCls, assumed arguments…)

Now the derived class knows who is asking for a copy.

What if you wish to return an instance from a classmethod? You don't have a `self` available.

class SomeCls:
    def __init__(self, x, y, z):
        ...
    @classmethod
    def from_spam(cls, spam):
        x, y, z = process(spam)
        return cls.__make_me__(self, cls, x, y, z)  # oops, no self

Even if you are calling from an instance method, and self is available, you cannot assume that the information needed for the subclass constructor is still available. Perhaps that information is used in the constructor and then discarded.

The problem we wish to solve is that when subclassing, methods of some base class blindly return instances of itself, instead of self's type:

py> class MyInt(int):
...     pass
...
py> n = MyInt(23)
py> assert isinstance(n, MyInt)
py> assert isinstance(n+1, MyInt)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AssertionError

This means that subclasses often have to override all the parent's methods, just to ensure the type is correct:

class MyInt(int):
    def __add__(self, other):
        o = super().__add__(other)
        if o is not NotImplemented:
            o = type(self)(o)
        return o

Something like that, repeated for all the int methods, should work:

py> n = MyInt(23)
py> type(n+1)
<class '__main__.MyInt'>

This is tedious and error prone, but at least once it is done, subclasses of MyInt will Just Work:

py> class MyOtherInt(MyInt):
...     pass
...
py> a = MyOtherInt(42)
py> type(a + 1000)
<class '__main__.MyOtherInt'>

(At least, *in general* they will work. See below.)

So, why not have int's methods use type(self) instead of hard coding int? The answer is that *some* subclasses might override the constructor, which would cause the __add__ method to fail:

# this will fail if the constructor has a different signature
o = type(self)(o)

Okay, but changing the constructor signature is quite unusual. Mostly, people subclass to add new methods or attributes, or to override a specific method. The dict/defaultdict situation is relatively uncommon.

Instead of requiring *every* subclass to override all the methods, couldn't we require the base classes (like int) to assume that the signature is unchanged and call type(self), and leave it up to the subclass to override all the methods *only* if the signature has changed? (Which they probably would have to do anyway.)

As the MyInt example above shows, or datetime in the standard library, this actually works fine in practice:

py> from datetime import datetime
py> class MySpecialDateTime(datetime):
...     pass
...
py> t = MySpecialDateTime.today()
py> type(t)
<class '__main__.MySpecialDateTime'>

Why can't int, str, list, tuple etc. be more like datetime?

-- 
Steve
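
The "repeated for all the int methods" step can be mechanised. The following is a sketch only: the helper `_preserving` is a hypothetical name, and the list of wrapped methods is a small illustrative subset rather than the full int API. Like the hand-written `__add__` above, it assumes the subclass constructor still accepts a single int.

```python
class MyInt(int):
    """int subclass whose arithmetic results keep the subclass type."""


def _preserving(name):
    base_method = getattr(int, name)

    def method(self, *args):
        result = base_method(self, *args)
        # Rewrap only plain ints; NotImplemented and other result types
        # are passed through unchanged.
        if type(result) is int:
            result = type(self)(result)
        return result

    method.__name__ = name
    return method


for _name in ("__add__", "__radd__", "__sub__", "__rsub__",
              "__mul__", "__rmul__", "__neg__", "__abs__"):
    setattr(MyInt, _name, _preserving(_name))


class MyOtherInt(MyInt):
    pass


n = MyInt(23)
a = MyOtherInt(42)
```

Because the wrappers call `type(self)`, further subclasses such as `MyOtherInt` inherit the type-preserving behaviour without repeating the loop.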

Steve Dower

8:53 a.m.

New subject: subclassing builtin data structures

"Instead of requiring *every* subclass to override all the methods, couldn't we require the base classes (like int) to assume that the signature is unchanged and call type(self), and leave it up to the subclass to override all the methods *only* if the signature has changed?"

I assumed everyone was just saying this point over and over, so I haven't been following the thread closely.

This is precisely how inheritance works: subclasses are constrained by the base class. If you want to play, you *must* play by its rules. (Else, use composition.)

It's fine for base classes to assume a compatible constructor, and if builtins can do it without hurting performance for the non-subclassed case, I don't see why not.

Cheers,
Steve

Alexander Belopolsky

12:26 p.m.

New subject: subclassing builtin data structures

On Sat, Feb 14, 2015 at 7:23 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...

Why can't int, str, list, tuple etc. be more like datetime?

They are. In all these types, class methods call subclass constructors but instance methods don't.

>>> class Int(int):
...     pass
...
>>> Int.from_bytes(bytes([1,2,3]), 'big')
66051
>>> type(_)
<class '__main__.Int'>

>>> Int(1) + 1
2
>>> type(_)
<class 'int'>

In the case of int, there is a good reason for this behavior - bool. In python, we want True + True == 2. In numpy, where binary operations preserve subclasses, you have

>>> import numpy
>>> numpy.bool_(1) + numpy.bool_(1)
True

I don't see a similar argument for the date class, however. Given date.{to|from}ordinal(), date subclasses are pretty much bound to have timedelta addition satisfy (d + td).toordinal() == d.toordinal() + td.days. Any other definition would be fighting the baseclass design and would be better implemented via containment.
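
The int half of this comparison needs nothing beyond the standard library, so it can be checked as a short script. This sketch only restates the sessions above: the class method builds an instance of the subclass, the operator returns a plain int, and bool shows why that default is useful.

```python
class Int(int):
    pass


from_cls = Int.from_bytes(bytes([1, 2, 3]), "big")  # class method: builds cls
from_op = Int(1) + 1                                # operator: plain int

print(type(from_cls).__name__, from_cls)  # Int 66051
print(type(from_op).__name__, from_op)    # int 2

# bool is an int subclass. If + preserved the subclass, True + True
# would have to be a bool rather than the integer 2.
bool_sum = True + True
print(bool_sum, type(bool_sum).__name__)  # 2 int
```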

Georg Brandl

1:36 p.m.

New subject: subclassing builtin data structures

On 02/14/2015 07:26 PM, Alexander Belopolsky wrote:

...

In the case of int, there is a good reason for this behavior - bool. In python, we want True + True == 2. In numpy, where binary operations preserve subclasses, you have

>>> import numpy
>>> numpy.bool_(1) + numpy.bool_(1)
True

I don't think numpy.bool_ subclasses some class like numpy.int_.

Georg

Alexander Belopolsky

2:47 p.m.

New subject: subclassing builtin data structures

On Sat, Feb 14, 2015 at 2:36 PM, Georg Brandl <g.brandl@gmx.net> wrote:

...

...
In the case of int, there is a good reason for this behavior - bool. In python, we want True + True == 2. In numpy, where binary operations preserve subclasses, you have

>>> import numpy
>>> numpy.bool_(1) + numpy.bool_(1)
True

I don't think numpy.bool_ subclasses some class like numpy.int_.

And numpy.bool_ subclasses don't preserve type in addition:

>>> import numpy
>>> class Bool(numpy.bool_):
...     pass
...
>>> numpy.bool_.mro()
[<class 'numpy.bool_'>, <class 'numpy.generic'>, <class 'object'>]
>>> Bool(1) + Bool(1)
True
>>> type(_)
<class 'numpy.bool_'>

So there goes my theory. :-)

I think all these examples just highlight the need for clear guidance on when self.__class__() can be called in base classes to construct instances of derived classes.

Apparently numpy has it both ways. One way for scalars (see above) and the other for arrays:

>>> class Array(numpy.ndarray):
...     pass
...
>>> a = Array(1)
>>> a[0] = 1
>>> a+a
Array([ 2.])

Steven D'Aprano

8:04 p.m.

New subject: subclassing builtin data structures

On Sat, Feb 14, 2015 at 01:26:36PM -0500, Alexander Belopolsky wrote:

...

On Sat, Feb 14, 2015 at 7:23 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...
Why can't int, str, list, tuple etc. be more like datetime?

They are. In all these types, class methods call subclass constructors but instance methods don't.

But in datetime, instance methods *do*. Sorry that my example with .today() was misleading.

py> from datetime import datetime
py> class MyDatetime(datetime):
...     pass
...
py> MyDatetime.today()
MyDatetime(2015, 2, 15, 12, 45, 38, 429269)
py> MyDatetime.today().replace(day=20)
MyDatetime(2015, 2, 20, 12, 45, 53, 405889)

...

In the case of int, there is a good reason for this behavior - bool. In python, we want True + True == 2.

Sure. But bool is only one subclass. I expect that it should be bool's responsibility to override __add__ etc. to return an instance of the parent class (int), rather than have nearly all subclasses override __add__ etc. to return instances of themselves.

-- 
Steve
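
The `replace()` observation can be checked with a fixed timestamp instead of `today()`. This sketch assumes a current CPython with the C implementation of datetime; it deliberately does not show `MyDatetime + timedelta`, whose result type may depend on the Python version.

```python
from datetime import datetime


class MyDatetime(datetime):
    pass


t = MyDatetime(2015, 2, 15, 12, 45, 38)

# replace() is an instance method, yet it returns the subclass type.
moved = t.replace(day=20)
print(type(moved).__name__, moved.day)
```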

Neil Girdhar

2:15 p.m.

New subject: subclassing builtin data structures

I think the __make_me__ pattern discussed earlier is still the most generic cooperative solution. Here it is with a classmethod version too:

class C(D, E):
    def some_method(self):
        return self.__make_me__(C)

    def __make_me__(self, arg_cls, *args, **kwargs):
        if arg_cls is C:
            pass
        elif issubclass(D, arg_cls):
            args, kwargs = modified_args_for_D(args, kwargs)
        elif issubclass(E, arg_cls):
            args, kwargs = modified_args_for_D(args, kwargs)
        else:
            raise ValueError

        if self.__class__ == C:
            return C(*args, **kwargs)
        return self.__make_me__(C, *args, **kwargs)

    @classmethod
    def __make_me_cls__(cls, arg_cls, *args, **kwargs):
        if arg_cls is C:
            pass
        elif issubclass(D, arg_cls):
            args, kwargs = modified_args_for_D(args, kwargs)
        elif issubclass(E, arg_cls):
            args, kwargs = modified_args_for_D(args, kwargs)
        else:
            raise ValueError

        if cls == C:
            return C(*args, **kwargs)
        return cls.__make_me_cls__(C, *args, **kwargs)

Neil Girdhar

2:42 p.m.

New subject: subclassing builtin data structures

Oops, I meant to call super if necessary:

@classmethod
def __make_me_cls__(cls, arg_cls, *args, **kwargs):
    if arg_cls is C:
        pass
    elif arg_cls is D:
        args, kwargs = modified_args_for_D(args, kwargs)
    elif arg_cls is E:
        args, kwargs = modified_args_for_D(args, kwargs)
    else:
        return super().__make_me_cls__(arg_cls, *args, **kwargs)

    if cls is C:
        return C(*args, **kwargs)
    return cls.__make_me_cls__(C, *args, **kwargs)

Nick Coghlan

3:44 a.m.

New subject: subclassing builtin data structures

On 14 Feb 2015 08:57, "Alexander Belopolsky" <alexander.belopolsky@gmail.com> wrote:

On Fri, Feb 13, 2015 at 4:44 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

Interesting: http://stackoverflow.com/questions/5490824/should-constructors-comply-with-t...

Let me humbly conjecture that the people who wrote the top answers have background in less capable languages than Python.

Not every language allows you to call self.__class__(). In the languages that don't you can get away with incompatible constructor signatures.

However, let me try to focus the discussion on a specific issue before we go deep into OOP theory.

With python's standard datetime.date we have:

>>> from datetime import *
>>> class Date(date):
...     pass
...
>>> Date.today()
Date(2015, 2, 13)
>>> Date.fromordinal(1)
Date(1, 1, 1)

Both .today() and .fromordinal(1) will break in a subclass that redefines __new__ as follows:

>>> class Date2(date):
...     def __new__(cls, ymd):
...         return date.__new__(cls, *ymd)
...
>>> Date2.today()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() takes 2 positional arguments but 4 were given
>>> Date2.fromordinal(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() takes 2 positional arguments but 4 were given

Why is this acceptable, but we have to sacrifice the convenience of having Date + timedelta return Date to make it work with Date2:

>>> Date2((1,1,1)) + timedelta(1)
datetime.date(1, 1, 2)

Coupling alternative constructors to the default constructor signature is pretty normal - it just means that if you override the signature of the default constructor, you may need to update the alternative ones accordingly.

Cheers,
Nick.
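
As a sketch of what "update the alternative ones accordingly" could look like for the `Date2` class from the quoted message: each overridden class method delegates to the base class and then repacks the result into the new one-tuple signature. Only `fromordinal` and `today` are shown; other alternative constructors would need the same treatment.

```python
from datetime import date


class Date2(date):
    """date subclass constructed from one (year, month, day) tuple."""

    def __new__(cls, ymd):
        return date.__new__(cls, *ymd)

    # The default signature changed, so the inherited alternative
    # constructors are overridden to match it.
    @classmethod
    def fromordinal(cls, n):
        d = date.fromordinal(n)
        return cls((d.year, d.month, d.day))

    @classmethod
    def today(cls):
        d = date.today()
        return cls((d.year, d.month, d.day))


first = Date2.fromordinal(1)
now = Date2.today()
```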

Greg Ewing

13 Feb

6:19 p.m.

New subject: subclassing builtin data structures

Isaac Schwabacher wrote:

...

IIUC, the argument is that the Liskov Substitution Principle is a statement about how objects of a subtype behave relative to objects of a supertype, and it doesn't apply to constructors because they aren't behaviors of existing objects.

Another way to say that is that constructors are class methods, not instance methods.

-- 
Greg

Nick Coghlan

14 Feb

3:37 a.m.

New subject: subclassing builtin data structures

On 14 Feb 2015 07:39, "Isaac Schwabacher" <ischwabacher@wisc.edu> wrote:

On 15-02-13, Guido van Rossum wrote:

Are you willing to wait 10 days for an answer? I'm out of round tuits for a while.

IIUC, the argument is that the Liskov Substitution Principle is a statement about how objects of a subtype behave relative to objects of a supertype, and it doesn't apply to constructors because they aren't behaviors of existing objects. So other overriding methods *should* be able to handle the same inputs that the respective overridden methods do, but constructors don't need to. Even though __init__ is written as an instance method, it seems like it's "morally" a part of the class method __new__ that's only split off for convenience.

A potentially helpful example is to consider a type system where Square is a subclass of Rectangle. To specify a Rectangle takes a height and a width, but a Square only needs the length of one side, and letting the height and width be specified independently would be outright wrong.

Many possible operations on a Square also *should* formally return a Rectangle, as doing something like doubling the height gives you a result that isn't a square any more. (That's also why shapes must be immutable for this particular type hierarchy to make any sense)

(You can construct similar examples for Circle & Ellipse, and Python's numeric hierarchy is a real-world example of some of the complexity that can arise)

There's simply no sensible default behaviour other than the status quo in the face of scenarios like that - whether or not there's a better answer than "return an instance of the parent class" for any given inherited method depends entirely on the invariants of the subclass and how they differ from those of the parent class.

It's certainly possible to write methods that return a new instance of the current type (that's common in alternative constructors, for example), but it usually involves placing additional expectations on the developers of subclasses. That's most commonly seen within the confines of a single project, or when defining a development framework, rather than in libraries that choose to expose some public types and also support their use as base types.

Cheers,
Nick.
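
The Square and Rectangle example can be written down as a short sketch. The method name `scale_height` and the use of a frozen dataclass are assumptions made for illustration; the thread does not prescribe either. The inherited method returns the parent type on purpose, because the result may no longer satisfy the subclass invariant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float

    def scale_height(self, factor):
        # Deliberately returns the parent type: a square with a
        # doubled height is no longer a square.
        return Rectangle(self.width, self.height * factor)

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        # Different constructor signature: one side instead of an
        # independent width and height.
        super().__init__(side, side)


sq = Square(2)
doubled = sq.scale_height(2)
```

Here `sq` is still usable anywhere a `Rectangle` is expected, even though `Square(2)` and `Rectangle(2, 2)` are constructed differently.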


participants (9)

Alexander Belopolsky
Georg Brandl
Greg Ewing
Isaac Schwabacher
Neil Girdhar
Nick Coghlan
Serhiy Storchaka
Steve Dower
Steven D'Aprano
