Mailman 3 subclassing builtin data structures - Python-Dev

subclassing builtin data structures

Ethan Furman

12 Feb 2015

6:36 p.m.

I suspect the last big hurdle to making built-in data structures nicely subclassable is the insistence of such types to return new instances as the base class instead of the derived class.

In case that wasn't clear ;)

    --> class MyInt(int):
    ...     def __repr__(self):
    ...         return 'MyInt(%d)' % self
    ...
    --> m = MyInt(42)
    --> m
    MyInt(42)
    --> m + 1
    43
    --> type(m+1)
    <class 'int'>

Besides the work it would take to rectify this, I imagine the biggest hurdle would be the performance hit in always looking up the type of self. Has anyone done any preliminary benchmarking? Are there other concerns?

--
~Ethan~


Guido van Rossum

12 Feb

6:55 p.m.

On Thu, Feb 12, 2015 at 4:36 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...

I suspect the last big hurdle to making built-in data structures nicely subclassable is the insistence of such types to return new instances as the base class instead of the derived class.

In case that wasn't clear ;)

    --> class MyInt(int):
    ...     def __repr__(self):
    ...         return 'MyInt(%d)' % self
    ...
    --> m = MyInt(42)
    --> m
    MyInt(42)
    --> m + 1
    43
    --> type(m+1)
    <class 'int'>

Besides the work it would take to rectify this, I imagine the biggest hurdle would be the performance hit in always looking up the type of self. Has anyone done any preliminary benchmarking? Are there other concerns?

Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

--
--Guido van Rossum (python.org/~guido)

Ethan Furman

7 p.m.

On 02/12/2015 04:55 PM, Guido van Rossum wrote:

...

On Thu, Feb 12, 2015 at 4:36 PM, Ethan Furman <ethan@stoneleaf.us <mailto:ethan@stoneleaf.us>> wrote:

I suspect the last big hurdle to making built-in data structures nicely subclassable is the insistence of such types to return new instances as the base class instead of the derived class.

In case that wasn't clear ;)

    --> class MyInt(int):
    ...     def __repr__(self):
    ...         return 'MyInt(%d)' % self
    ...
    --> m = MyInt(42)
    --> m
    MyInt(42)
    --> m + 1
    43
    --> type(m+1)
    <class 'int'>

Besides the work it would take to rectify this, I imagine the biggest hurdle would be the performance hit in always looking up the type of self. Has anyone done any preliminary benchmarking? Are there other concerns?

Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

Thank you. -- ~Ethan~

MRAB

7:46 p.m.

On 2015-02-13 00:55, Guido van Rossum wrote:

...

On Thu, Feb 12, 2015 at 4:36 PM, Ethan Furman <ethan@stoneleaf.us <mailto:ethan@stoneleaf.us>> wrote:

I suspect the last big hurdle to making built-in data structures nicely subclassable is the insistence of such types to return new instances as the base class instead of the derived class.

In case that wasn't clear ;)

    --> class MyInt(int):
    ...     def __repr__(self):
    ...         return 'MyInt(%d)' % self
    ...
    --> m = MyInt(42)
    --> m
    MyInt(42)
    --> m + 1
    43
    --> type(m+1)
    <class 'int'>

Besides the work it would take to rectify this, I imagine the biggest hurdle would be the performance hit in always looking up the type of self. Has anyone done any preliminary benchmarking? Are there other concerns?

Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

Really?

    >>> class BaseInt:
    ...     def __init__(self, value):
    ...         self._value = value
    ...     def __add__(self, other):
    ...         return type(self)(self._value + other)
    ...     def __repr__(self):
    ...         return '%s(%s)' % (type(self), self._value)
    ...
    >>> class MyInt(BaseInt):
    ...     pass
    ...
    >>> m = BaseInt(42)
    >>> m
    <class '__main__.BaseInt'>(42)
    >>> m + 1
    <class '__main__.BaseInt'>(43)
    >>> type(m + 1)
    <class '__main__.BaseInt'>
    >>> m = MyInt(42)
    >>> m
    <class '__main__.MyInt'>(42)
    >>> m + 1
    <class '__main__.MyInt'>(43)
    >>> type(m + 1)
    <class '__main__.MyInt'>

Ethan Furman

8:14 p.m.

On 02/12/2015 05:46 PM, MRAB wrote:

...

On 2015-02-13 00:55, Guido van Rossum wrote:

...
On Thu, Feb 12, 2015 at 4:36 PM, Ethan Furman <ethan@stoneleaf.us <mailto:ethan@stoneleaf.us>> wrote:

I suspect the last big hurdle to making built-in data structures nicely subclassable is the insistence of such types to return new instances as the base class instead of the derived class.

In case that wasn't clear ;)

    --> class MyInt(int):
    ...     def __repr__(self):
    ...         return 'MyInt(%d)' % self
    ...
    --> m = MyInt(42)
    --> m
    MyInt(42)
    --> m + 1
    43
    --> type(m+1)
    <class 'int'>

Besides the work it would take to rectify this, I imagine the biggest hurdle would be the performance hit in always looking up the type of self. Has anyone done any preliminary benchmarking? Are there other concerns?

Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

Really?

What I was asking about, and Guido responded to, was not having to specifically override __add__, __mul__, __sub__, and all the others; if we do override them then there is no problem. -- ~Ethan~
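For readers following along, the "override them all" workaround Ethan mentions might look roughly like the sketch below. The preserve_type decorator is a hypothetical helper invented for this illustration (it is not part of Python or of any proposal in this thread), and it assumes the subclass constructor still accepts a single int value.

```python
def preserve_type(*method_names):
    """Hypothetical class decorator: re-wrap int results in the subclass."""
    def decorate(cls):
        for name in method_names:
            base_method = getattr(int, name)

            # Bind the base method as a default argument so each wrapper
            # keeps its own method rather than the last one in the loop.
            def wrapper(self, other, _base=base_method):
                result = _base(self, other)
                # Leave NotImplemented alone so Python can still try the
                # reflected operation on the other operand.
                if result is NotImplemented:
                    return result
                return type(self)(result)

            wrapper.__name__ = name
            setattr(cls, name, wrapper)
        return cls
    return decorate


@preserve_type('__add__', '__sub__', '__mul__', '__radd__')
class MyInt(int):
    def __repr__(self):
        return 'MyInt(%d)' % self


m = MyInt(42)
print(repr(m + 1), type(m + 1).__name__)
print(repr(1 + m))
```

Only the operators named in the decorator call keep the subclass type; any operator left off the list still returns a plain int, which is exactly the maintenance burden being discussed.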

Steven D'Aprano

8:57 p.m.

On Thu, Feb 12, 2015 at 06:14:22PM -0800, Ethan Furman wrote:

...

On 02/12/2015 05:46 PM, MRAB wrote:

...
On 2015-02-13 00:55, Guido van Rossum wrote:

...

...
...
Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

Really?

What I was asking about, and Guido responded to, was not having to specifically override __add__, __mul__, __sub__, and all the others; if we do override them then there is no problem.

I think you have misunderstood MRAB's comment. My interpretation is that MRAB is suggesting that methods in the base classes should use type(self) rather than hard-coding their own type.

E.g. if int were written in pure Python, it might look something like this:

    class int(object):
        def __new__(cls, arg):
            ...

        def __add__(self, other):
            return int(self, other)

(figuratively, rather than literally). But if it looked like this:

        def __add__(self, other):
            return type(self)(self, other)

then sub-classing would "just work" without the sub-class having to override each and every method.

--
Steve

Ethan Furman

9 p.m.

On 02/12/2015 06:57 PM, Steven D'Aprano wrote:

...

On Thu, Feb 12, 2015 at 06:14:22PM -0800, Ethan Furman wrote:

...
On 02/12/2015 05:46 PM, MRAB wrote:

...
On 2015-02-13 00:55, Guido van Rossum wrote:

...
...
...
Actually, the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

So this is pretty much a no-go. It's not unique to Python -- it's a basic issue with OO.

Really?

What I was asking about, and Guido responded to, was not having to specifically override __add__, __mul__, __sub__, and all the others; if we do override them then there is no problem.

I think you have misunderstood MRAB's comment. My interpretation is that MRAB is suggesting that methods in the base classes should use type(self) rather than hard-coding their own type.

That makes more sense, thanks. -- ~Ethan~

Chris Angelico

8:40 p.m.

On Fri, Feb 13, 2015 at 12:46 PM, MRAB <python@mrabarnett.plus.com> wrote:

...

    >>> class BaseInt:
    ...     def __init__(self, value):
    ...         self._value = value
    ...     def __add__(self, other):
    ...         return type(self)(self._value + other)

On Fri, Feb 13, 2015 at 11:55 AM, Guido van Rossum <guido@python.org> wrote:

...

... there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

You're requiring that any subclass of BaseInt be instantiable with one argument, namely its value. That's requiring that the signature of the subclass constructor match the base class constructor. ChrisA
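To make that objection concrete, here is a small sketch built on MRAB's BaseInt. The UnitInt subclass and its unit argument are hypothetical, added only to show what happens once a subclass constructor stops matching the base signature.

```python
class BaseInt:
    def __init__(self, value):
        self._value = value

    def __add__(self, other):
        # Assumes every subclass can be built from a single value.
        return type(self)(self._value + other)


class UnitInt(BaseInt):
    """Hypothetical subclass whose constructor needs an extra argument."""

    def __init__(self, value, unit):
        super().__init__(value)
        self.unit = unit


u = UnitInt(42, 'kg')
error = None
try:
    u + 1  # type(self)(43) is called without the required 'unit'
except TypeError as exc:
    error = exc
    print('TypeError:', exc)
```

The inherited __add__ works for BaseInt itself and for subclasses that keep the one-argument constructor, but raises TypeError for UnitInt until UnitInt overrides __add__.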

Mark Roberts

8:59 p.m.

...

On Feb 12, 2015, at 18:40, Chris Angelico <rosuav@gmail.com> wrote:

On Fri, Feb 13, 2015 at 12:46 PM, MRAB <python@mrabarnett.plus.com> wrote:

...
    >>> class BaseInt:
    ...     def __init__(self, value):
    ...         self._value = value
    ...     def __add__(self, other):
    ...         return type(self)(self._value + other)

...
On Fri, Feb 13, 2015 at 11:55 AM, Guido van Rossum <guido@python.org> wrote: ... there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

You're requiring that any subclass of BaseInt be instantiable with one argument, namely its value. That's requiring that the signature of the subclass constructor match the base class constructor.

ChrisA

No, it seems like he's asking that the type return a new object of the same type instead of one of the superclass. In effect, making the Date class call type(self)(*args) instead of datetime.date(*args). He seems completely willing to accept the consequences of changing the constructor (namely that he will have to override all the methods that call the constructor). It seems like good object oriented design to me. -Mark

Alexander Belopolsky

8:39 p.m.

On Thu, Feb 12, 2015 at 7:55 PM, Guido van Rossum <guido@python.org> wrote:

...

the problem is that the base class (e.g. int) doesn't know how to construct an instance of the subclass -- there is no reason (in general) why the signature of a subclass constructor should match the base class constructor, and it often doesn't.

I hear this explanation every time we have a discussion about subclassing of datetime types and I don't really buy this. Consider this simple subclass:

    >>> from datetime import date
    >>> class Date(date):
    ...     pass
    ...

What do you think Date.today() should return? Since I did not override today() in my Date class, it has to be a datetime.date instance, right? However:

    >>> Date.today().__class__
    <class '__main__.Date'>

Wait, Date "doesn't know how to construct an instance of the subclass .." Indeed, if I change the constructor signature, Date.today() won't work:

    >>> class Date(date):
    ...     def __init__(self, extra):
    ...         pass
    ...
    >>> Date.today()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: __init__() takes exactly 2 arguments (4 given)

In my view, a constructor is no different from any other method. If the designers of the subclass decided to change the signature in an incompatible way, they should either override all methods that create new objects or live with tracebacks. On the other hand, if all I want in my Date class is a better __format__ method, I am forced to override all operators or have my objects silently degrade in situations like this:

    >>> d = Date.today()
    >>> d.__class__
    <class '__main__.Date'>
    >>> d += timedelta(1)
    >>> d.__class__
    <type 'datetime.date'>

Having binary operations return subclass instances is not without precedent. For example, in numpy,

    >>> from numpy import ndarray
    >>> class Array(ndarray):
    ...     pass
    ...
    >>> a = Array(1)
    >>> a[0] = 42
    >>> a
    Array([ 42.])
    >>> a + a
    Array([ 84.])

I believe numpy had this behavior since types became subclassable in Python, so this design is definitely not a "no-go."
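As an illustration of the per-operator override Alexander describes, the sketch below shows what a date subclass has to do by hand to stay a Date after adding a timedelta. The __format__ default and the exact set of overridden operators are assumptions made for the example, and the arithmetic behaviour of plain date may differ between Python versions.

```python
from datetime import date, timedelta


class Date(date):
    """Hypothetical subclass that mainly wants a different __format__."""

    def __format__(self, spec):
        # Fall back to a custom default when no format spec is given.
        return super().__format__(spec or '%d %b %Y')

    def __add__(self, other):
        result = super().__add__(other)
        if result is NotImplemented:
            return result
        # Rebuild from year, month, day so the subclass type is kept.
        return type(self)(result.year, result.month, result.day)

    # timedelta + Date falls back to the reflected method.
    __radd__ = __add__


d = Date(2015, 2, 13)
d += timedelta(1)
print(type(d).__name__, format(d))
```

Subtraction and the other date-producing methods would each need the same treatment, which is the "override all operators" cost raised above.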

Ethan Furman

9:41 p.m.

On 02/12/2015 06:39 PM, Alexander Belopolsky wrote:

...

In my view, a constructor is no different from any other method. If the designers of the subclass decided to change the signature in an incompatible way, they should either override all methods that create new objects or live with tracebacks.

...

On the other hand, if all I want in my Date class is a better __format__ method, I am forced to override all operators or have my objects silently degrade [...]

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

   pros:
     - easy
     - speedy code
     - no possible tracebacks on new object instantiation

   cons:
     - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

   pros:
     - subclasses automatically maintain type
     - much less code in the simple cases [1]

   cons:
     - if constructor signatures change, must override all methods which create new objects

Unless there are powerful reasons against number 2 (such as performance, or the effort to effect the change), it sure seems like the nicer way to go.

So back to my original question: what other concerns are there, and has anybody done any benchmarks?

--
~Ethan~
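No benchmark numbers appear in this thread. A rough way to measure the cost of choice 2 in pure Python might look like the sketch below; the Fixed and Dynamic classes are hypothetical stand-ins, and timings for pure-Python classes say little about the C implementations of int or date.

```python
import timeit


class Fixed:
    """Always builds the class named in the method body (choice 1)."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Fixed(self.value + other)


class Dynamic:
    """Builds whatever type self happens to be (choice 2)."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return type(self)(self.value + other)


fixed, dynamic = Fixed(42), Dynamic(42)
fixed_time = timeit.timeit(lambda: fixed + 1, number=100_000)
dynamic_time = timeit.timeit(lambda: dynamic + 1, number=100_000)
print('hard-coded class: %.4fs' % fixed_time)
print('type(self):       %.4fs' % dynamic_time)
```

Results will vary by machine and interpreter, so the comparison is only meaningful when both timings come from the same run.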

Guido van Rossum

10:01 p.m.

On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...

On 02/12/2015 06:39 PM, Alexander Belopolsky wrote:

...
In my view, a constructor is no different from any other method. If the designers of the subclass decided to change the signature in an incompatible way, they should either override all methods that create new objects or live with tracebacks.

...
On the other hand, if all I want in my Date class is a better __format__ method, I am forced to override all operators or have my objects silently degrade [...]

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros: - easy - speedy code - no possible tracebacks on new object instantiation

cons: - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

Unless there are powerful reasons against number 2 (such as performance, or the effort to effect the change), it sure seems like the nicer way to go.

So back to my original question: what other concerns are there, and has anybody done any benchmarks?

Con for #2 is a showstopper. Forget about it. -- --Guido van Rossum (python.org/~guido)

Ethan Furman

10:58 p.m.

On 02/12/2015 08:01 PM, Guido van Rossum wrote:

...

On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman wrote:

...
2) always use the type of self when creating new instances

cons: - if constructor signatures change, must override all methods which create new objects

Con for #2 is a showstopper. Forget about it.

Happy to, but can you explain why requiring the programmer to override the necessary methods, or get tracebacks, is a showstopper? Is there a previous email thread I can read that discusses it? -- ~Ethan~

Guido van Rossum

13 Feb

11:35 a.m.

On Thu, Feb 12, 2015 at 8:58 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...

On 02/12/2015 08:01 PM, Guido van Rossum wrote:

...
On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman wrote:

...
2) always use the type of self when creating new instances

cons: - if constructor signatures change, must override all methods which create new objects

Con for #2 is a showstopper. Forget about it.

Happy to, but can you explain why requiring the programmer to override the necessary methods, or get tracebacks, is a showstopper? Is there a previous email thread I can read that discusses it?

IIUC you're proposing that the base class should *try* to construct an instance of the subclass by calling the type with an argument, and fail if it doesn't work. But that makes the whole thing brittle in the light of changes to the subclass constructor. Also, what should the argument be? The only answer I can think of is an instance of the base class. Finally, this would require more special-casing in every built-in class (at least every built-in class that sometimes returns instances of itself). -- --Guido van Rossum (python.org/~guido)

Alexander Belopolsky

12:02 p.m.

On Fri, Feb 13, 2015 at 12:35 PM, Guido van Rossum <guido@python.org> wrote:

...

IIUC you're proposing that the base class should *try* to construct an instance of the subclass by calling the type with an argument, and fail if it doesn't work. But that makes the whole thing brittle in the light of changes to the subclass constructor. Also, what should the argument be? The only answer I can think of is an instance of the base class.

No. The arguments should be whatever arguments are appropriate for the baseclass's __init__ or __new__. In the case of datetime.date that would be year, month, day. Note that the original pure python prototype of the datetime module had date.__add__ and friends call self.__class__(year, month, day). Unfortunately, it looks like the original sandbox did not survive the hg conversion, so I cannot provide a link to the relevant history.

Guido van Rossum

12:11 p.m.

On Fri, Feb 13, 2015 at 10:02 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:

...

On Fri, Feb 13, 2015 at 12:35 PM, Guido van Rossum <guido@python.org> wrote:

...
IIUC you're proposing that the base class should *try* to construct an instance of the subclass by calling the type with an argument, and fail if it doesn't work. But that makes the whole thing brittle in the light of changes to the subclass constructor. Also, what should the argument be? The only answer I can think of is an instance of the base class.

No. The arguments should be whatever arguments are appropriate for the baseclass's __init__ or __new__. In the case of datetime.date that would be year, month, day.

Agreed. (I was thinking of the case that Ethan brought up, which used int as an example.)

...

Note that the original pure python prototype of the datetime module had date.__add__ and friends call self.__class__(year, month, day). Unfortunately, it looks like the original sandbox did not survive the hg conversion, so I cannot provide a link to the relevant history.

FWIW you're wrong when you claim that "a constructor is no different from any other method". Someone else should probably explain this (it's an old argument that's been thoroughly settled). -- --Guido van Rossum (python.org/~guido)

Alexander Belopolsky

12:19 p.m.

On Fri, Feb 13, 2015 at 1:11 PM, Guido van Rossum <guido@python.org> wrote:

...

...
Note that the original pure python prototype of the datetime module had date.__add__ and friends call self.__class__(year, month, day). Unfortunately, it looks like the original sandbox did not survive the hg conversion, so I cannot provide a link to the relevant history.

FWIW you're wrong when you claim that "a constructor is no different from any other method". Someone else should probably explain this (it's an old argument that's been thoroughly settled).

Well, the best answer I've got in the past [1] was "ask on python-dev since Guido called the operator overriding expectation." :-) [1] http://bugs.python.org/issue2267#msg108060

Alexander Belopolsky

12:22 p.m.

On Fri, Feb 13, 2015 at 1:19 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

FWIW you're wrong when you claim that "a constructor is no different from any other method". Someone else should probably explain this (it's an old argument that's been thoroughly settled).

Well, the best answer I've got in the past [1] was "ask on python-dev since Guido called the operator overriding expectation." :-)

And let me repost this bit of history [1]:

Here is the annotated pre-r82065 code:

    39876 gvanrossum     def __add__(self, other):
    39876 gvanrossum         if isinstance(other, timedelta):
    39928 gvanrossum             return self.__class__(self.__days + other.__days,
    39876 gvanrossum                                   self.__seconds + other.__seconds,
    39876 gvanrossum                                   self.__microseconds + other.__microseconds)
    40207 tim_one            return NotImplemented
    39876 gvanrossum

[1] http://bugs.python.org/issue2267#msg125979

Guido van Rossum

12:25 p.m.

Are you willing to wait 10 days for an answer? I'm out of round tuits for a while.

On Fri, Feb 13, 2015 at 10:22 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

...

On Fri, Feb 13, 2015 at 1:19 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

FWIW you're wrong when you claim that "a constructor is no different from any other method". Someone else should probably explain this (it's an old argument that's been thoroughly settled).

Well, the best answer I've got in the past [1] was "ask on python-dev since Guido called the operator overriding expectation." :-)

And let me repost this bit of history [1]:

Here is the annotated pre-r82065 code:

    39876 gvanrossum     def __add__(self, other):
    39876 gvanrossum         if isinstance(other, timedelta):
    39928 gvanrossum             return self.__class__(self.__days + other.__days,
    39876 gvanrossum                                   self.__seconds + other.__seconds,
    39876 gvanrossum                                   self.__microseconds + other.__microseconds)
    40207 tim_one            return NotImplemented
    39876 gvanrossum

[1] http://bugs.python.org/issue2267#msg125979

--
--Guido van Rossum (python.org/~guido)

Jonas Wielicki

2:41 a.m.

If I may humbly chime in this, with a hint...

On 13.02.2015 05:01, Guido van Rossum wrote:

...

On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...
[snip] 2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

Unless there are powerful reasons against number 2 (such as performance, or the effort to effect the change), it sure seems like the nicer way to go.

So back to my original question: what other concerns are there, and has anybody done any benchmarks?

Con for #2 is a showstopper. Forget about it.

I would like to mention that there is another language out there which knows about virtual constructors (virtual like in virtual methods, with signature match requirements and such), which is FreePascal (and Delphi, and I think original Object Pascal too).

It is actually a feature I liked about these languages, compared to C++03 and others, that constructors could be virtual and that classes were first-class citizens.

Of course, Python cannot check the signature at compile time. But I think as long as it is documented, there should be no reason not to allow and support it. It really is analogous to other methods which need to have a matching signature.

just my two cents,
jwi

Neil Girdhar

4:08 a.m.

With Python's cooperative inheritance, I think you want to do everything through one constructor sending keyword arguments up the chain. The keyword arguments are popped off as needed. With this setup I don't think you need "overloaded constructors".

Best,
Neil

On Fri, Feb 13, 2015 at 4:44 AM, Jonas Wielicki <j.wielicki@sotecware.net> wrote:

...

If I may humbly chime in this, with a hint...

On 13.02.2015 05:01, Guido van Rossum wrote:

...
On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...
[snip] 2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

Unless there are powerful reasons against number 2 (such as performance, or the effort to effect the change), it sure seems like the nicer way to go.

So back to my original question: what other concerns are there, and has anybody done any benchmarks?

Con for #2 is a showstopper. Forget about it.

I would like to mention that there is another language out there which knows about virtual constructors (virtual like in virtual methods, with signature match requirements and such), which is FreePascal (and Delphi, and I think original Object Pascal too).

It is actually a feature I liked about these languages, compared to C++03 and others, that constructors could be virtual and that classes were first-class citizens.

Of course, Python cannot check the signature at compile time. But I think as long as it is documented, there should be no reason not to allow and support it. It really is analogous to other methods which need to have a matching signature.

just my two cents, jwi

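Neil's suggestion at the top of this message, a single constructor that passes keyword arguments up the chain, might be sketched as follows. The class names and the _ctor_kwargs helper are hypothetical, chosen only to show how type(self) could rebuild an instance when every constructor in the hierarchy is keyword-only and cooperative.

```python
class Base:
    def __init__(self, *, value=0, **kwargs):
        # Pass whatever is left up the chain; object.__init__ rejects
        # any keyword that no class in the hierarchy consumed.
        super().__init__(**kwargs)
        self.value = value

    def _ctor_kwargs(self):
        # Hypothetical helper: the keywords needed to rebuild this object.
        return {'value': self.value}

    def __add__(self, other):
        kwargs = self._ctor_kwargs()
        kwargs['value'] = self.value + other
        return type(self)(**kwargs)


class Labelled(Base):
    def __init__(self, *, label='', **kwargs):
        # Take 'label' off the keyword arguments, forward the rest.
        super().__init__(**kwargs)
        self.label = label

    def _ctor_kwargs(self):
        kwargs = super()._ctor_kwargs()
        kwargs['label'] = self.label
        return kwargs


x = Labelled(value=1, label='a') + 2
print(type(x).__name__, x.value, x.label)
```

The trade-off is that every subclass must extend _ctor_kwargs whenever it adds a constructor keyword, so the matching-signature requirement moves into a convention rather than disappearing.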

Ionel Cristian Mărieș

9:58 a.m.

Can we at least make it use the constructor (if there's a custom one)? Seems like a reasonable compromise to me (let whoever implements a custom __new__ deal with argument variance). Eg, make it use a __new__ like this:

    >>> class FancyInt(int):
    ...     def __new__(self, value):
    ...         return int.__new__(FancyInt, value)
    ...
    ...     def __repr__(self):
    ...         return "FancyInt(%s)" % super().__repr__()
    ...
    >>> x = FancyInt(1)
    >>> x
    FancyInt(1)
    >>> x += 1
    >>> x  # it should be FancyInt(2)
    2

Thanks,
-- Ionel Cristian Mărieș, blog.ionelmc.ro

On Fri, Feb 13, 2015 at 6:01 AM, Guido van Rossum <guido@python.org> wrote:

...

On Thu, Feb 12, 2015 at 7:41 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...
On 02/12/2015 06:39 PM, Alexander Belopolsky wrote:

...
In my view, a constructor is no different from any other method. If the designers of the subclass decided to change the signature in an incompatible way, they should either override all methods that create new objects or live with tracebacks.

...
On the other hand, if all I want in my Date class is a better __format__ method, I am forced to override all operators or have my objects silently degrade [...]

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros: - easy - speedy code - no possible tracebacks on new object instantiation

cons: - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

Unless there are powerful reasons against number 2 (such as performance, or the effort to effect the change), it sure seems like the nicer way to go.

So back to my original question: what other concerns are there, and has anybody done any benchmarks?

Con for #2 is a showstopper. Forget about it.

-- --Guido van Rossum (python.org/~guido)


Alexander Belopolsky

10:55 a.m.

On Thu, Feb 12, 2015 at 11:01 PM, Guido van Rossum <guido@python.org> wrote:

...

...
2) always use the type of self when creating new instances .. cons: - if constructor signatures change, must override all methods which create new objects

Con for #2 is a showstopper. Forget about it.

Sorry if I am missing something obvious, but I still don't understand why the same logic does not apply to class methods that create new instances:

    >>> from datetime import *
    >>> date.today()
    datetime.date(2015, 2, 13)
    >>> datetime.today()
    datetime.datetime(2015, 2, 13, 11, 37, 23, 678680)
    >>> class Date(date):
    ...     pass
    ...
    >>> Date.today()
    Date(2015, 2, 13)

(I actually find datetime.today() returning a datetime rather than a date a questionable design decision, but probably the datetime type should not have been a subclass of the date to begin with.) Are there any date subclasses in the wild that don't accept year, month, day in the constructor? If you create such a class, wouldn't you want to override __add__ and friends anyways? We already know that you will have to override today().

Serhiy Storchaka

4:31 p.m.

On 13.02.15 05:41, Ethan Furman wrote:

...

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros: - easy - speedy code - no possible tracebacks on new object instantiation

cons: - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

And switching to (2) would break existing code which uses subclasses with constructors with a different signature (e.g. defaultdict).

The third choice is to use a different, specially designed constructor.

    >>> class A(int):
    ...     def __add__(self, other):
    ...         return self.__make_me__(int(self) + int(other))
    ...     def __repr__(self):
    ...         return 'A(%d)' % self
    ...
    >>> A.__make_me__ = A
    >>> A(2) + 3
    A(5)
    >>> class B(A):
    ...     def __repr__(self):
    ...         return 'B(%d)' % self
    ...
    >>> B.__make_me__ = B
    >>> B(2) + 3
    B(5)

We can add a special attribute, used for creating the results of operations, to all basic classes. By default it would be equal to the base class constructor.

Ethan Furman

7:12 p.m.

On 02/13/2015 02:31 PM, Serhiy Storchaka wrote:

...

On 13.02.15 05:41, Ethan Furman wrote:

...
So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros: - easy - speedy code - no possible tracebacks on new object instantiation

cons: - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

And switching to (2) would break existing code which uses subclasses with constructors with different signature (e.g. defaultdict).

I don't think defaultdict is a good example -- I don't see any methods on it that return a new dict, default or otherwise. So if this change happened, defaultdict would have to have its own __add__ and not rely on dict's __add__.

...

The third choice is to use different specially designed constructor.

    --> class A(int):
    ...     def __add__(self, other):
    ...         return self.__make_me__(int(self) + int(other))
    ...     def __repr__(self):
    ...         return 'A(%d)' % self

How would this help in the case of defaultdict? __make_me__ is a class method, but it needs instance info to properly create a new dict with the same default factory. -- ~Ethan~

Neil Girdhar

7:16 p.m.

I think it works as Isaac explained if __make_me__ is an instance method that also accepts the calling class type.

On Fri, Feb 13, 2015 at 8:12 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...

On 02/13/2015 02:31 PM, Serhiy Storchaka wrote:

...
On 13.02.15 05:41, Ethan Furman wrote:

...
So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros: - easy - speedy code - no possible tracebacks on new object instantiation

cons: - a subclass that needs/wants to maintain itself must override all methods that create new instances, even if the only change is to the type of object returned

2) always use the type of self when creating new instances

pros: - subclasses automatically maintain type - much less code in the simple cases [1]

cons: - if constructor signatures change, must override all methods which create new objects

And switching to (2) would break existing code which uses subclasses with constructors with different signature (e.g. defaultdict).

I don't think defaultdict is a good example -- I don't see any methods on it that return a new dict, default or otherwise. So if this change happened, defaultdict would have to have its own __add__ and not rely on dict's __add__.

...
The third choice is to use different specially designed constructor.

    --> class A(int):
    ...     def __add__(self, other):
    ...         return self.__make_me__(int(self) + int(other))
    ...     def __repr__(self):
    ...         return 'A(%d)' % self

How would this help in the case of defaultdict? __make_me__ is a class method, but it needs instance info to properly create a new dict with the same default factory.

-- ~Ethan~


Serhiy Storchaka

14 Feb

12:01 a.m.

On 14.02.15 03:12, Ethan Furman wrote:

...

...
The third choice is to use different specially designed constructor.

    --> class A(int):
    ...     def __add__(self, other):
    ...         return self.__make_me__(int(self) + int(other))
    ...     def __repr__(self):
    ...         return 'A(%d)' % self

How would this help in the case of defaultdict? __make_me__ is a class method, but it needs instance info to properly create a new dict with the same default factory.

In the case of defaultdict (when dict would have to have __add__ and the like), either __make_me__ == dict (then defaultdict's methods will return dicts) or it will be an instance method.

    def __make_me__(self, other):
        return defaultdict(self.default_factory, other)
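A runnable sketch of this instance-method variant is below. Since dict has no __add__, a hypothetical merged() method stands in for "an operation that returns a new mapping", and MergeDict and FactoryDict are invented names used only for illustration.

```python
class MergeDict(dict):
    """Hypothetical base class with one result-producing operation."""

    def __make_me__(self, items):
        # Default hook: build the base class, as in choice 1.
        return MergeDict(items)

    def merged(self, other):
        combined = dict(self)
        combined.update(other)
        # All new results go through the hook, not a hard-coded class.
        return self.__make_me__(combined)


class FactoryDict(MergeDict):
    """Subclass with a different constructor signature, like defaultdict."""

    def __init__(self, default_factory, items=()):
        super().__init__(items)
        self.default_factory = default_factory

    def __missing__(self, key):
        value = self[key] = self.default_factory()
        return value

    def __make_me__(self, items):
        # An instance method can carry per-instance state across.
        return FactoryDict(self.default_factory, items)


d = FactoryDict(list, {'a': [1]})
m = d.merged({'b': [2]})
print(type(m).__name__, m, m['c'])
```

Only __make_me__ has to be overridden when the constructor signature changes; merged() and any other result-producing methods are inherited unchanged.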

participants (12)

Alexander Belopolsky
Chris Angelico
Ethan Furman
Guido van Rossum
Ionel Cristian Mărieș
Jonas Wielicki
Mark Roberts
MRAB
Neil Girdhar
Serhiy Storchaka
Steven D'Aprano
