Return type of datetime subclasses added to timedelta

Happy New Year everyone!

I would like to start a thread here for wider feedback on my proposal to change the return type of the addition operation between a datetime subclass and a timedelta. Currently, adding a timedelta to a subclass of datetime /always/ returns a datetime rather than an instance of the datetime subclass. I have an open PR implementing this, PR #10902 <https://github.com/python/cpython/pull/10902>, but I know it's a major change so I did not want to move forward without more discussion.

I first brought this up on datetime-SIG <https://mail.python.org/archives/list/datetime-sig@python.org/thread/TGB3VZS...> [1], and we decided to move the discussion over here because the people most likely to object to the change would be on this list and not on datetime-SIG. In addition to the datetime-SIG thread, you may find a detailed rationale for the change in bpo-35364 <https://bugs.python.org/issue35364#msg331065> [2], and a rationale for why we would want to (and arguably already /do/) support subclassing datetime in bpo-32417 <https://bugs.python.org/issue32417#msg331353> [3].

A short version of the strongest rationale for changing how this works is that it is causing inconsistencies in how subclassing is handled in alternate constructors of datetime. For a given subclass of datetime (which I will call DateTimeSub), nearly all alternate constructors already support subclasses correctly - DateTimeSub.fromtimestamp(x) will return a DateTimeSub, for example. However, because DateTimeSub + timedelta returns datetime, any alternate constructor implemented in terms of timedelta additions will leak that implementation detail by returning a datetime object instead of the subclass. The biggest problem is that datetime.fromutc is defined in terms of timedelta addition, so DateTimeSub.now() returns a DateTimeSub object, but DateTimeSub.now(timezone.utc) returns a datetime object!
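The inconsistency described above can be checked directly. This is a minimal sketch, assuming a hypothetical do-nothing subclass named DateTimeSub; the helper type_name is also made up for illustration. The printed types depend on the interpreter version the script is run on, so no expected output is given.

```python
import sys
from datetime import datetime, timedelta, timezone


class DateTimeSub(datetime):
    """Hypothetical subclass with no behaviour of its own."""


def type_name(obj):
    """Return the name of the exact type of obj (illustrative helper)."""
    return type(obj).__name__


# Alternate constructor without a tzinfo argument.
naive_now = DateTimeSub.now()
# Alternate constructor that goes through tzinfo.fromutc().
aware_now = DateTimeSub.now(timezone.utc)
# Plain timedelta arithmetic on a subclass instance.
shifted = DateTimeSub(2019, 1, 2) + timedelta(days=1)

print("interpreter:", sys.version_info[:2])
print("DateTimeSub.now()             ->", type_name(naive_now))
print("DateTimeSub.now(timezone.utc) ->", type_name(aware_now))
print("DateTimeSub + timedelta       ->", type_name(shifted))
```

If the last two lines print datetime while the first prints DateTimeSub, the interpreter has the behaviour this thread proposes to change.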
This is one of the most annoying things to work around when building a datetime subclass, and I don't know of any situation where someone /wants/ their subclass to be lost on addition with a timedelta.

From my understanding, this has been discussed before and the original objection was that this implementation assumes that the datetime subclass has a constructor with the same (or a sufficiently similar) signature as datetime. This may be a legitimate gripe, but unfortunately that ship has sailed long ago. All of datetime's alternate constructors make this assumption. Any subclass that does not meet this requirement must have worked around it long ago (or they don't care about alternate constructors).

Thanks for your attention, I look forward to your replies.

Best,
Paul

[1] https://mail.python.org/archives/list/datetime-sig@python.org/thread/TGB3VZS...
[2] https://bugs.python.org/issue35364#msg331065
[3] https://bugs.python.org/issue32417#msg331353

On Wed, Jan 2, 2019 at 10:18 PM Paul Ganssle <paul@ganssle.io> wrote:
While this was used as a possible rationale for the way standard types behave, the main objection to changing datetime classes is that it will make them behave differently from builtins. For example:
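A minimal sketch of the builtin behaviour being referred to, assuming a hypothetical float subclass named F: the alternate constructor float.fromhex keeps the subclass, while arithmetic between two F instances produces a plain float.

```python
class F(float):
    """Hypothetical float subclass with no behaviour of its own."""


# Alternate constructor (a classmethod inherited from float).
from_hex = F.fromhex("0x1.8p1")
# Arithmetic between two instances of the subclass.
total = F(1.0) + F(2.0)

print(type(from_hex).__name__)  # F
print(type(total).__name__)     # float
```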
> This may be a legitimate gripe, but unfortunately that ship has sailed long ago.
This is right, but the same argument is equally applicable to int, float, etc. subclasses. If you want to limit your change to datetime types you should explain what makes these types special.

I can think of many reasons why datetime is different from builtins, though to be honest I'm not sure that consistency for its own sake is really a strong argument for keeping a counter-intuitive behavior - and I'm open to the idea that /all/ arithmetic types /should/ have some form of this change.

That said, I would say that the biggest difference between datetime and builtins (other than the fact that datetime is /not/ a builtin, and as such doesn't necessarily need to be categorized in this group) is that, unlike almost all other arithmetic types, /datetime/ has a special, dedicated type for describing differences in datetimes. Using your example of a float subclass, consider that without the behavior of "addition of floats returns floats", it would be hard to predict what would happen in this situation:
F(1.2) + 3.4
Would that always return a float, even though F(1.2) + F(3.4) returns an F? Would that return an F because F is the left-hand operand? Would it return a float because float is the right-hand operand? Would you walk the MROs and find the lowest type in common between the operands and return that? It's not entirely clear which subtype predominates. With datetime, you have:

    datetime - datetime -> timedelta
    datetime ± timedelta -> datetime
    timedelta ± timedelta -> timedelta

There's no operation between two datetime objects that would return a datetime object, so it's always clear: operations between datetime subclasses return timedelta, operations between a datetime object and a timedelta return the subclass of the datetime that it was added to or subtracted from.

Of course, the real way to resolve whether datetime should be different from int/float/string/etc is to look at why this choice was actually made for those types in the first place, and decide whether datetime is like them /in this respect/. The heterogeneous operations problem may be a reasonable justification for leaving the other builtins alone but changing datetime, but if someone knows of other fundamental reasons why the decision to have arithmetic operations always create the base class was chosen, please let me know.

Best,
Paul

On 1/5/19 3:55 AM, Alexander Belopolsky wrote:
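The difference between the float case and the datetime case can be sketched as follows. The subclasses F and DT are hypothetical, and only the results that do not depend on the proposal are annotated; the type of a DT plus a timedelta is exactly what is under discussion, so it is printed without a stated expectation.

```python
from datetime import datetime, timedelta


class F(float):
    """Hypothetical float subclass."""


class DT(datetime):
    """Hypothetical datetime subclass."""


# Mixed operands: one F and one plain float.
mixed = F(1.2) + 3.4
print(type(mixed).__name__)       # float

# Two value objects never combine into another value object;
# their difference is the dedicated delta type.
difference = DT(2019, 1, 6) - DT(2019, 1, 5)
print(type(difference).__name__)  # timedelta

# Delta plus delta stays a delta.
delta_sum = timedelta(hours=1) + timedelta(minutes=30)
print(type(delta_sum).__name__)   # timedelta

# Value plus delta: the case the proposal is about.
shifted = DT(2019, 1, 5) + timedelta(days=1)
print(type(shifted).__name__)
```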

From my perspective datetime classes are even more complex than int/float. Let's assume we have
    class DT(datetime.datetime): ...
    class TD(datetime.timedelta): ...

What is the result type for the following expressions?

    DT - datetime
    DT - DT
    DT + TD
    DT + timedelta

I have a feeling that the question has no generic answer. For a *particular* implementation you can override __add__, __sub__ and the other arithmetic operations, and you can do it right now with the current datetime module implementation.

P.S. I think inheritance from datetime classes is a very rare thing, 99.99% of users don't need it.

On Sun, Jan 6, 2019 at 6:03 PM Paul Ganssle <paul@ganssle.io> wrote:
-- Thanks, Andrew Svetlov

On 1/6/19 1:29 PM, Andrew Svetlov wrote:
It is not really complicated: the default "difference between two datetimes" returns a `timedelta`, and you can change that by overriding `__sub__` or `__rsub__` as desired, but there's no reason to think that, just because DT is a subclass of datetime, it would be coupled to a specific timedelta subclass *by default*. Similarly, DT + TD by default will do whatever "datetime" and "timedelta" do unless you specifically override them. In my proposal, adding some time to a datetime subclass would return an object of the datetime subclass, so unless __radd__ or __rsub__ were overridden in `timedelta`, that's what would happen. The defaults would be (sensibly):

    DT - datetime -> timedelta
    DT - DT -> timedelta
    DT + TD -> DT
    DT + timedelta -> DT

The only time it would be more complicated is if datetime were defined like this:

    class datetime:
        TIMEDELTA_CLASS = datetime.timedelta
        ...

In which case you'd have the same problem you have with float/int/etc (not a particularly more complicated one). But that's not the case, and there /is/ one obviously right answer. This is not the case with float subclasses, because the intuitive rule is "adding together two objects of the same class gives the same class", which fails when you have two different subclasses. With datetime, you have "adding a delta type to a value type returns an object of the value type", which makes perfect sense, as opposed to "adding a delta type to a value type returns the base value type, even if the base value type was never used".
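The four expressions can be evaluated directly to see what a given interpreter does. This is only a sketch, using do-nothing DT and TD subclasses as in the question. The two subtraction results are timedelta regardless of the proposal; the two addition results are the ones whose type depends on whether the interpreter includes the change.

```python
from datetime import datetime, timedelta


class DT(datetime):
    """Hypothetical datetime subclass with no overrides."""


class TD(timedelta):
    """Hypothetical timedelta subclass with no overrides."""


a = DT(2019, 1, 6, 12, 0)
b = DT(2019, 1, 5, 12, 0)

results = {
    "DT - datetime": a - datetime(2019, 1, 5, 12, 0),
    "DT - DT": a - b,
    "DT + TD": a + TD(hours=1),
    "DT + timedelta": a + timedelta(hours=1),
}

# Print the exact type produced by each expression.
for expression, value in results.items():
    print(f"{expression:15} -> {type(value).__name__}")
```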
Both of these points are addressed in my original post, IIRC, but both of these arguments cut both ways. Assuming it's true that this is very rare - the 0.01% of people who /are/ subclassing datetime either don't care about this behavior or want timedelta arithmetic to return their subclass. It's rare enough that there should be no problem giving them what they want. Similarly, the rarest group - people who are creating datetime subclasses /and/ want the original behavior - can simply implement __add__ and __sub__ to get what they want, so there's no real conflict, it's just a matter of setting a sane default that also solves the problem that datetime alternate constructors tend to leak their implementation details because of the arithmetic return type issue. Best, Paul
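For the rarest group mentioned above, opting back into base-class results could look roughly like this. It is a sketch under assumptions, not code from the PR: the class name BaseReturning and the helper _as_base are hypothetical, and datetime.combine is used only as one convenient way to rebuild a plain datetime that keeps tzinfo and fold.

```python
from datetime import datetime, timedelta


class BaseReturning(datetime):
    """Hypothetical subclass whose timedelta arithmetic yields plain datetime."""

    @staticmethod
    def _as_base(value):
        # Rebuild a plain datetime; combine() carries tzinfo and fold across.
        return datetime.combine(value.date(), value.timetz())

    def __add__(self, other):
        result = super().__add__(other)
        if result is NotImplemented:
            return result
        return self._as_base(result)

    # timedelta + BaseReturning ends up here as well.
    __radd__ = __add__

    def __sub__(self, other):
        result = super().__sub__(other)
        if isinstance(result, datetime):
            # self - timedelta: convert back to the base class.
            return self._as_base(result)
        # self - datetime gives a timedelta; anything else is NotImplemented.
        return result


value = BaseReturning(2019, 1, 6, 12, 0)
print(type(value + timedelta(days=1)).__name__)
print(type(value - timedelta(days=1)).__name__)
print(type(value - datetime(2019, 1, 5, 12, 0)).__name__)
```

Both __add__ and __sub__ are overridden because an implementation is free to handle subtraction of a timedelta without going through __add__.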

I don't think datetime and builtins like int necessarily need to be aligned. But I do see a problem -- the __new__ and __init__ methods defined in the subclass (if any) should allow for being called with the same signature as the base datetime class. Currently you can have a subclass of datetime whose __new__ has no arguments (or, more realistically, interprets its arguments differently). Instances of such a class can still be added to a timedelta. The proposal would cause this to break (since such an addition has to create a new instance, which calls __new__ and __init__). Since this is a backwards incompatibility, I don't see how it can be done -- and I also don't see many use cases, so I think it's not worth pursuing further. Note that the same problem already happens with the .fromordinal() class method, though it doesn't happen with .fromdatetime() or .now():
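A sketch of the kind of subclass at issue, under assumptions: Labeled is a hypothetical class whose __new__ takes an extra leading argument, and outcome is an illustrative helper that reports either the exact result type or the TypeError raised. What the addition line prints depends on the interpreter version, so no expected output is shown for it.

```python
from datetime import datetime, timedelta


class Labeled(datetime):
    """Hypothetical subclass whose constructor signature differs from datetime's."""

    def __new__(cls, label, year, month, day):
        self = super().__new__(cls, year, month, day)
        self.label = label
        return self


def outcome(operation):
    """Run operation() and report the exact result type, or the TypeError raised."""
    try:
        return type(operation()).__name__
    except TypeError as exc:
        return f"TypeError: {exc}"


stamp = Labeled("release", 2019, 1, 6)

# An alternate constructor that calls cls(year, month, day).
print("fromordinal:", outcome(lambda: Labeled.fromordinal(737065)))
# Addition: works if the result is built as a plain datetime,
# fails if the result is built by calling the subclass constructor.
print("add:", outcome(lambda: stamp + timedelta(days=1)))
```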
On Sun, Jan 6, 2019 at 9:05 AM Paul Ganssle <paul@ganssle.io> wrote:
-- --Guido van Rossum (python.org/~guido)

I did address this in the original post - the assumption that the subclass constructor will have the same arguments as the base constructor is baked into many alternate constructors of datetime. I acknowledge that this is a breaking change, but it is a small one - anyone creating such a subclass that /cannot/ handle the class being created this way would be broken in myriad ways.

We have also in recent years changed several alternate constructors (including `replace`) to retain the original subclass, which by your same standard would be a breaking change. I believe there have been no complaints. In fact, between Python 3.6 and 3.7, the very example you showed broke:

Python 3.6.6:
Python 3.7.2:
We haven't seen any bug reports about this sort of thing; what we /have/ been getting is bug reports that subclassing datetime doesn't retain the subclass in various ways (because people /are/ using datetime subclasses). This is likely to cause very little in the way of problems, but it will improve convenience for people making datetime subclasses and almost certainly performance for people using them (e.g. pendulum and arrow, which now need to take a slow pure python route in many situations to work around this problem).

If we're /really/ concerned with this backward compatibility breaking, we could do the equivalent of:

    try:
        return new_behavior(...)
    except TypeError:
        warnings.warn("The semantics of timedelta addition have "
                      "changed in a way that raises an error in "
                      "this subclass. Please implement __add__ "
                      "if you need the old behavior.",
                      DeprecationWarning)

Then after a suitable notice period drop the warning and turn it to a hard error.

Best,
Paul

On 1/6/19 1:43 PM, Guido van Rossum wrote:
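The transition idea can be sketched in pure Python as below. This is an illustration under assumptions rather than the code in the PR: add_preserving_subclass, Plain and Labeled are hypothetical names, and returning the base-class result after warning is one possible reading of "the old behavior" during the notice period.

```python
import warnings
from datetime import datetime, timedelta


def add_preserving_subclass(value, delta):
    """Hypothetical helper: value + delta as type(value), else plain datetime."""
    # Do the arithmetic on a plain datetime so the result type is known.
    base_result = datetime.combine(value.date(), value.timetz()) + delta
    try:
        # New behaviour: rebuild the result through the subclass constructor.
        return type(value)(
            base_result.year, base_result.month, base_result.day,
            base_result.hour, base_result.minute, base_result.second,
            base_result.microsecond, base_result.tzinfo,
            fold=base_result.fold,
        )
    except TypeError:
        # Transition period: warn and fall back to the base-class result.
        warnings.warn(
            "The semantics of timedelta addition have changed in a way "
            "that raises an error in this subclass. Please implement "
            "__add__ if you need the old behavior.",
            DeprecationWarning,
            stacklevel=2,
        )
        return base_result


class Plain(datetime):
    """Hypothetical subclass that keeps datetime's constructor signature."""


class Labeled(datetime):
    """Hypothetical subclass with an incompatible constructor signature."""

    def __new__(cls, label, year, month, day):
        self = super().__new__(cls, year, month, day)
        self.label = label
        return self


print(type(add_preserving_subclass(Plain(2019, 1, 6), timedelta(days=1))).__name__)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = add_preserving_subclass(Labeled("x", 2019, 1, 6), timedelta(days=1))
print(type(fallback).__name__, len(caught))
```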

On Sun, 6 Jan 2019 at 11:00, Paul Ganssle <paul@ganssle.io> wrote:
To help set expectations, the current semantics are not a bug and so the proposal isn't fixing a bug but proposing a change in semantics.
We very much do care. Because this isn't a bug but a voluntary semantic change that you're proposing, we can't blindly break people who are relying on the current semantics. We need to have a justification for those people as to why we have decided to change the semantics now after all of these years, as well as provide an upgrade path.

-Brett

Brett,

Thank you for bringing this up, but I think you /may/ have misunderstood my position - though maybe you understood the thrust and wanted to clarify for people coming in halfway, which I applaud.

I proposed this change /knowing/ that it was a breaking change - it's why I brought it to the attention of datetime-SIG and now python-dev - and I believe that there are several factors that make this a smaller compatibility problem than it seems.

One such factor is the fact that /many/ other features of `datetime`, including the implementation of `datetime.now()`, are /already broken/ in the current implementation for anyone who would be broken by this particular aspect of the semantic change. That is not saying that it's impossible that there is code out there that will break if this change goes through, it's just saying that the scope of the breakage is necessarily very limited.

The reason I brought up the bug tracker is that between Python 3.6 and Python 3.7, we in fact made a similar breaking change to the one I'm proposing here without thinking that anyone might be relying on the fact that they could do something like:

    class D(datetime.datetime):
        def __new__(cls):
            return cls.now()

My point was that there have been no bug reports about the /existing change/ that Guido was bringing up (his example itself does not work on Python 3.7!), which leads me to believe that few if any people are relying on the fact that it is possible to define a datetime subclass with a different default constructor.

As I mentioned, it is likely possible to have a transition period where this would still work even if the subclassers have not created their own __add__ method. There is no way to create a similar deprecation/transition period for people relying on the fact that `type(datetime_obj + timedelta_obj) == datetime.datetime`, but I think this is honestly a sufficiently minor breakage that the good outweighs the harm.

I will note that we have already made several such changes with respect to alternate constructors even though technically someone could have been relying on the fact that `MyDateTime(*args).replace(month=3)` returns a `datetime` object. This is not to say that we should lightly make the change (hence my canvassing for opinions), it is just that there is a good amount of evidence that, practically speaking, no one is relying on this, and in fact it is likely that people are writing code that assumes that adding `timedelta` to a datetime subclass returns the original subclass, either directly or indirectly - I think we're likely to fix more people than we break if we make this change.

Best,
Paul

On 1/6/19 3:24 PM, Brett Cannon wrote:

Hey all,

This thread about the return type of datetime operations seems to have stopped without any explicit decision - I think I responded to everyone who had objections, but I think only Guido has given a +1 to whether or not we should go ahead.

Have we got agreement to go ahead with this change? Are we still targeting Python 3.8 here?

For those who don't want to dig through your old e-mails, here's the archive link for this thread: https://mail.python.org/pipermail/python-dev/2019-January/155984.html

If you want to start commenting on the actual implementation, it's available here (though it's pretty simple): https://github.com/python/cpython/pull/10902

Best,
Paul

On 1/6/19 7:17 PM, Guido van Rossum wrote:

There's already a PR, actually, #10902: https://github.com/python/cpython/pull/10902 Victor reviewed and approved it, I think before I started this thread, so now it's just waiting on merge. On 2/4/19 11:38 AM, Guido van Rossum wrote:

participants (5)
- Alexander Belopolsky
- Andrew Svetlov
- Brett Cannon
- Guido van Rossum
- Paul Ganssle