Making datetime __str__ and isoformat more consistent

Re http://bugs.python.org/issue19475: I was told to come here. Is there some reason that datetime objects' __str__ and isoformat methods shouldn't emit microseconds in all cases for consistency? As things stand, datetime.strptime() can't reliably parse what those methods emit. Details are on the above (closed) ticket. Here's what I'm talkin' 'bout:
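A minimal sketch of the inconsistency, using made-up timestamps rather than the exact session from the ticket: str() drops the fractional part when the microsecond field is 0, so a single strptime() format containing %f cannot parse both outputs.

```python
from datetime import datetime

# Format that matches str(dt) only when microseconds are present.
FMT = "%Y-%m-%d %H:%M:%S.%f"

with_us = datetime(2013, 11, 1, 11, 15, 30, 250000)
without_us = datetime(2013, 11, 1, 11, 15, 30)

print(str(with_us))     # 2013-11-01 11:15:30.250000
print(str(without_us))  # 2013-11-01 11:15:30

# The round trip works when microseconds were emitted ...
assert datetime.strptime(str(with_us), FMT) == with_us

# ... and fails when they were left out of the string.
try:
    datetime.strptime(str(without_us), FMT)
    roundtrip_ok = True
except ValueError:
    roundtrip_ok = False

print(roundtrip_ok)  # False
```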
Do others agree with me that consistency in this situation is better than the current behavior? Thx, Skip Montanaro

On 11/01/2013 11:15 AM, Skip Montanaro wrote:
I haven't seen the arguments in favor of this awkward behavior, so I may change my mind, but at the moment I would certainly argue for consistency: either emit microseconds in __str__ or ignore remaining microseconds or ignore a trailing %f (or all three ;) . -- ~Ethan~

Thanks for the response. I relented after seeing comments from Guido and Tim. It is highly unlikely that I'd be able to sway the major devs. Modifying __str__ is a complete non-starter, and Guido pushed back a bit on the notion of making isoformat() include microseconds. (He suggested maybe adding an "include microseconds" flag, which I think would be as bad as the current behavior.) I will preformat my datetime objects in code that writes them out, so I avoid the automatic stringification done by the csv module. Skip
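The preformatting workaround could look roughly like this. It is only a sketch: `preformat` and `FMT` are hypothetical names, and the row contents are invented for illustration.

```python
import csv
import io
from datetime import datetime

# strftime's %f always produces six digits, even when they are all zero.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def preformat(row):
    """Return a copy of row with datetimes rendered via FMT (hypothetical helper)."""
    return [v.strftime(FMT) if isinstance(v, datetime) else v for v in row]

buf = io.StringIO()
writer = csv.writer(buf)
# Without preformat(), csv would call str() on the datetime and drop ".000000".
writer.writerow(preformat([datetime(2013, 11, 1, 16, 29, 0), 42]))

line = buf.getvalue().strip()
print(line)  # 2013-11-01 16:29:00.000000,42

# Every timestamp written this way parses with the same format.
stamp = line.split(",")[0]
assert datetime.strptime(stamp, FMT) == datetime(2013, 11, 1, 16, 29, 0)
```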

On 11/1/2013 4:29 PM, Skip Montanaro wrote:
Having repr() always include microseconds makes more sense to me. I have no idea what it does since it was not discussed.
(He suggested maybe adding an "include microseconds" flag, which I think would be as bad as the current behavior.)
Why? Seems like a good solution to me. -- Terry Jan Reedy
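For readers on a later Python: a flag of this kind did eventually appear. To my knowledge, isoformat() has accepted a `timespec` keyword since Python 3.6, which postdates this thread. A short sketch of how it behaves, with invented timestamps:

```python
from datetime import datetime

dt = datetime(2013, 11, 1, 16, 29, 0)

# Default ("auto") still omits a zero microsecond field.
default = dt.isoformat()
# Asking for microseconds explicitly always emits six digits.
always_us = dt.isoformat(timespec="microseconds")
# Asking for seconds drops the fraction (truncated, not rounded).
seconds_only = datetime(2013, 11, 1, 16, 29, 0, 900000).isoformat(timespec="seconds")

print(default)       # 2013-11-01T16:29:00
print(always_us)     # 2013-11-01T16:29:00.000000
print(seconds_only)  # 2013-11-01T16:29:00
```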

On 01/11/2013 19:49, Ethan Furman wrote:
Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? If the microseconds aren't shown by __str__, should it truncate or round? When parsing and there's no %f, should it ignore the microseconds, thus effectively truncating, or parse and round if necessary?
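To make the truncate-versus-round question concrete, here is a small sketch with an invented timestamp. Neither variant is what datetime does on its own; both are written out by hand.

```python
from datetime import datetime, timedelta

# 0.9 s past 19:49:59, the kind of value asked about above.
dt = datetime(2013, 11, 1, 19, 49, 59, 900000)

# Truncation just discards the fraction.
truncated = dt.replace(microsecond=0)

# Rounding (half up) adds half a second first, which can carry into the minute.
rounded = (dt + timedelta(microseconds=500000)).replace(microsecond=0)

print(truncated)  # 2013-11-01 19:49:59
print(rounded)    # 2013-11-01 19:50:00
```

The two results differ by a full second and, here, by a minute boundary, which is why the choice matters for anything that later compares or groups the values.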

On 11/01/2013 01:37 PM, MRAB wrote:
My first (and preferred) option was to emit the microseconds (even if 0). At the heart of this issue is: If datetime.datetime is not going to always emit the microseconds, then there should be an easy roundtrip method to get the value back; currently there is not. Maybe the best fix is to make the csv module smarter about how it outputs datetimes (so it would always emit microseconds). -- ~Ethan~

On 01/11/2013 21:29, Ethan Furman wrote:
The first solution, if anything, please; the latter is the road to hell. "You've done it for datetime objects, so why not xyz?" "A similar thing was done to the csv module, so why not the ijk module?" -- "Python is the second best programming language in the world. But the best has yet to be invented." (Christian Tismer) Mark Lawrence

On Nov 1, 2013, at 14:29, Ethan Furman <ethan@stoneleaf.us> wrote:
Why should the round trip involve strftime? What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat? That means if isoformat sometimes emits microseconds and sometimes doesn't, fromisoformat can take both strings with microseconds and those without. (In fact, there's no reason it couldn't be even more flexible and handle all of the valid variations of ISO format, not just the ones isoformat generates...)
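A classmethod with this name was in fact added later (Python 3.7, as far as I know), so the round trip can be sketched on a newer interpreter. The timestamps are invented for illustration.

```python
from datetime import datetime

originals = [
    datetime(2013, 11, 1, 14, 29, 0),          # isoformat() omits microseconds
    datetime(2013, 11, 1, 14, 29, 0, 123456),  # isoformat() includes them
]

results = []
for original in originals:
    text = original.isoformat()
    # fromisoformat() accepts both shapes that isoformat() produces.
    results.append((text, datetime.fromisoformat(text) == original))

for text, ok in results:
    print(text, ok)
```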

On Fri, Nov 1, 2013 at 8:28 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat?
What about following the lead of int(), float(), etc. and allowing datetime() to take a single string argument?

On Nov 1, 2013, at 17:36, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I have three (not entirely separate) concerns. The first is that this would imply that ISO is _the_ format for datetimes, rather than just _a_ format. If you ask a normal person for an integer, unless he's got the Super Bowl in the brain, int will parse his input. Ask him for a date, and it's very unlikely it'll be in ISO format. The second problem is false positives. "2003" is a valid ISO date string, equivalent to "20030101" or "20030101T00:00:00". But in most contexts you wouldn't want that interpreted as a valid date or datetime. John Nagle's comment on the tracker (http://bugs.python.org/issue15873#msg169966) explains a similar concern. Finally, I'd be happy with a fromisoformat that _only_ handled the output from the isoformat function, but a general constructor would seem incomplete unless it handled all of ISO 8601, or at least all of RFC 3339. That isn't exactly _hard_ (the RFC was meant to be implementable, after all), but it's a higher bar than needed for the problem that started this thread.

On Fri, Nov 1, 2013 at 9:52 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
I don't see how this is a problem. *The* format for Python datetime is the output of print:
It also happened to be ISO compliant.
I would start with date/datetime() just accepting str(x) as input for any date/datetime instance. Full-featured ISO-compliant date/time parsing is better left for third-party packages.
As long as the general constructor can handle str() output it seems to be complete to me.

On 2 November 2013 12:49, MRAB <python@mrabarnett.plus.com> wrote:
It's been a while since I had to deal seriously with date parsing, but at the time, emitting microseconds was a fairly surefire way to break most utilities that nominally supported date parsing. Roundtripping is good, but interoperability is important too, and as far as I am aware, microsecond support when parsing is still sketchy with many date parsing tools. Ensuring that emitting and consuming microseconds is easy would definitely be a good thing, but unless general date parsing support (not just in Python, but in programming utilities in general) has improved more dramatically in recent years than I believe it has, emitting microseconds by default would be a backwards compatibility breach. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 11/02/2013 02:25 AM, Nick Coghlan wrote:
The thread seems to be leaning towards leaving the current __str__ behavior as-is, and simply adding enough smarts to __new__ to be able to reconvert back to a date/datetime instance (whether or not microseconds have been emitted). -- ~Ethan~
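What "smarts in __new__" might look like, as a rough sketch only. `StrDatetime` is a hypothetical subclass, not anything proposed in the thread, and it leans on datetime.fromisoformat(), which requires Python 3.7 or later.

```python
from datetime import datetime

class StrDatetime(datetime):
    """Hypothetical datetime subclass that also accepts its own str() output."""

    def __new__(cls, *args, **kwargs):
        # A lone string argument is treated as str()/isoformat() output.
        if len(args) == 1 and isinstance(args[0], str) and not kwargs:
            p = datetime.fromisoformat(args[0])
            return super().__new__(cls, p.year, p.month, p.day, p.hour,
                                   p.minute, p.second, p.microsecond, p.tzinfo)
        # Anything else goes through the normal constructor unchanged.
        return super().__new__(cls, *args, **kwargs)

plain = StrDatetime(2013, 11, 2, 2, 25, 0)
fractional = StrDatetime(2013, 11, 2, 2, 25, 0, 500000)

# Both forms of str() output reconvert, with or without microseconds.
assert StrDatetime(str(plain)) == plain
assert StrDatetime(str(fractional)) == fractional
```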

On 02/11/2013 19:43, Ethan Furman wrote:
The OP was using strptime, so perhaps the simplest solution would be to allow its 'format' argument to default to None, which would mean ISO format with optional microseconds.
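A sketch of that idea as a standalone function, since strptime itself has no such default. `strptime_iso` is a hypothetical name, and the two formats cover only what str(datetime) emits for naive datetimes.

```python
from datetime import datetime

# The two shapes str() produces for a naive datetime.
_ISO_FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")

def strptime_iso(text, format=None):
    """Like datetime.strptime, but format=None means ISO with optional microseconds."""
    if format is not None:
        return datetime.strptime(text, format)
    for candidate in _ISO_FORMATS:
        try:
            return datetime.strptime(text, candidate)
        except ValueError:
            continue
    raise ValueError("not in str(datetime) format: %r" % (text,))

a = strptime_iso("2013-11-02 19:43:00")
b = strptime_iso("2013-11-02 19:43:00.250000")
print(a)  # 2013-11-02 19:43:00
print(b)  # 2013-11-02 19:43:00.250000
```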

On 2013-11-02, at 02:52 , Andrew Barnert <abarnert@yahoo.com> wrote:
Right, now try that with a float and watch `float(s)` blow up when most Europeans give you `4,63` or something along those lines. Or ask for a large integer, and notice that int really does not like decimal separators.

On 2013-11-02, at 14:13 , Masklinn <masklinn@masklinn.net> wrote:
Or ask for a large integer, and notice that int really does not like decimal separators.
(and by “decimal” I meant “thousands”, sorry about that)
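The float and int behavior described here is easy to check. `parses` is a hypothetical helper used only to keep the sketch short.

```python
def parses(converter, text):
    """Return True if converter(text) succeeds, False if it raises ValueError."""
    try:
        converter(text)
        return True
    except ValueError:
        return False

print(parses(float, "4.63"))     # True
print(parses(float, "4,63"))     # False: comma as decimal separator is rejected
print(parses(int, "1000000"))    # True
print(parses(int, "1,000,000"))  # False: thousands separators are rejected
```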
Participants (10): Alexander Belopolsky, Andrew Barnert, Eric V. Smith, Ethan Furman, Mark Lawrence, Masklinn, MRAB, Nick Coghlan, Skip Montanaro, Terry Reedy