Back to the subject of how to handle +-HH:MM, I think the only really viable candidates are %z and %:z, so I think the question boils down to whether, with strptime, we care more about consistency with GNU / glibc's strptime (which apparently does implement %z to cover both HHMM and HH:MM) or whether we care more about users being able to specify *exactly* the string they want to match (e.g. allowing users to specify that a colon found in a time zone offset is an error condition).
I'm slightly leaning towards %:z because changing the semantics of %z could be construed as a backwards-incompatible change (albeit a minor one). I know some people have been asking for a "strict" version of the dateutil parser, and people do tend to use parsers for string validation. Adding the %:z option has the advantage that it's unambiguously backwards compatible, and it can be added to strftime if that is deemed desirable.
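To make the "strict" option concrete, here is a minimal sketch of what separate directives would let a user express. The helper parse_offset and its colon flag are hypothetical names used only for illustration; they are not part of datetime or of any proposed patch.

```python
import re
from datetime import timedelta, timezone

# Hypothetical patterns, one per directive under discussion.
_OFFSET_HHMM = re.compile(r"([+-])(\d{2})(\d{2})")      # strict %z: +HHMM
_OFFSET_HH_MM = re.compile(r"([+-])(\d{2}):(\d{2})")    # proposed %:z: +HH:MM


def parse_offset(text, colon):
    """Parse a UTC offset, accepting exactly one of the two spellings."""
    pattern = _OFFSET_HH_MM if colon else _OFFSET_HHMM
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError("offset %r does not match the requested form" % text)
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return timezone(-delta if sign == "-" else delta)
```

With this shape, a caller validating input can treat "+0530" as an error when the colon form is required, which is the behaviour a lenient %z could not offer.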
Best,
Paul
On 10/21/2017 09:07 AM, Mario Corchero wrote:
> Sorry, hit send by mistake on the previous message.
>
> That is fine for parsing, but my issue with this is symmetry with strftime.
>
>
> I can agree with having a %:z for support in strftime but I think that is a
> separate change. The issue I opened with the attached PR focused only on
> strptime to facilitate the discussion.
>
> Again, what is the alternative?
>
>
> Making %z accept time-offset rfc3339 compatible.
>
> I have a working strptime:
>
>
> Ouch, except for the fractional seconds (which were not part of the issue
> raised) I also had a patch for the colon and another for supporting 'Z' as
> reported in the bug tracker. I was mentioning working with Paul on the
> implementation of isoparse, as even if it might look simple it has caused
> many long-standing discussions in the past.
>
> On 21 October 2017 at 13:55, Mario Corchero <mariocj89@gmail.com> wrote:
>
>>
>>
>> On 21 October 2017 at 13:18, Oren Tirosh <orent@hishome.net> wrote:
>>
>>>
>>> On Sat, 21 Oct 2017 at 13:24, Mario Corchero <mariocj89@gmail.com> wrote:
>>>
>>>> My opinion (as a user, I have no authority here whatsoever)
>>>>
>>>> *1) About parsing colons in offsets with strptime*
>>>>
>>>> I think having %z support both +-HH:MM and +-HHMM would be the best
>>>> choice, as it seems the simplest for me as a user.
>>>> I'd go even further, making %z support ':' and 'Z', *a la glibc*.
>>>> This effectively means that %z can now parse: Z, ±hh:mm, ±hhmm, or ±hh
>>>>
>>>
>>> That is fine for parsing, but my issue with this is symmetry with
>>> strftime. If the same extensions are also implemented for formatting (I
>>> have a prototype) then you need some way to specify whether you want a :
>>> separator or not. The %z will have to remain without colon on formatting
>>> for backward compatibility.
>>>
>>> So I agree that the parser can be safely made more liberal in what it
>>> accepts, but the formatter must be strict and specific in what it produces.
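A sketch of the formatting side of this point: the formatter has to be told which spelling to emit. The helper format_offset and its colon parameter are hypothetical and stand in for what separate %z and %:z directives would produce in strftime.

```python
from datetime import datetime, timedelta, timezone


def format_offset(dt, colon=False):
    """Render dt's UTC offset as +HHMM or +HH:MM; empty string if naive."""
    offset = dt.utcoffset()
    if offset is None:
        return ""  # same as what %z yields for a naive datetime
    sign = "-" if offset < timedelta(0) else "+"
    hours, remainder = divmod(abs(offset), timedelta(hours=1))
    minutes = remainder // timedelta(minutes=1)
    separator = ":" if colon else ""
    return "%s%02d%s%02d" % (sign, hours, separator, minutes)


aware = datetime(2017, 10, 21, 9, 7, tzinfo=timezone(timedelta(hours=5, minutes=30)))
without_colon = format_offset(aware)            # matches strftime("%z")
with_colon = format_offset(aware, colon=True)   # the form isoformat() uses
```

Sub-minute offsets are ignored here to keep the sketch short.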
>>>
>>> I think this gives the best experience to the strptime user. It
>>>> basically makes the time-offset rfc3339
>>>> <https://tools.ietf.org/html/rfc3339> compatible.
>>>>
>>>
>>> Yes, that's the goal.
>>>
>>> *2) Adding a handy function to build a datetime from a string serialized
>>>> with isoformat*
>>>> Absolutely agree on having an isoparse. That would be amazing, we can
>>>> even build it on top of 1).
>>>>
>>>
>>> ...and building it on top of 1 requires several extensions and variants.
>>> People here seem to be a bit taken aback by the scope of these extensions.
>>> I understand this reaction, but I maintain that most or all this complexity
>>> is necessary if you want to implement this on top of strptime rather than a
>>> custom isoparse().
>>>
>>> *Side note:*
>>>> I am not totally in favour with "%?:z" (probably because I am leaning
>>>> on %z doing the parsing for both and ?z will have no place on strftime).
>>>> I think this starts to add way too much complexity to just say "parse a
>>>> time-offset".
>>>>
>>>
>>> Again, what is the alternative? If you want a parser that accepts the
>>> output of isoformat() for all possible datetime values (except custom
>>> tzinfo) then it needs to support a missing tz offset as indicating a naive
>>> timestamp.
>>>
>>> You can say that the real source of the asymmetry here is not with my
>>> proposal but rather in the underlying strftime/strptime: on formatting, %z
>>> yields an empty string for a naive timestamp rather than producing an
>>> error. But on parsing, it refuses to parse a timestamp with no offset. A
>>> truly symmetric implementation would have accepted it as a naive
>>> timestamp.
>>>
>>> Too late for %z because it must remain backward compatible, but perhaps
>>> %:z can be made to accept a missing offset as a naive timestamp. The user
>>> can then check for naive timestamps and reject them if they are unacceptable
>>> in that context, rather than specifying whether a missing offset is
>>> acceptable or not in the format string. I have no problem with either
>>> solution.
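The asymmetry described above can be sketched with the existing API. The helper parse_maybe_aware is a hypothetical workaround, not a proposal: it tries the aware format first and falls back to the naive one, leaving it to the caller to reject naive results.

```python
from datetime import datetime, timedelta

naive = datetime(2017, 10, 21, 9, 7)
# On formatting, %z contributes nothing for a naive value.
formatted = naive.strftime("%Y-%m-%d %H:%M%z")

# On parsing, the same format string refuses the string it just produced.
try:
    datetime.strptime(formatted, "%Y-%m-%d %H:%M%z")
    round_trips = True
except ValueError:
    round_trips = False


def parse_maybe_aware(text):
    """Try the format with an offset, then the format without one."""
    for fmt in ("%Y-%m-%d %H:%M%z", "%Y-%m-%d %H:%M"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised timestamp: %r" % text)
```

A caller that needs aware values can then check whether utcoffset() is None on the result.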
>>>
>>>>
>>>> *Implementation:*
>>>> I am happy to work with PaulG in the isoparse implementation if we
>>>> decide to go with it and if he wants to get involved :)
>>>>
>>>
>>> I have a working strptime:
>>> https://github.com/orent/cpython/tree/strptime_extensions
>>>
>>> isoparse() on top of this strptime is a trivial one-liner.
>>>
>>> Oren
>>>
>>>>
>>>>
>>>> *Thanks:*
>>>> Thanks for dedicating time to this, I think that even if minor this
>>>> would be a killer addition to 3.7 if we manage to get it through.
>>>>
>>>> On 21 October 2017 at 07:34, Oren Tirosh <orent@hishome.net> wrote:
>>>>
>>>>> ok, let's try to separate the issues and choices on each one:
>>>>>
>>>>> 1. Extending strptime to support time zone offset with : separator:
>>>>> Should a single directive accept either hhmm or hh:mm or use two
>>>>> separate directives?
>>>>>
>>>>> 2. Round tripping of isoformat() back to datetime value:
>>>>> Implement custom isoparse() function or extend strptime so isoparse
>>>>> simply calls strptime with a default format?
>>>>> Support all variations produced by isoformat or just a subset?
>>>>> (Variations include with/without fraction, with/without tz and separator
>>>>> choice)
>>>>>
>>>>> I suggest 1 separate directives 2a extend strptime and 2b support all
>>>>> variations. Do you have different preferences on any of these questions?
>>>>>
>>>>> I understand that the number of extensions to support this seems
>>>>> excessive to you.
>>>>>
>>>>> Technically, my proposed "%.f" is not really necessary. I added it for
>>>>> completeness. We can keep using ".%f" for non-optional fraction and define
>>>>> "%?f" to implicitly include the dot.
>>>>>
>>>>> The distinction between "%z", "%:z" and "%?:z" can also be narrowed
>>>>> down. This can be done, for example, by making "%z" and "%?s" always accept
>>>>> hhmm with or without the : separator.
>>>>>
>>>>> On Fri, 20 Oct 2017 at 17:16, Paul G <paul@ganssle.io> wrote:
>>>>>
>>>>>> I think this would be a much bigger change to the strptime interface
>>>>>> than is actually warranted, and probably would add in additional,
>>>>>> unnecessary complexity by introducing the concept of optional matches.
>>>>>> Adding the capability to match HH:MM offsets is a reasonable extension
>>>>>> partially because that is a standard representation that is currently *not*
>>>>>> covered by strptime, and the fact that that's how isoformat() represents
>>>>>> the offset just makes this lack all the more acute.
>>>>>>
>>>>>> I think it should be uncontroversial to add *one* of these two %z
>>>>>> extensions to Python 3 without getting bogged down in allowing a single
>>>>>> strptime string to match any output from `.isoformat`.
>>>>>>
>>>>>> That said, I'm also very much in favor of a `.isoparse` or
>>>>>> `.fromisoformat` constructor that *is* the inverse of `isoformat`, which
>>>>>> should solve the issue without sweeping changes to how `strptime` works.
>>>>>>
>>>>>> On 10/19/2017 04:07 PM, Oren Tirosh wrote:
>>>>>>> https://github.com/orent/cpython/tree/strptime_extensions
>>>>>>>
>>>>>>> %:z - matches +HH:MM
>>>>>>> %?:z - optional %:z
>>>>>>> %.f - equivalent to .%f
>>>>>>> %?.f - optional %.f
>>>>>>> %?t - matches ' ' or 'T'
>>>>>>>
>>>>>>> What they all have in common is that together they make it possible
>>>>>> to
>>>>>>> write a strptime format that matches all possible output variations
>>>>>> of
>>>>>>> datetime.__str__/ datetime.isoformat.
>>>>>>>
>>>>>>> The time zone not only supports the : separator but also allows
>>>>>> making the
>>>>>>> entire component optional, as isoformat() will add it only for aware
>>>>>>> datetime objects. The seconds fraction is dropped from the default
>>>>>> string
>>>>>>> representation if the datetime represents a whole second. Since it is
>>>>>>> dropped along with the decimal dot, I first made "%.f" that includes
>>>>>> the
>>>>>>> dot and then created the optional variant. Finally, "%?t" can be
>>>>>> used to
>>>>>>> accept a timestamp with either of the separators defined in iso8601.
>>>>>>>
>>>>>>> It is quite absurd that datetime cannot parse its own string
>>>>>>> representation. Using these extensions an .isoparse() method may be
>>>>>> added
>>>>>>> that calls strptime('%Y-%m-%d%?t%H:%M:%S%?.f%?:z') and supports full
>>>>>>> round-tripping of all possible datetime values that do not use a
>>>>>> custom
>>>>>>> tzinfo.
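For comparison, a rough sketch of the same round trip without any new directives. The function isoparse_sketch is hypothetical: it uses a regular expression to detect which optional pieces are present, then builds a plain strptime format to match. It assumes whole-minute offsets and does not handle custom tzinfo.

```python
import re
from datetime import datetime, timedelta, timezone

# date, ' ' or 'T', time, optional 6-digit fraction, optional +HH:MM offset
_ISO = re.compile(
    r"(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2}:\d{2})(\.\d{6})?([+-]\d{2}:\d{2})?"
)


def isoparse_sketch(text):
    """Parse the output of datetime.isoformat() or str(datetime)."""
    match = _ISO.fullmatch(text)
    if match is None:
        raise ValueError("not an isoformat() string: %r" % text)
    date, time, fraction, offset = match.groups()
    value = date + "T" + time
    fmt = "%Y-%m-%dT%H:%M:%S"
    if fraction:
        value += fraction
        fmt += ".%f"
    if offset:
        # Drop the colon so the existing +HHMM form of %z can parse it.
        value += offset.replace(":", "")
        fmt += "%z"
    return datetime.strptime(value, fmt)
```

The branching on fraction and offset is what the optional directives would fold into a single format string.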
>>>>>>>
>>>>>>> Oren
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, 19 Oct 2017 at 17:06, Paul G <paul@ganssle.io> wrote:
>>>>>>>>
>>>>>>>> There is a new issue about the %z directive in strptime on the issue
>>>>>>> tracker: https://bugs.python.org/issue31800 (linked to a few related
>>>>>>> issues), and a linked PR expanding the definition of %z to match
>>>>>> HH:MM:
>>>>>>> https://github.com/python/cpython/pull/4015
>>>>>>>>
>>>>>>>> I think either adding a %:z directive or expanding the definition
>>>>>> of %z
>>>>>>> would be pretty important, and I think there's a good case to be
>>>>>> made for
>>>>>>> either one. To summarize the arguments for people on the mailing
>>>>>> list:
>>>>>>>>
>>>>>>>> The argument for expanding the definition of %z that I find
>>>>>> strongest is
>>>>>>> that according to the linux man pages (
>>>>>>> http://man7.org/linux/man-pages/man3/strptime.3.html ), while %z
>>>>>> generates
>>>>>>> +-HHMM in strftime, strptime is supposed to match "An RFC-822/ISO
>>>>>> 8601
>>>>>>> standard timezone specification", and ISO 8601 uses +-HH:MM, so if
>>>>>> we're
>>>>>>> following those linux pages, we should be accepting the version with
>>>>>> the
>>>>>>> colon.
>>>>>>>>
>>>>>>>> The arguments that I find most compelling for adding a %:z directive
>>>>>> are:
>>>>>>>>
>>>>>>>> 1. maintains the symmetry between strftime and strptime
>>>>>>>> 2. allows users to be stricter about their datetime format
>>>>>>>> 3. has precedent in that GNU's `date` command accepts %z, %:z
>>>>>> and
>>>>>>> %::z formats
>>>>>>>>
>>>>>>>> Can we establish some consensus on which should be done so that it
>>>>>> can be
>>>>>>> implemented?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Paul
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Datetime-SIG mailing list
>>>>>>>> Datetime-SIG@python.org
>>>>>>>> https://mail.python.org/mailman/listinfo/datetime-sig
>>>>>>>> The PSF Code of Conduct applies to this mailing list:
>>>>>>> https://www.python.org/psf/codeofconduct/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>
>
>
>
>