[Python-Dev] iso8601 parsing

Chris Barker - NOAA Federal chris.barker at noaa.gov
Thu Dec 7 20:57:23 EST 2017


>but is it that hard to parse arbitrary ISO8601 strings once you've gotten
this far? It's a bit uglier than I'd like, but not THAT bad a spec.


No, and in fact this PR is adapted from a *more general* ISO-8601 parser
that I wrote (which is now merged into master on python-dateutil). In the
CPython PR I deliberately limited it to be the inverse of `isoformat()` for
two major reasons:

1. It allows us to get something out there that everyone can agree on - not
only would we have to agree on whether to support arcane ISO8601 formats
like YYYY-Www-D,


I don’t know — would anyone complain about it supporting too arcane a
format?

Also, supporting “most ISO-compliant” date-time strings would get us a
long way.
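
For anyone who hasn't met the YYYY-Www-D week-date form mentioned above, here is a minimal sketch of what such a string looks like. The helper name `to_iso_week_date` is made up for illustration and is not part of the PR; it only formats a date and does not parse one.

```python
from datetime import date


def to_iso_week_date(d: date) -> str:
    """Format a date in the ISO 8601 week-date form YYYY-Www-D (hypothetical helper)."""
    # isocalendar() yields (ISO year, ISO week number, ISO weekday 1..7).
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return f"{iso_year:04d}-W{iso_week:02d}-{iso_weekday}"


# The date this message was sent, written as a week date.
print(to_iso_week_date(date(2017, 12, 7)))  # 2017-W49-4
```

Note that the ISO year can differ from the calendar year near January 1, which is part of what makes this form awkward to support.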

but we also have to then discuss whether we want to be strict and disallow
YYYYMM like ISO-8601 does,


Well, I think disallowing something has little utility - we really don’t
want this to be a validator.

do we want fractional minute support? What about different variations
(we're already supporting replacing T with any character in `.isoformat()`
and outputting time zones in the form hh:mm:ss), so what other non-compliant
variations do we want to add?


Wait — does datetime.isoformat() put out non-compliant strings?

Anyway, supporting all of what .isoformat() puts out, plus most of ISO 8601,
would be a great start.
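
To make the two variations concrete, here is a small sketch of `isoformat()` output with a non-T separator and with a UTC offset that includes seconds. It assumes Python 3.7 or later, where `timezone` accepts sub-minute offsets; the particular offset is arbitrary and chosen only to trigger the hh:mm:ss form.

```python
from datetime import datetime, timedelta, timezone

# An offset with a seconds component (assumption: Python 3.7+).
tz = timezone(timedelta(hours=1, seconds=30))
dt = datetime(2017, 12, 7, 20, 57, 23, tzinfo=tz)

default_form = dt.isoformat()        # T separator, offset rendered as hh:mm:ss
spaced_form = dt.isoformat(sep=" ")  # any single character may replace the T

print(default_form)  # 2017-12-07T20:57:23+01:00:30
print(spaced_form)   # 2017-12-07 20:57:23+01:00:30
```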

 - if it comes out of `isoformat()` it should be able to go back in
through `fromisoformat()`.


Yup.

But has anyone raised objections to it being more flexible?

2. It makes it *much* easier to understand what formats are supported. You
can say, "This function is for reading in dates serialized with
`.isoformat()`", and you *immediately* know how to create compliant dates.


We could still document that as the preferred form.

You’re writing the code, and I don’t have time to help, so by all means do
what you think is best.

But if you’ve got code that’s more flexible, I can’t imagine anyone
complaining about a more flexible parser.

Though I have a limited imagination about such things.

But I hope it will at least accept both with and without the T.
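
As a quick check of that case, here is a sketch that assumes the behaviour of `datetime.fromisoformat()` as it was eventually released in Python 3.7, which may differ from the PR as it stood when this was written. There, the separator between date and time can be any single character.

```python
from datetime import datetime

# Same instant, written with a T and with a space between date and time.
with_t = datetime.fromisoformat("2017-12-07T20:57:23")
with_space = datetime.fromisoformat("2017-12-07 20:57:23")

# Both spellings parse to the same naive datetime.
assert with_t == with_space == datetime(2017, 12, 7, 20, 57, 23)
```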

Thanks for working on this.

-Chris

On 12/07/2017 08:12 PM, Chris Barker wrote:


Here is the PR I've submitted:


https://github.com/python/cpython/pull/4699


The contract that I'm supporting (and, I think it can be argued, the only
reasonable contract in the initial implementation) is the following:


   dtstr = dt.isoformat(*args, **kwargs)
   dt_rt = datetime.fromisoformat(dtstr)
   assert dt_rt == dt    # The two points represent the same absolute time
   assert dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)   # And the same wall time




that looks good.
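
The contract quoted above can be exercised with a short sketch like the following. It assumes Python 3.7+, where `fromisoformat()` is available; the helper name `roundtrips` and the sample datetime are illustrative only and not taken from the PR.

```python
from datetime import datetime, timedelta, timezone


def roundtrips(dt: datetime, **kwargs) -> bool:
    """Return True if dt survives isoformat() -> fromisoformat() (hypothetical helper)."""
    dtstr = dt.isoformat(**kwargs)
    dt_rt = datetime.fromisoformat(dtstr)
    same_absolute_time = dt_rt == dt
    same_wall_time = dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)
    return same_absolute_time and same_wall_time


dt = datetime(2017, 12, 7, 20, 57, 23, 123456,
              tzinfo=timezone(timedelta(hours=-5)))

# Lossless variants of isoformat() round-trip.
results = [
    roundtrips(dt),
    roundtrips(dt, sep=" "),
    roundtrips(dt, timespec="microseconds"),
]

# A truncating timespec discards seconds and microseconds, so equality can
# only hold when those fields are already zero.
lossy = roundtrips(dt, timespec="minutes")
```

In other words, the round trip is about the string being readable again; it cannot restore fields that the caller asked `isoformat()` to drop.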



I see this in the comments in the PR:



"""

This does not support parsing arbitrary ISO 8601 strings - it is only intended
as the inverse operation of :meth:`datetime.isoformat`

"""




what ISO8601 compatible features are not supported?


-CHB