[Datetime-SIG] strategy for the C part of ISO 8601 datetime parsing

Oren Tirosh orent at hishome.net
Fri Jun 2 06:21:55 EDT 2017


Hi, Mathieu and list

I was also bothered by the fact that datetime can't parse its own __str__()
output. Instead of a custom parser I took a different approach. The
strptime implementation already has some extensions beyond POSIX/GNU, so
why not add a few more?

%:z  - matches +HH:MM - this is a GNU date(1) extension

%?:z - optional version of %:z

%.f  - equivalent to .%f

%?.f - optional version of %.f

%?t  - matches either ' ' or 'T'

For symmetry, I proceeded to implement the same directives in strftime,
too. So with these changes in place, isoformat/fromisoformat are just
strftime/strptime with the format "%Y-%m-%d%?t%H:%M:%S%?.f%?:z".
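
For readers who want to try the idea without the patch, here is a minimal
sketch that emulates the behaviour of the proposed format using only the stock
strptime directives. The function name parse_extended is hypothetical and this
is not the prototype described in this message; it only approximates what
%?t, %?.f and %?:z would accept.

```python
import re
from datetime import datetime

# Trailing UTC offset written as +HH:MM or -HH:MM (what %:z would match).
_OFFSET = re.compile(r"([+-]\d{2}):(\d{2})$")


def parse_extended(text):
    """Emulate "%Y-%m-%d%?t%H:%M:%S%?.f%?:z" with standard directives.

    Hypothetical helper: the %?t, %?.f and %?:z directives do not exist
    in the stock strptime, so the format is assembled piece by piece.
    """
    # %?t: the date/time separator may be either ' ' or 'T'.
    sep = text[10:11]
    if sep not in (" ", "T"):
        raise ValueError("expected ' ' or 'T' after the date")
    fmt = "%Y-%m-%d" + sep + "%H:%M:%S"

    # %?:z: optional offset; drop the colon so plain %z accepts it.
    match = _OFFSET.search(text)
    if match:
        text = text[:match.start()] + match.group(1) + match.group(2)

    # %?.f: optional fractional seconds.
    if "." in text:
        fmt += ".%f"
    if match:
        fmt += "%z"
    return datetime.strptime(text, fmt)
```

With this helper, both str(dt) and dt.isoformat() output can be parsed back,
whether or not the value carries microseconds or a UTC offset.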

The strftime implementation was more work and is not quite finished. It is
a Python-only prototype (it requires disabling the import of the C _datetime
module). It passes the tests but still has a couple of localization issues.
The code has languished in one of my repos for almost a year...

Your message reminded me of this unfinished bit. I will now rebase and dust
it off to see if there is interest on the list.


On Wed, 31 May 2017 at 22:27, Mathieu Dupuy <deronnax at gmail.com> wrote:

> Hi datetime mates
>
> I would like to resume soon the C implementation of datetime ISO
> format parsing in CPython that I started days ago
> (http://bugs.python.org/issue15873). Currently I have 2 solutions and
> would like to know which one you prefer:
>
> * iterating over the string, stopping when something is wrong
> (might process almost all of the string and finally give up because
> the last part is wrong, e.g. incorrect microseconds or time zone.
> Penalizes invalid strings, best case when most of the strings to
> process are valid)
> * first checking that the string is correct, then iterating over it and
> handling each part (early detection of incorrect strings, useless
> overhead for valid strings. Penalizes valid strings, best case when
> most of the strings to process are invalid)
>
> I have a preference for solution #1. I first thought of using sscanf
> but it's impossible for many reasons, the first being that scanf is
> unsuitable for a variable number of matches (you can't express an
> optional match in a scanf format).
>
> Waiting for your input.
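
The two strategies in the quoted message can be sketched as follows. This is
an illustration in Python rather than C, restricted to the date part
(YYYY-MM-DD) to keep it short; all function names are hypothetical and this is
not the code attached to the issue.

```python
import re


def _digits(text, pos, count):
    """Read exactly `count` ASCII digits at `pos`; return (value, new_pos)."""
    chunk = text[pos:pos + count]
    if len(chunk) != count or any(c not in "0123456789" for c in chunk):
        raise ValueError(f"expected {count} digits at offset {pos}")
    return int(chunk), pos + count


def _expect(text, pos, char):
    """Require the literal `char` at `pos`; return the next position."""
    if text[pos:pos + 1] != char:
        raise ValueError(f"expected {char!r} at offset {pos}")
    return pos + 1


def parse_single_pass(text):
    """Solution #1: convert while scanning, fail at the first bad field."""
    year, pos = _digits(text, 0, 4)
    pos = _expect(text, pos, "-")
    month, pos = _digits(text, pos, 2)
    pos = _expect(text, pos, "-")
    day, pos = _digits(text, pos, 2)
    if pos != len(text):
        raise ValueError("trailing characters")
    return year, month, day


# Shape check used by the second strategy (ASCII digits only).
_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def parse_validate_first(text):
    """Solution #2: validate the whole string, then convert each part."""
    if _SHAPE.fullmatch(text) is None:
        raise ValueError("not in YYYY-MM-DD form")
    return int(text[0:4]), int(text[5:7]), int(text[8:10])
```

Both functions accept and reject the same inputs. The difference is only in
where the work happens: the first does no extra pass over valid strings, while
the second rejects malformed strings before any conversion.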