PEP 321: Date/Time Parsing and Formatting

John Roth newsgroups at jhrothjr.com
Tue Nov 18 23:36:36 CET 2003


"Paul Moore" <pf_moore at yahoo.co.uk> wrote in message
news:ekw5gxzo.fsf at yahoo.co.uk...
> "John Roth" <newsgroups at jhrothjr.com> writes:
>
> > The trouble with "guess the format" is that it's not possible
> > to do it correctly in the general case from one sample.
> > Given enough samples of one consistent format, it's
> > certainly possible. However, that's a two pass process.
>
> I think you can do it with a hint or two. The key one is whether in
> ambiguous cases, you choose DD/MM or MM/DD. You need a second hint
> with 2-digit years, as 01-02-03 is *very* ambiguous (given that
> putting the year in the middle is insane, you only need a flag saying
> whether the year is at the start or the end).
>
> I'm not sure what other ambiguities you'd need to cater for?

Those are basically it. I've played around with doing "intelligent"
parsing, and I'm absolutely against providing hints. If you're
processing a file with dates all in one format, a parser that
relies on hints will frequently give the wrong answer for a
substantial number of them.
In other words, hints don't give your program the capacity
to learn from experience. Scanning a number of cases and
noting which fields contained numbers > 31 (or 0), or numbers
greater than 12, does.
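
A rough sketch of that scanning pass, in Python. The function
name and the choice to split on any non-digit are my own
assumptions, not anything from the PEP. It starts with all six
possible field orders and throws out every order that some
sample contradicts: a value outside 1..31 can't be a day, and a
value outside 1..12 can't be a month. Years are left
unconstrained so that two-digit years (including 00) still fit.

```python
import re
from itertools import permutations


def infer_field_order(samples):
    """Return the set of (D, M, Y) orderings consistent with every sample.

    Hypothetical helper: each sample is expected to hold three
    numeric fields separated by any non-digit characters.
    """
    candidates = set(permutations("DMY"))
    for text in samples:
        fields = [int(f) for f in re.split(r"\D+", text.strip()) if f]
        if len(fields) != 3:
            continue  # not a three-field date; tells us nothing
        candidates = {
            order for order in candidates if _fits(order, fields)
        }
    return candidates


def _fits(order, fields):
    """True if every field is legal for the role this order gives it."""
    for role, value in zip(order, fields):
        if role == "D" and not 1 <= value <= 31:
            return False
        if role == "M" and not 1 <= value <= 12:
            return False
        # role == "Y": any value is accepted, including 0
    return True


if __name__ == "__main__":
    print(infer_field_order(["01-02-03"]))
    print(infer_field_order(["01-02-03", "25-12-03", "13-01-00"]))
```

One sample such as 01-02-03 leaves all six orders standing;
adding 25-12-03 and 13-01-00 narrows the set to day-month-year
alone. If more than one order survives the whole file, the data
really is ambiguous and the caller has to decide what to do.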

In any case, there are three formats (year-month-day,
month-day-year and day-month-year), and you can't
always depend on a delimiter to tell you where the year
is for 8-digit inputs. Lots of the inputs I've seen have not
had delimiters. On the other hand, a lot of them have
been guaranteed to be in the late 19th century or later.
That's a hint worth having.
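
For the 8-digit case, something along these lines would use
that hint. This is only a sketch: the layout names, the function
name, and the 1850 and 2100 year bounds are assumptions I picked
for illustration. Each of the three layouts is tried in turn,
and a layout is kept only if its year falls in the allowed range
and datetime.date accepts the resulting month and day.

```python
import re
from datetime import date

# Slices for (year, month, day) within an 8-digit string.
LAYOUTS = {
    "YYYYMMDD": (slice(0, 4), slice(4, 6), slice(6, 8)),
    "MMDDYYYY": (slice(4, 8), slice(0, 2), slice(2, 4)),
    "DDMMYYYY": (slice(4, 8), slice(2, 4), slice(0, 2)),
}


def plausible_layouts(digits, min_year=1850, max_year=2100):
    """Map each layout name that yields a valid date to that date.

    min_year and max_year are assumed bounds standing in for
    "late 19th century or later"; adjust them for your data.
    """
    result = {}
    if not re.fullmatch(r"[0-9]{8}", digits):
        return result
    for name, (y, m, d) in LAYOUTS.items():
        year, month, day = int(digits[y]), int(digits[m]), int(digits[d])
        if not min_year <= year <= max_year:
            continue  # the year hint rules this layout out
        try:
            result[name] = date(year, month, day)
        except ValueError:
            continue  # impossible month or day for this layout
    return result


if __name__ == "__main__":
    for sample in ("20031118", "18112003", "01022003"):
        print(sample, sorted(plausible_layouts(sample)))
```

With those bounds, 20031118 can only be YYYYMMDD and 18112003
can only be DDMMYYYY, while 01022003 still fits both MMDDYYYY
and DDMMYYYY. The year hint settles where the year is; it does
nothing for the day/month question.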

John Roth
>
> Paul.
> -- 
> This signature intentionally left blank





