Various strings to dates.

Skip Montanaro skip at pobox.com
Fri Jan 23 17:27:57 EST 2004


    Amy> The following three date strings are another example of the various
    Amy> date formats I will encounter here.

    Amy> Thursday, 22 January 2004 03:15:06
    Amy> Thursday, January 22, 2004, 03:15:06
    Amy> 2004, Thursday, 22 January 03:15:06

Assuming you won't have any ambiguous dates (like 1/3/04), just define
regular expressions which label the various fields of interest, then match
your string against each of them in turn until you get a hit.  For example,
the first would be matched by this:

    >>> import re
    >>> pat = re.compile(r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
    ...      r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4})')
    >>> mat = pat.match('Thursday, 22 January 2004 03:15:06')
    >>> mat
    <_sre.SRE_Match object at 0x487498>
    >>> mat.groups()
    ('Thursday', '22', 'January', '2004')
    >>> mat.group('month')
    'January'

etc.  (Extending the regexp to accommodate the time is left as an exercise.)
Once you have a match, pull out the relevant bits, maybe tweak them a bit
(int()-ify things), then create a datetime instance from the result.
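
To make that concrete, here is a rough sketch of the whole approach, extended
to cover the time and all three of the formats you listed.  Treat it as an
illustration rather than finished code: the names MONTHS, TIME, PATTERNS and
parse_date are made up for this example, it assumes English month names, and
it ignores the weekday instead of checking it against the date.

```python
import re
from datetime import datetime

# Assumption: month names are spelled out in English.
MONTHS = {name: num for num, name in enumerate(
    ['January', 'February', 'March', 'April', 'May', 'June', 'July',
     'August', 'September', 'October', 'November', 'December'], start=1)}

# Shared fragment for the HH:MM:SS part of each format.
TIME = r'(?P<hour>[0-9]{1,2}):(?P<minute>[0-9]{2}):(?P<second>[0-9]{2})'

PATTERNS = [
    # Thursday, 22 January 2004 03:15:06
    re.compile(r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
               r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4})\s+' + TIME + r'$'),
    # Thursday, January 22, 2004, 03:15:06
    re.compile(r'(?P<wkday>[A-Z][a-z]+),\s+(?P<month>[A-Z][a-z]+)\s+'
               r'(?P<day>[0-9]{1,2}),\s+(?P<year>[0-9]{4}),\s+' + TIME + r'$'),
    # 2004, Thursday, 22 January 03:15:06
    re.compile(r'(?P<year>[0-9]{4}),\s+(?P<wkday>[A-Z][a-z]+),\s+'
               r'(?P<day>[0-9]{1,2})\s+(?P<month>[A-Z][a-z]+)\s+' + TIME + r'$'),
]

def parse_date(text):
    """Try each pattern in turn; build a datetime from the first hit."""
    text = text.strip()
    for pat in PATTERNS:
        mat = pat.match(text)
        if mat is None:
            continue
        month = MONTHS.get(mat.group('month'))
        if month is None:
            # Looked like a month name but isn't one we know; keep trying.
            continue
        # int()-ify the captured strings; the weekday group is not validated.
        return datetime(int(mat.group('year')), month, int(mat.group('day')),
                        int(mat.group('hour')), int(mat.group('minute')),
                        int(mat.group('second')))
    raise ValueError('unrecognized date format: %r' % text)
```

Calling parse_date() on any of your three strings should give back the same
datetime instance, and anything that matches none of the patterns raises
ValueError so you notice new formats as they turn up.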

I do something like this in my dates module.  It's old and ugly though:

    http://manatee.mojam.com/~skip/python/

Search for "date-parsing module".  This was written long before the datetime
module was available and was used for a slightly different purpose.  It
recognizes a number of different date range formats in addition to
individual dates.  You might be able to snag some regular expression ideas
from it though.

Skip



