Various strings to dates.
Skip Montanaro
skip at pobox.com
Fri Jan 23 17:27:57 EST 2004
Amy> The following three date strings are another example of the various
Amy> date formats I will encounter here.
Amy> Thursday, 22 January 2004 03:15:06
Amy> Thursday, January 22, 2004, 03:15:06
Amy> 2004, Thursday, 22 January 03:15:06
Assuming you won't have any ambiguous dates (like 1/3/04), just define
regular expressions which label the various fields of interest, then match
your string against them until you get a hit. For
example, the first would be matched by this:
>>> import re
>>> pat = re.compile(r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
... r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4,4})')
>>> mat = pat.match('Thursday, 22 January 2004 03:15:06')
>>> mat
<_sre.SRE_Match object at 0x487498>
>>> mat.groups()
('Thursday', '22', 'January', '2004')
>>> mat.group('month')
'January'
etc. (Extending the regexp to accommodate the time is left as an exercise.)
Once you have a match, pull out the relevant bits, maybe tweak them a bit
(int()-ify things), then create a datetime instance from the result.
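As a rough sketch of the whole recipe (not the code from my dates module), here is one way it could look in current Python for the three sample strings above. The names MONTHS, TIME, DATE_FORMATS, PATTERNS and parse_date are made up for illustration, and the sketch assumes English month names and that the time is always present as HH:MM:SS:

```python
import re
from datetime import datetime

# Assumption: month names are always spelled out in English.
MONTHS = {name: num for num, name in enumerate(
    ['January', 'February', 'March', 'April', 'May', 'June', 'July',
     'August', 'September', 'October', 'November', 'December'], start=1)}

# Assumption: every string ends with a time of the form HH:MM:SS.
TIME = r'(?P<hour>[0-9]{1,2}):(?P<minute>[0-9]{2}):(?P<second>[0-9]{2})'

# One pattern per date layout, each labelling the fields of interest.
DATE_FORMATS = (
    # Thursday, 22 January 2004 03:15:06
    r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
    r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4})\s+',
    # Thursday, January 22, 2004, 03:15:06
    r'(?P<wkday>[A-Z][a-z]+),\s+(?P<month>[A-Z][a-z]+)\s+'
    r'(?P<day>[0-9]{1,2}),\s+(?P<year>[0-9]{4}),\s+',
    # 2004, Thursday, 22 January 03:15:06
    r'(?P<year>[0-9]{4}),\s+(?P<wkday>[A-Z][a-z]+),\s+'
    r'(?P<day>[0-9]{1,2})\s+(?P<month>[A-Z][a-z]+)\s+',
)
PATTERNS = [re.compile(fmt + TIME + r'$') for fmt in DATE_FORMATS]


def parse_date(text):
    """Try each pattern in turn and build a datetime from the first hit."""
    text = text.strip()
    for pat in PATTERNS:
        mat = pat.match(text)
        if mat is None:
            continue
        month = MONTHS.get(mat.group('month'))
        if month is None:
            # Looked like a month name but is not one; try the next pattern.
            continue
        # int()-ify the numeric fields. The weekday is matched but not
        # checked against the resulting date.
        return datetime(int(mat.group('year')), month,
                        int(mat.group('day')), int(mat.group('hour')),
                        int(mat.group('minute')), int(mat.group('second')))
    raise ValueError('unrecognized date format: %r' % text)
```

With these patterns, each of the three sample strings should come back as datetime(2004, 1, 22, 3, 15, 6), and anything that matches none of them (such as '1/3/04') raises ValueError rather than being guessed at.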
I do something like this in my dates module. It's old and ugly though:
http://manatee.mojam.com/~skip/python/
Search for "date-parsing module". This was written long before the datetime
module was available and was used for a slightly different purpose. It
recognizes a number of different date range formats in addition to
individual dates. You might be able to snag some regular expression ideas
from it though.
Skip