[Python-Dev] 2.3 broke email date parsing
Daniel Berlin
dberlin@dberlin.org
Fri, 2 May 2003 11:58:03 -0400
Parsing dates in emails is broken in 2.3 compared to 2.2.2.
Changing parsedate_tz back to what it was in 2.2.2 fixes it.
I'm not sure who made this change or why, but it clearly doesn't
handle cases it used to:
(oldparseaddr is the 2.3 version with the patch at the bottom applied,
which reverts it to what it was in 2.2.2)
>>> import _parseaddr
>>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
>>> import oldparseaddr
>>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
(2001, 3, 3, 2, 4, 50, 0, 0, 0, 0)
>>>
The problem is obvious from looking at the new code:
The old version only touched the first field if it actually found
something it needed to delete. The new version assumes there *must* be a
comma in the date if there is no dayname, and if there isn't, it returns
None.
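To make the difference concrete, here is a minimal sketch that isolates just the dayname handling from the two versions. The helper names (strip_dayname_222, strip_dayname_23) and the _DAYNAMES list are made up for illustration and are not part of _parseaddr; only the branch logic is copied from the diff below.

```python
# Hypothetical helpers that isolate the dayname handling discussed above.
# _DAYNAMES is an assumed stand-in for the module's _daynames list.
_DAYNAMES = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']


def strip_dayname_222(fields):
    """2.2.2-style: drop the first field only if it looks like a dayname."""
    fields = list(fields)
    if fields[0][-1] in (',', '.') or fields[0].lower() in _DAYNAMES:
        del fields[0]
    return fields


def strip_dayname_23(fields):
    """2.3-style: with no dayname, insist on a comma in the first field."""
    fields = list(fields)
    if fields[0].endswith(',') or fields[0].lower() in _DAYNAMES:
        del fields[0]
    else:
        i = fields[0].rfind(',')
        if i < 0:
            return None  # no comma anywhere: the whole date is rejected
        fields[0] = fields[0][i + 1:]
    return fields


fields = "3 Mar 2001 02:04:50 -0000".split()
print(strip_dayname_222(fields))  # first field "3" is kept as the day
print(strip_dayname_23(fields))   # first field "3" has no comma, so None
```

The else branch looks like it was meant for input such as "Sat,3 Mar 2001 ...", where the dayname and the day are glued together with no whitespace after the comma. A date with no dayname at all falls into the same branch and gets rejected.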
I wanted to know if this was a mistake, or done on purpose. If it's a
mistake, I'll submit a patch to SourceForge to fix it.
Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py 17 Mar 2003 18:35:42 -0000 1.5
+++ _parseaddr.py 2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
     data = data.split()
     # The FWS after the comma after the day-of-week is optional, so search and
     # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
         # There's a dayname here. Skip it
         del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
     if len(data) == 3: # RFC 850 date, deprecated
         stuff = data[0].split('-')
         if len(stuff) == 3: