[Python-Dev] 2.3 broke email date parsing

Daniel Berlin dberlin@dberlin.org
Fri, 2 May 2003 11:58:03 -0400


Parsing dates in emails is broken in 2.3 compared to 2.2.2.
Changing parsedate_tz back to what it was in 2.2.2 fixes it.
I'm not sure who made this change or why, but it clearly doesn't 
handle cases it used to:
(oldparseaddr is the 2.3 version with the patch at the bottom applied, 
which reverts it to what it was in 2.2.2)

 >>> import _parseaddr
 >>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
 >>> import oldparseaddr
 >>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
(2001, 3, 3, 2, 4, 50, 0, 0, 0, 0)
 >>>

The problem is obvious from looking at the new code:
The old version only touched the first token if it actually found 
something it needed to delete. The new version assumes there *must* be a 
comma in the first token if it is not a dayname, and returns None if 
there isn't one.
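
To make the difference concrete, here is a minimal sketch that isolates only the first-token handling shown in the diff below. The function names strip_dayname_23 and strip_dayname_222 and the _daynames list are hypothetical stand-ins for illustration; the real parsedate_tz goes on to do much more with the remaining tokens.

```python
# Simplified stand-ins for the two versions of the dayname handling.
# Only the first-token logic from the diff is modelled here; the names
# below are assumptions for illustration, not part of the email package.
_daynames = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']


def strip_dayname_23(date):
    """First-token handling as in revision 1.5 (the 2.3 behaviour)."""
    data = date.split()
    if data[0].endswith(',') or data[0].lower() in _daynames:
        # There's a dayname here. Skip it.
        del data[0]
    else:
        # Assumes a dayname glued to the day, e.g. "Sat,3".
        i = data[0].rfind(',')
        if i < 0:
            return None  # no comma at all: the whole date is rejected
        data[0] = data[0][i + 1:]
    return data


def strip_dayname_222(date):
    """First-token handling as in 2.2.2 (what the patch restores)."""
    data = date.split()
    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
        # There's a dayname here. Skip it.
        del data[0]
    return data


no_dayname = "3 Mar 2001 02:04:50 -0000"
print(strip_dayname_23(no_dayname))    # None
print(strip_dayname_222(no_dayname))   # ['3', 'Mar', '2001', '02:04:50', '-0000']
```

Judging by the comment in the code about the optional FWS, the else branch seems aimed at input like "Sat,3 Mar 2001 ...", where the dayname and the day end up in one token. The sketch handles that case in the 2.3 variant, but at the cost of rejecting any date with no dayname at all.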

I wanted to know if this was a mistake, or done on purpose.  If it's a 
mistake, I'll submit a patch to SourceForge to fix it.

Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py       17 Mar 2003 18:35:42 -0000      1.5
+++ _parseaddr.py       2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
      data = data.split()
      # The FWS after the comma after the day-of-week is optional, so search and
      # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
          # There's a dayname here. Skip it
          del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
      if len(data) == 3: # RFC 850 date, deprecated
          stuff = data[0].split('-')
          if len(stuff) == 3: