[Python-bugs-list] [ python-Bugs-732761 ] 2.3 breaks email date parsing

SourceForge.net noreply@sourceforge.net
Wed, 07 May 2003 03:21:59 -0700


Bugs item #732761, was opened at 2003-05-05 15:56
Message generated for change (Comment added) made by ppsys
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=732761&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Daniel Berlin (dberlin)
Assigned to: Nobody/Anonymous (nobody)
Summary: 2.3 breaks email date parsing

Initial Comment:
Parsing dates in emails is broken in 2.3 compared to 2.2.2.
Changing parsedate_tz back to what it was in 2.2.2
fixes it.
I'm not sure who made this change or why, but it
clearly doesn't handle cases it used to:
(oldparseaddr is the 2.3 version with the patch at the
bottom applied, which reverts it to what it was in 2.2.2)

>>> import _parseaddr
>>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
>>> import oldparseaddr
>>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
(2001, 3, 3, 2, 4, 50, 0, 0, 0, 0)
>>>

The problem is obvious from looking at the new code:
The old version would only care if it actually found
something it needed to delete. The new version assumes
there *must* be a comma in the date if there is no
dayname, and if there isn't, returns nothing.
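
To make the difference concrete, here is a cut-down sketch of just the
dayname-skipping step. It is not the real _parseaddr code: the names
_DAYNAMES, strip_dayname_23 and strip_dayname_222 are made up for
illustration, and the two bodies are only my reading of the diff below.

```python
# Illustrative only: hypothetical names, not the actual email._parseaddr module.
_DAYNAMES = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']

def strip_dayname_23(tokens):
    """2.3-style step: a first token with no dayname and no comma is rejected."""
    tokens = list(tokens)
    if tokens[0].endswith(',') or tokens[0].lower() in _DAYNAMES:
        # There's a dayname here. Skip it.
        del tokens[0]
    else:
        i = tokens[0].rfind(',')
        if i < 0:
            # No comma found: the whole date is treated as unparseable.
            return None
        tokens[0] = tokens[0][i+1:]
    return tokens

def strip_dayname_222(tokens):
    """2.2.2-style step: only delete a dayname if one is actually found."""
    tokens = list(tokens)
    if tokens[0][-1] in (',', '.') or tokens[0].lower() in _DAYNAMES:
        del tokens[0]
    return tokens

date = "3 Mar 2001 02:04:50 -0000".split()
print(strip_dayname_23(date))   # the 2.3-style step gives up here
print(strip_dayname_222(date))  # the 2.2.2-style step leaves the tokens alone
```

With a date that has no dayname at all, the first function returns None
before any real parsing happens, which matches the session shown above.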

I wanted to know if this was a mistake, or done on
purpose.  If it's a mistake, I'll submit a patch to
sourceforge to fix it.

Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py       17 Mar 2003 18:35:42 -0000      1.5
+++ _parseaddr.py       2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
     data = data.split()
     # The FWS after the comma after the day-of-week is optional, so search and
     # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
         # There's a dayname here. Skip it
         del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
     if len(data) == 3: # RFC 850 date, deprecated
         stuff = data[0].split('-')
         if len(stuff) == 3:



----------------------------------------------------------------------

Comment By: Richard Barrett (ppsys)
Date: 2003-05-07 10:21

Message:
Logged In: YES 
user_id=75166

The following patch solves the problem and is tolerant of 
slightly malformed dates with a dayname which is not 
properly delimited by white space.

diff -u -r -P email/_parseaddr.py email-tzp/_parseaddr.py
--- email/_parseaddr.py	2003-03-30 21:21:28.000000000 +0100
+++ email-tzp/_parseaddr.py	2003-05-06 21:56:37.000000000 +0100
@@ -54,9 +54,9 @@
         del data[0]
     else:
         i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
+        if i >= 0:
+           # There's what may be a dayname here. Skip it
+           data[0] = data[0][i+1:]
     if len(data) == 3: # RFC 850 date, deprecated
         stuff = data[0].split('-')
         if len(stuff) == 3:
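
As a rough check of what the patch above is meant to do, the
dayname-skipping step can be sketched on its own. The names _DAYNAMES
and strip_dayname_patched are hypothetical and the function only mirrors
the patched branch; it is not the full parsedate_tz.

```python
# Illustrative only: hypothetical names, not the actual email._parseaddr module.
_DAYNAMES = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']

def strip_dayname_patched(tokens):
    """Dayname-skipping step with the 'if i >= 0' change applied."""
    tokens = list(tokens)
    if tokens[0].endswith(',') or tokens[0].lower() in _DAYNAMES:
        # There's a dayname here. Skip it.
        del tokens[0]
    else:
        i = tokens[0].rfind(',')
        if i >= 0:
            # There's what may be a dayname glued to the day number. Skip it.
            tokens[0] = tokens[0][i+1:]
        # No comma: assume there is no dayname and carry on.
    return tokens

for text in ("3 Mar 2001 02:04:50 -0000",      # no dayname
             "Sat, 3 Mar 2001 02:04:50 -0000",  # well-formed dayname
             "Sat,3 Mar 2001 02:04:50 -0000"):  # dayname not delimited by white space
    print(strip_dayname_patched(text.split()))
```

All three inputs should come out as the same five tokens, so the rest of
the parser sees the date starting at the day number in each case.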


----------------------------------------------------------------------
