[Python-bugs-list] [ python-Bugs-732761 ] 2.3 breaks email date parsing
SourceForge.net
noreply@sourceforge.net
Wed, 07 May 2003 03:21:59 -0700
Bugs item #732761, was opened at 2003-05-05 15:56
Message generated for change (Comment added) made by ppsys
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=732761&group_id=5470
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Daniel Berlin (dberlin)
Assigned to: Nobody/Anonymous (nobody)
Summary: 2.3 breaks email date parsing
Initial Comment:
Parsing dates in emails is broken in 2.3 compared to 2.2.2.
Changing parsedate_tz back to what it was in 2.2.2
fixes it.
I'm not sure who made this change or why, but it
clearly doesn't handle cases it used to:
(oldparseaddr is the 2.3 version with the patch at the
bottom applied, which reverts it to what it was in 2.2.2)
>>> import _parseaddr
>>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
>>> import oldparseaddr
>>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
(2001, 3, 3, 2, 4, 50, 0, 0, 0, 0)
>>>
The problem is obvious from looking at the new code:
The old version would only care if it actually found
something it needed to delete. The new version assumes
there *must* be a comma in the date if there is no
dayname, and if there isn't, returns nothing.
I wanted to know if this was a mistake, or done on
purpose. If it's a mistake, I'll submit a patch to
sourceforge to fix it.
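A minimal sketch of the difference described above, covering only the
day-name handling. The function names and the _DAYNAMES list are
hypothetical stand-ins written for illustration; they are not the real
email._parseaddr code, and everything after the day-name step is omitted.

```python
# Hypothetical, simplified stand-ins for the day-name step only.
_DAYNAMES = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']


def strip_dayname_23(tokens):
    """Mimic the 2.3 logic: no day name and no comma means failure."""
    tokens = list(tokens)
    if tokens[0].endswith(',') or tokens[0].lower() in _DAYNAMES:
        # There's a dayname here. Skip it.
        del tokens[0]
    else:
        i = tokens[0].rfind(',')
        if i < 0:
            # No comma found in the first token: give up entirely.
            return None
        tokens[0] = tokens[0][i + 1:]
    return tokens


def strip_dayname_222(tokens):
    """Mimic the 2.2.2 logic: only delete a day name if one is found."""
    tokens = list(tokens)
    if tokens[0][-1] in (',', '.') or tokens[0].lower() in _DAYNAMES:
        del tokens[0]
    return tokens


date = "3 Mar 2001 02:04:50 -0000".split()
print(strip_dayname_23(date))   # None
print(strip_dayname_222(date))  # ['3', 'Mar', '2001', '02:04:50', '-0000']
```

With a leading "Sat," token both variants drop the day name and return
the same remaining fields; they differ only when the day name is absent.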
Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py 17 Mar 2003 18:35:42 -0000 1.5
+++ _parseaddr.py 2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
data = data.split()
# The FWS after the comma after the day-of-week is optional, so search and
# adjust for this.
- if data[0].endswith(',') or data[0].lower() in _daynames:
+ if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
# There's a dayname here. Skip it
del data[0]
- else:
- i = data[0].rfind(',')
- if i < 0:
- return None
- data[0] = data[0][i+1:]
if len(data) == 3: # RFC 850 date, deprecated
stuff = data[0].split('-')
if len(stuff) == 3:
----------------------------------------------------------------------
Comment By: Richard Barrett (ppsys)
Date: 2003-05-07 10:21
Message:
Logged In: YES
user_id=75166
The following patch solves the problem and is tolerant of
slightly malformed dates with a dayname which is not
properly delimited by whitespace.
diff -u -r -P email/_parseaddr.py email-tzp/_parseaddr.py
--- email/_parseaddr.py 2003-03-30 21:21:28.000000000 +0100
+++ email-tzp/_parseaddr.py 2003-05-06 21:56:37.000000000 +0100
@@ -54,9 +54,9 @@
del data[0]
else:
i = data[0].rfind(',')
- if i < 0:
- return None
- data[0] = data[0][i+1:]
+ if i >= 0:
+ # There's what may be a dayname here. Skip it
+ data[0] = data[0][i+1:]
if len(data) == 3: # RFC 850 date, deprecated
stuff = data[0].split('-')
if len(stuff) == 3:
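
As a rough illustration of what this patch changes, the sketch below
reproduces only the patched day-name step. The name
strip_dayname_patched and the _DAYNAMES list are hypothetical, chosen
for this example; the rest of parsedate_tz is left out.

```python
# Hypothetical stand-in for the patched day-name step only.
_DAYNAMES = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']


def strip_dayname_patched(tokens):
    """Skip a day name if present; otherwise leave the tokens alone."""
    tokens = list(tokens)
    if tokens[0].endswith(',') or tokens[0].lower() in _DAYNAMES:
        # There's a dayname here. Skip it.
        del tokens[0]
    else:
        i = tokens[0].rfind(',')
        if i >= 0:
            # There's what may be a dayname here. Skip it.
            tokens[0] = tokens[0][i + 1:]
    return tokens


# No day name at all: the tokens pass through unchanged.
print(strip_dayname_patched("3 Mar 2001 02:04:50 -0000".split()))
# ['3', 'Mar', '2001', '02:04:50', '-0000']

# Day name glued to the day number, with no whitespace after the comma.
print(strip_dayname_patched("Sat,3 Mar 2001 02:04:50 -0000".split()))
# ['3', 'Mar', '2001', '02:04:50', '-0000']
```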
----------------------------------------------------------------------