[ python-Bugs-531205 ] Bugs in rfc822.parseaddr()
SourceForge.net
noreply at sourceforge.net
Thu Jul 22 20:30:55 CEST 2004
Bugs item #531205, was opened at 2002-03-18 06:13
Message generated for change (Comment added) made by jlgijsbers
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=531205&group_id=5470
Category: Python Library
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Barry A. Warsaw (bwarsaw)
Assigned to: Ben Gertzfield (che_fox)
Summary: Bugs in rfc822.parseaddr()
Initial Comment:
This bug is in rfc822.parseaddr(), and thus inherited
into email.Utils.parseaddr() since the latter does a
straight include of the former. It has a nasty bug
when the email address contains embedded spaces: the
spaces are collapsed.
>>> from email.Utils import parseaddr
>>> parseaddr('foo bar@wooz.org')
('', 'foobar@wooz.org')
>>> parseaddr('<foo bar@wooz.org>')
('', 'foobar@wooz.org')
Boo, hiss. Of course parseaddr() would be more
involved to implement in an RFC 2822 compliant way, but
it would be very cool.
Note that I'm reporting this bug here instead of the
mimelib project because it's actually in rfc822.py.
One solution might include fixing it in the email
package only.
----------------------------------------------------------------------
Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-07-22 20:30
Message:
Logged In: YES
user_id=469548
Well, the docs say "unless the parse fails, in which case a
2-tuple of ('', '') is returned". I think it's reasonable to
say that non-compliant addresses like this should fail to
parse and thus that parseaddr('foo bar@wooz.org') should
return ('', '').
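
The "fail to parse" behaviour proposed here can be sketched as a thin wrapper around the existing function. This is only an illustration, not the rfc822 or email implementation: has_unquoted_space() and parseaddr_or_fail() are hypothetical names, the module is imported under its modern spelling email.utils, and the check is deliberately naive (it ignores RFC 822 comments in parentheses, for example).

```python
from email.utils import parseaddr  # spelled email.Utils in the Python 2.2 era


def has_unquoted_space(addr_spec):
    """True if whitespace appears outside double quotes before the last '@'."""
    local, sep, _domain = addr_spec.rpartition('@')
    if not sep:
        return False  # no '@' at all; leave the verdict to the real parser
    in_quotes = False
    escaped = False
    for ch in local:
        if escaped:
            escaped = False  # the character after a backslash is literal
        elif ch == '\\':
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch.isspace() and not in_quotes:
            return True
    return False


def parseaddr_or_fail(text):
    """Hypothetical wrapper: report ('', '') for space-containing local parts."""
    start, end = text.find('<'), text.rfind('>')
    # Inspect only the <...> part when present, so display names are ignored.
    candidate = text[start + 1:end] if 0 <= start < end else text
    if has_unquoted_space(candidate.strip()):
        return ('', '')
    return parseaddr(text)
```

With this wrapper both 'foo bar@wooz.org' and '<foo bar@wooz.org>' come back as ('', ''), while the quoted form "foo bar"@wooz.org is passed through to parseaddr() unchanged.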
----------------------------------------------------------------------
Comment By: Tim Roberts (timroberts)
Date: 2002-08-12 23:40
Message:
Logged In: YES
user_id=265762
Interesting to note that RFC 822 (but not 2822) allows spaces
around any periods in the address without quoting (2822 does
allow spaces around the @), and those spaces are to be
removed. Section A.1.4 gives the example
Wilt . Chamberlain@NBA.US
and says it should be parsed as "Wilt.Chamberlain".
Given that, it's hard for me to see that the current behavior
should be changed at all, since there is no correct way to
parse this non-compliant address.
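
To make the distinction concrete, here is a rough sketch of the RFC 822 rule described above: whitespace around a period in the local part is dropped, whereas whitespace between two words has no period to attach to and so has no defined reading. collapse_dot_spacing() is a hypothetical helper, not something rfc822.py provides, and it assumes a local part without quoted strings or comments.

```python
import re


def collapse_dot_spacing(local_part):
    """Remove whitespace on either side of each period (sketch only)."""
    return re.sub(r'\s*\.\s*', '.', local_part)


print(collapse_dot_spacing('Wilt . Chamberlain'))  # Wilt.Chamberlain
print(collapse_dot_spacing('foo bar'))             # foo bar (left untouched)
```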
----------------------------------------------------------------------
Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-04-15 19:18
Message:
Logged In: YES
user_id=12800
Note further that "foo bar"@wooz.org is properly parsed.
The question is, what should parseaddr() do in this
non-compliant situation? I can think of a couple of things:
- it could raise an exception
- it could return ('', 'bar@wooz.org')
- it could return ('foo', 'bar@wooz.org')
- it could return ('', '"foo bar"@wooz.org')
I'm not sure what the right thing to do is. I'm assigning
to Ben Gertzfield to get his opinion. Ben, feel free to add
a comment and re-assign the bug to me.
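
For comparison, the options listed above can be written out side by side. This is a sketch for discussion only, with the address written using a literal @; candidate_results() and reject() are hypothetical names and nothing here reflects what rfc822.parseaddr() actually does.

```python
def reject(addr):
    """First option: refuse the address outright."""
    raise ValueError('non-compliant address: %r' % (addr,))


def candidate_results(addr):
    """Return the three non-raising options for an address such as
    'foo bar@wooz.org', keyed by a descriptive label."""
    local, sep, domain = addr.strip('<> ').rpartition('@')
    words = local.split()
    if not sep or not words:
        raise ValueError('no local part to split: %r' % (addr,))
    last = words[-1] + '@' + domain
    return {
        # keep only the word next to the '@'
        'drop_leading_words': ('', last),
        # treat the earlier words as the real name
        'leading_words_as_name': (' '.join(words[:-1]), last),
        # keep everything, but quote the local part
        'quote_local_part': ('', '"%s"@%s' % (' '.join(words), domain)),
    }


for label, result in sorted(candidate_results('foo bar@wooz.org').items()):
    print(label, result)
```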
----------------------------------------------------------------------