[ python-Bugs-1588217 ] quoted printable parse the sequence '= ' incorrectly
SourceForge.net
noreply at sourceforge.net
Tue Oct 31 22:18:05 CET 2006
Bugs item #1588217, was opened at 2006-10-31 13:06
Message generated for change (Comment added) made by tungwaiyip
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1588217&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Wai Yip Tung (tungwaiyip)
Assigned to: Nobody/Anonymous (nobody)
Summary: quoted printable parse the sequence '= ' incorrectly
Initial Comment:
>>> import quopri
>>> s = 'I say= a secret message\r\nThank you'
>>> quopri.a2b_qp
<built-in function a2b_qp>
>>> quopri.decodestring(s) # use the C version, binascii.a2b_qp(), to decode
'I sayThank you'
>>> quopri.a2b_qp=None
>>> quopri.decodestring(s) # use the Python version, quopri.decode(), to decode
'I say= a secret message\nThank you'
Note that the sequence '= ' is invalid according to
RFC 2045 section 6.7:
-------------------------------------------------------
An "=" followed by a character that is neither a
hexadecimal digit (including "abcdef") nor the CR
character of a CRLF pair is illegal ... A reasonable
approach by a robust implementation might be to
include the "=" character and the following character
in the decoded data without any transformation
-------------------------------------------------------
The pure-Python parser, quopri.decode(), uses this
lenient interpretation to produce the second string.
Most email clients use a similar lenient
interpretation.
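
As a rough illustration of the lenient approach the RFC suggests, the
sketch below decodes quoted-printable data and passes an illegal "="
through untouched. It is written for Python 3 bytes rather than the
Python 2.4 strings used above. The name lenient_qp_decode is
hypothetical; this is neither the quopri nor the binascii code. Unlike
the output shown above, it keeps CRLF line endings as they are.

```python
import string

# Byte values of the ASCII hexadecimal digits (both cases).
HEX_DIGITS = frozenset(string.hexdigits.encode("ascii"))


def lenient_qp_decode(data: bytes) -> bytes:
    """Decode quoted-printable data, keeping illegal '=' sequences as is.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte != ord("="):
            # Ordinary byte: copy it through.
            out.append(byte)
            i += 1
        elif data[i + 1:i + 3] == b"\r\n":
            # Soft line break: drop '=' and the CRLF.
            i += 3
        elif data[i + 1:i + 2] == b"\n":
            # Soft line break with a bare LF (assumed to be tolerated).
            i += 2
        elif i + 2 < n and data[i + 1] in HEX_DIGITS and data[i + 2] in HEX_DIGITS:
            # '=XX' escape: emit the byte it encodes.
            out.append(int(data[i + 1:i + 3], 16))
            i += 3
        else:
            # Illegal sequence such as '= ': keep the '=' literally and
            # let the following byte be copied on the next iteration.
            out.append(byte)
            i += 1
    return bytes(out)
```

With the string from this report,
lenient_qp_decode(b'I say= a secret message\r\nThank you') returns its
input unchanged, so the text after '= ' stays visible.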
The C parser, binascii.a2b_qp(), which is used in
preference to the Python version, produces a
surprising result: the string 'a secret message' is
omitted.
This may create an opportunity for spammers to insert a
secret message after '= ' so that it is not visible to a
Python-based spam filter but would display in an email
client that is not Python-based.
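
One possible precaution, sketched here as an assumption and not as
part of any existing filter, is to scan the raw body for "=" sequences
that are neither an "=XX" escape nor a soft line break before handing
it to a decoder. The name find_illegal_escapes is hypothetical, and the
rule follows the reading in this report that "= " is illegal.

```python
import re

# An '=' is accepted only when followed by two hex digits, by a line
# ending (CRLF or bare LF), or when it is the last byte of the data.
_ILLEGAL_EQUALS = re.compile(rb"=(?![0-9A-Fa-f]{2}|\r?\n|\Z)")


def find_illegal_escapes(data: bytes) -> list:
    """Return the offsets of every '=' that starts an illegal sequence.

    Hypothetical helper for illustration only.
    """
    return [match.start() for match in _ILLEGAL_EQUALS.finditer(data)]
```

For the string in this report the function reports one offset, the
position of the "=" in 'I say= a secret message'. A caller could treat
a non-empty result as a reason to decode leniently or to flag the
message.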
----------------------------------------------------------------------
>Comment By: Wai Yip Tung (tungwaiyip)
Date: 2006-10-31 13:18
Message:
Logged In: YES
user_id=561546
The problem may come from binascii_a2b_qp() in binascii.c. It
treats the '= ' or '=\t' sequence as a soft line break. Such an
interpretation appears to have no basis. It could be a
misinterpretation of RFC 2045:
-------------------------------------------------------------------
In particular, an "=" at the end of an encoded line, indicating a
soft line break (see rule #5) may follow one or more TAB (HT) or
SPACE characters.
-------------------------------------------------------------------
This passage reminds readers they might find TAB or SPACE before
an "=", but not after it. "= " is plain illegal as far as I know.
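
The reading above can be sketched as a small check: a line ends in a
soft line break only when "=" is its last character before the line
ending, whatever TAB or SPACE characters come before it. The name
ends_with_soft_break is hypothetical, and the check reflects the
interpretation given in this comment, not the behaviour of binascii.c.

```python
def ends_with_soft_break(line: bytes) -> bool:
    """Return True if '=' is the last byte before the line ending.

    Hypothetical helper for illustration only. Whitespace before the
    '=' is allowed; anything after it disqualifies the line.
    """
    body = line.rstrip(b"\r\n")  # remove only the line ending itself
    return body.endswith(b"=")
```

Under this check b'some text \t=\r\n' ends in a soft line break, while
b'I say= a secret message\r\n' does not.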
----------------------------------------------------------------------