[Email-SIG] Problem Report for email.Utils.decode_rfc2231

Georg Graf georg.graf at wu-wien.ac.at
Fri Jul 7 10:52:40 CEST 2006


Hi Email Gurus!

We have been running a Python milter for some 2 years now, and this is
the first time I've run into problems with the Python email
parser. And our mail volume is not small ;) So great work, but
see my problematic message below:

There are 2 assumptions in email.Utils.decode_rfc2231 I do not
understand.

Assumption 1: The string passed either has zero single-quotes or
more than 1.

Assumption 2: If the string has two or more single-quotes the
meaning of the parts is different. I don't know RFC 2231, so who am
I to say. But it still seems odd to me.

The fact is that in this mail (generated by a recent Thunderbird
version) there is only one single quote in the filename, and the
function fails, see below.

My fix would be to write "if len(parts) != 3", but I'm interested
in what you say (it fixes this specific problem, I'd say).

regards and thanks,

  George

OK, so many words for such a small problem; here is the data:

------- message -------
From nobody Thu Jul  6 14:29:50 2006
Content-Type: application/pdf;
	name*0="LZ zu AB 481284, getronics 4500247115 + 4500219041, WU,
	SSU's.pd"; name*1="f"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
	filename*0="LZ zu AB 481284, getronics 4500247115 + 4500219041, WU,
	SSU'"; filename*1="s.pdf"

JVBERi0xLjQNCiX/////DQoxIDAgb2JqDTw8DS9UeXBlIC9DYXRhbG9nDS9QYWdlcyAzNiAw
IFINPj4NZW5kb2JqDTIgMCBvYmoNPDwNL1R5cGUgL1BhZ2UNL1BhcmVudCAzNiAwIFINL01l
ZGlhQm94IFswIDAgNTk1IDg0MV0NL1Jlc291cmNlcyA8PA0vUHJvY1NldCBbL1BERiAvVGV4
dCAvSW1hZ2VCIC9JbWFnZUMgL0ltYWdlSV0NL0NvbG9yU3BhY2UgPDwgL0NTMSA1IDAgUiAv
Q1MyIDYgMCBSID4+DS9Gb250IDw8IC9GMTcgNyAwIFIgL0YxOCAxMyAwIFIgL0YxOSAxOSAw
IFIgL0YyMCAyNSAwIFIgPj4NL1hPYmplY3QgPDwgL0ltOSAzMSAwIFIgL0ltMTMgMzMgMCBS
ID4+DT4+DS9Db250ZW50cyBbMyAwIFJdDT4+DWVuZG9iag0zIDAgb2JqDTw8IC9MZW5ndGgg
NCAwIFIgL0ZpbHRlciAvRmxhdGVEZWNvZGUgPj4Nc3RyZWFtDQp4XtVaWW8cNxJ+N6D/QM1M
a0YtdYts9i3bSmLLsZN4fURx1sm85QKCVRabF//9rSpW8Wj1jEaLYIHAgDycKX6s42Oxit1G
------- end message -------
                                                                        
------- traceback -------
# save mail above as rfc2231-crash.txt
>>> x = email.message_from_file(file ("rfc2231-crash.txt"))
>>> x.get_filename()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/email/Message.py", line 707, in get_filename
    `filename' parameter, and it is unquoted.  If that header is missing
  File "/usr/local/lib/python2.4/email/Message.py", line 590, in get_param
    """
  File "/usr/local/lib/python2.4/email/Message.py", line 537, in _get_params_preserve
    name = p.strip()
  File "/usr/local/lib/python2.4/email/Utils.py", line 275, in decode_params
    charset, language, value = decode_rfc2231(EMPTYSTRING.join(value))
  File "/usr/local/lib/python2.4/email/Utils.py", line 222, in decode_rfc2231
    charset, language, s = parts
ValueError: need more than 2 values to unpack
------- end traceback -------
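
A shortened reproduction, as a sketch only: it uses a made-up short
filename split across filename*0 / filename*1 with a single quote in
it, and the Python 3 spelling email.message_from_string instead of
message_from_file. On the Python 2.4 from the report I would expect
the ValueError branch; what a newer interpreter does is an assumption
you can check by running it.

```python
import email

# Hypothetical minimal message: one single quote inside a continued,
# non-encoded filename parameter (same shape as the reported header).
raw = (
    "Content-Type: application/pdf\n"
    'Content-Disposition: inline; filename*0="SSU\'"; filename*1="s.pdf"\n'
    "\n"
    "body\n"
)

msg = email.message_from_string(raw)
try:
    filename = msg.get_filename()
except ValueError as exc:
    # This is the failure mode shown in the traceback above.
    filename = None
    print("get_filename() failed:", exc)
else:
    print("get_filename() returned:", repr(filename))
```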

------- debug session -------
> /usr/local/lib/python2.4/email/Message.py(537)_get_params_preserve()
-> params = Utils.decode_params(params)
(Pdb) params
[('inline', ''), ('filename*0', '"LZ zu AB 481284, getronics 4500247115
+ 4500219041, WU, SSU\'"'), ('filename*1', '"s.pdf"')]

[...]

> /usr/local/lib/python2.4/email/Utils.py(275)decode_params()
-> charset, language, value = decode_rfc2231(EMPTYSTRING.join(value))
(Pdb) value
["LZ zu AB 481284, getronics 4500247115 + 4500219041, WU, SSU'",
's.pdf']
(Pdb) EMPTYSTRING.join(value)
"LZ zu AB 481284, getronics 4500247115 + 4500219041, WU, SSU's.pdf"

[...]

> /usr/local/lib/python2.4/email/Utils.py(222)decode_rfc2231()
-> charset, language, s = parts
(Pdb) parts
['LZ zu AB 481284, getronics 4500247115 + 4500219041, WU, SSU', 's.pdf']
(Pdb) s
ValueError: 'need more than 2 values to unpack'
------- end debug session -------
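
The same effect can be seen without the email package at all. A small
sketch of what s.split("'", 2) yields for zero, one, two and three
single quotes; apart from the "SSU's.pdf" tail of the real filename,
the sample strings are invented for illustration:

```python
# How str.split("'", 2) behaves depending on the number of single quotes.
samples = {
    "no quote": "plain.pdf",
    "one quote": "SSU's.pdf",                     # the case from the report
    "two quotes": "us-ascii'en'file%20name.pdf",  # charset'language'value
    "three quotes": "a'b'c'd",
}

parts_by_case = {}
for label, s in samples.items():
    parts_by_case[label] = s.split("'", 2)

for label, parts in parts_by_case.items():
    print(label, len(parts), parts)

# Unpacking three names from a two-element list raises ValueError,
# which is what happens at "charset, language, s = parts".
unpack_error = None
try:
    charset, language, value = parts_by_case["one quote"]
except ValueError as exc:
    unpack_error = exc
    print("unpack failed:", exc)
```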

------- culprit -------
def decode_rfc2231(s):
    """Decode string according to RFC 2231"""
    import urllib
    parts = s.split("'", 2)
    if len(parts) == 1:
    ^^^^^^^^^^^^^^^^^^^ ------------------ <<<<<<<<<<<<
        return None, None, urllib.unquote(s)
    charset, language, s = parts
    return charset, language, urllib.unquote(s)
------- end culprit -------
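
A minimal sketch of the "len(parts) != 3" idea, not the library's
actual code: the function name decode_rfc2231_fixed is hypothetical,
and urllib.parse.unquote is the Python 3 location of the
urllib.unquote used above.

```python
from urllib.parse import unquote  # Python 3 home of urllib.unquote


def decode_rfc2231_fixed(s):
    """Hypothetical variant of decode_rfc2231 with the proposed check."""
    parts = s.split("'", 2)
    if len(parts) != 3:
        # Zero or one single quote: no charset'language' prefix present,
        # so treat the whole string as the value.
        return None, None, unquote(s)
    charset, language, value = parts
    return charset, language, unquote(value)


print(decode_rfc2231_fixed("SSU's.pdf"))
print(decode_rfc2231_fixed("us-ascii'en'file%20name.pdf"))
```

This only covers the one-quote case from the report. A non-encoded
value that happens to contain two or more single quotes would still be
split into charset, language and value.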

--
Vienna University of Economics and Business Administration
Central and Internet Services Section
Center for Computer Services
UNIX Server Administration
PGP/GPG Key ID: 0xa5232ad5