[New-bugs-announce] [issue2658] decode_header() fails on multiline headers

Christoph Schneeberger report at bugs.python.org
Sat Apr 19 14:41:34 CEST 2008

New submission from Christoph Schneeberger <cschnee at telemedia.ch>:

email.Header.decode_header() does not correctly deal with multiline
header.py in revision 54371 (1) changes the behaviour, whereas
previously multiline headers where parsed correctly, header.py 54371
introduced a new regex part, that renders such headers invalid and they
won't be parsed as expected.
Given the following header line (doesn't matter if its parsed from a
mail or read from a string) which represents IMHO a valid RFC2047 header

from email.Header import decode_header
decode_header('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <T.Mueller at xxx.com>')

this will result in:
header.py (54371):
[('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <T.Mueller at xxx.com>', None)]

resp. with header.py (54370):
[('"M\xfcller T"', 'windows-1252'), (' <T.Mueller at xxx.com>', None)]

Actually both seem parsed wrong, but with 54370 the result looks more
sane (the space should be IMO removed). 
Once the CRLF sequence is removed from the header it works fine and all
looks as expected:
>>> decode_header('=?windows-1252?Q?=22M=FCller_T=22?= <T.Mueller at xxx.com>')
[('"M\xfcller T"', 'windows-1252'), ('<T.Mueller at xxx.com>', None)]

This problem might or might not be related to 
- issue 1372770 
- issue 1467619

(1) http://svn.python.org/view?rev=54371&view=rev

components: Library (Lib)
messages: 65630
nosy: cschnee
severity: normal
status: open
title: decode_header() fails on multiline headers
type: behavior
versions: Python 2.4, Python 2.5

Tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list