[issue1079] decode_header does not follow RFC 2047

Tony Nelson report at bugs.python.org
Sun Apr 5 01:17:23 CEST 2009

Tony Nelson <tony_nelson at users.sourceforge.net> added the comment:

The email package does not follow the RFCs in anything to do with header
parsing or decoding.  This is a known deficiency.  So no, I am not
thinking of atoms at all -- and neither is email.header.decode_header()! :-(

Until email.header actually parses headers into atoms and then decodes
those atoms, it doesn't matter what parsed atoms would look like.
Currently, email.header.decode_header() just stumbles through raw text;
it doesn't know whether it is looking at atoms, and usually doesn't even
know which header the text came from.

In order to interpret the RFC correctly, email.header.decode_header()
needs either a parser and the name of the header it is decoding, or
already-parsed header data.  I think the latter is being considered for
a redesign of the email package for 3.1 or 3.2 (3 months to a year or so
away, and not for 2.x at all).  Until then, it is better to decode every
likely encoded-word than to skip encoded-words that, for example, have a
parenthesis on one side or the other.
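
To make the parenthesis case concrete, here is a minimal sketch that
feeds decode_header() an encoded-word sitting directly inside a comment.
The helper name decode_to_text and the sample header text are
hypothetical and only for illustration.  What comes back depends on the
Python version: a decoder that insists on whitespace around
encoded-words leaves the raw "=?...?=" text in place, while one that
decodes every likely encoded-word does not.

```python
from email.header import decode_header


def decode_to_text(raw):
    """Join the chunks returned by decode_header() into one str.

    Hypothetical helper, not part of the email package.
    """
    parts = []
    for chunk, charset in decode_header(raw):
        # Chunks may be bytes (decoded or split-out text) or str
        # (a header that contained no encoded-word at all).
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


# An encoded-word with a parenthesis on each side, as in an RFC 2822
# comment.  No header name is passed in, so the decoder cannot know
# whether this text came from a structured or unstructured header.
sample = "(=?ISO-8859-1?Q?a?=)"

for chunk, charset in decode_header(sample):
    print(repr(chunk), charset)

print(decode_to_text(sample))
```

If the printed text still contains "=?ISO-8859-1?Q?a?=", the
encoded-word was skipped rather than decoded.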


Python tracker <report at bugs.python.org>
