email header decoding fails
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Thu Apr 10 04:31:27 EDT 2008
On Wed, 09 Apr 2008 23:12:00 -0300, ZeeGeek <ZeeGeek at gmail.com> wrote:
> It seems that the decode_header function in email.Header fails when
> the string is in the following form,
>
> '=?gb2312?Q?=D0=C7=C8=FC?=(revised)'
>
> That's when a non-encoded string follows the encoded string without
> any whitespace. In this case, the decode_header function treats the whole
> string as non-encoded. Is there a workaround for this problem?
That header does not comply with RFC 2047 (MIME Part Three: Message Header
Extensions for Non-ASCII Text), which requires whitespace between an
encoded-word and any adjacent text:
Section 5 (1)
An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
in any Subject or Comments header field, any extension message
header field, or any MIME body part field for which the field body
is defined as '*text'. [...]
Ordinary ASCII text and 'encoded-word's may appear together in the
same header field. However, an 'encoded-word' that appears in a
header field defined as '*text' MUST be separated from any adjacent
'encoded-word' or 'text' by 'linear-white-space'.
Section 5 (3)
As a replacement for a 'word' entity within a 'phrase', for example,
one that precedes an address in a From, To, or Cc header. [...]
An 'encoded-word' that appears within a 'phrase' MUST be separated
from any adjacent 'word', 'text' or 'special' by 'linear-white-space'.
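
If you have to process such non-compliant headers anyway, one possible
workaround is to insert the missing whitespace before calling
decode_header. The following is only a minimal sketch, assuming a current
Python 3 where the module is spelled email.header; the helper names
separate_encoded_words and decode_header_text are made up for this example
and are not part of the email package. The regular expression is a
simplified approximation of the RFC 2047 encoded-word grammar.

```python
import re
from email.header import decode_header

# Simplified pattern for one encoded-word: =?charset?encoding?encoded-text?=
# (an approximation of the RFC 2047 grammar, not a full implementation)
_EW = r'=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?='
_AFTER = re.compile('(' + _EW + r')(?=\S)')    # encoded-word glued to following text
_BEFORE = re.compile(r'(?<=\S)(' + _EW + ')')  # encoded-word glued to preceding text


def separate_encoded_words(value):
    """Hypothetical helper: add the whitespace that RFC 2047 requires."""
    value = _AFTER.sub(r'\1 ', value)
    value = _BEFORE.sub(r' \1', value)
    return value


def decode_header_text(value):
    """Hypothetical helper: decode a header value to a single string."""
    chunks = []
    for data, charset in decode_header(separate_encoded_words(value)):
        # decode_header may return bytes or str depending on the input
        if isinstance(data, bytes):
            data = data.decode(charset or 'ascii', errors='replace')
        chunks.append(data)
    return ''.join(chunks)


if __name__ == '__main__':
    print(decode_header_text('=?gb2312?Q?=D0=C7=C8=FC?=(revised)'))
```

Note that the inserted space may show up in the decoded result, and that
how decode_header treats whitespace around unencoded parts differs between
Python versions, so check the output against the version you actually run.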
--
Gabriel Genellina