Decode email subjects into unicode
Jeffrey Froman
jeffrey at fro.man
Tue Mar 18 12:24:03 EDT 2008
Laszlo Nagy wrote:
> I know that "=?UTF-8?B" means UTF-8 + base64 encoding, but I wonder if
> there is a standard method in the "email" package to decode these
> subjects?
The standard library function email.Header.decode_header splits such a
header into a list of (bytestring, charset) pairs, where the charset is
None for any part that was not encoded. Decoding each bytestring with its
charset then gives you unicode. For example:
>>> raw_headers = [
...     '=?koi8-r?B?4tnT1NLP19nQz8zOyc3PIMkgzcHMz9rB1NLB1M7P?=',
...     '[Fwd: re:Flags Of The World, Us States, And Military]',
...     '=?ISO-8859-2?Q?=E9rdekes?=',
...     '=?UTF-8?B?aGliw6Fr?=',
... ]
>>> from email.Header import decode_header
>>> for raw_header in raw_headers:
...     for header, encoding in decode_header(raw_header):
...         if encoding is None:
...             print header.decode()
...         else:
...             print header.decode(encoding)
...
Быстровыполнимо и малозатратно
[Fwd: re:Flags Of The World, Us States, And Military]
érdekes
hibák
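
If you want this as a reusable function, something along the following
lines may work. This is only a sketch and it assumes Python 3, where the
module is spelled email.header (lowercase) and decode_header returns a str
for a header with no encoded words but bytes for the parts of a header that
does contain them. The name decode_subject and the fallback to ASCII with
errors="replace" are my own choices for illustration, not part of the
email package.

```python
from email.header import decode_header


def decode_subject(raw_header):
    """Return raw_header as text (hypothetical helper, Python 3 assumed)."""
    parts = []
    for value, charset in decode_header(raw_header):
        if isinstance(value, bytes):
            # Encoded parts arrive as bytes. Fall back to ASCII when no
            # charset was declared, replacing any undecodable bytes.
            parts.append(value.decode(charset or "ascii", errors="replace"))
        else:
            # A header without encoded words comes back as str already.
            parts.append(value)
    return "".join(parts)


if __name__ == "__main__":
    print(decode_subject("=?UTF-8?B?aGliw6Fr?="))
    print(decode_subject("=?ISO-8859-2?Q?=E9rdekes?="))
```

Joining the parts with an empty string keeps whatever spacing
decode_header reports between them, rather than guessing at it.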
Jeffrey