How do I decode unicode characters in the subject using email.message_from_string()?

Gabriel Genellina gagsl-py2 at
Wed Feb 25 13:59:17 EST 2009

On Wed, 25 Feb 2009 16:19:35 -0200, Thorsten Kampe  
<thorsten at> wrote:
> * Tim Golden (Wed, 25 Feb 2009 17:27:07 +0000)
>> Thorsten Kampe wrote:
>> > * Gabriel Genellina (Wed, 25 Feb 2009 14:00:16 -0200)
>> >> On Wed, 25 Feb 2009 13:40:31 -0200, Thorsten Kampe
> [...]
>> >>> And I wonder why you would think the header contains Unicode characters
>> >>> when it says "us-ascii" ("=?us-ascii?Q?"). I think there is a tendency
>> >>> to label everything "Unicode" someone does not understand.
>> >> And I wonder why you would think the header does *not* contain Unicode
>> >> characters when it says "us-ascii"?.
>> >
>> > Basically because it didn't contain any Unicode characters (anything
>> > outside the ASCII range).
>> And I imagine that Gabriel's point was -- and my point certainly
>> is -- that Unicode includes all the characters *inside* the
>> ASCII range.
> I know that this was Gabriel's point. And my point was that Gabriel's
> point was pointless. If you call any text (or character) "Unicode" then
> the word "Unicode" is generalized to an extent where it doesn't mean
> anything at all anymore and becomes a buzz word.

If it's text, it should use Unicode. Maybe not now, but in a few years, it  
will be totally unacceptable not to properly use Unicode to process  
textual data.

> By the same reasoning you could call ASCII a Unicode encoding (which it
> isn't) because all ASCII characters are Unicode characters (code
> points). Only encodings that cover the full Unicode range can reasonably
> be called Unicode encodings.

Not at all. ASCII is as valid a character encoding ("coded character set"  
as the Unicode guys like to say) as ISO 10646 (which covers the whole  

> The OP just saw some "weird characters" in the email subject and thought
> "I know. It looks weird. Must be Unicode". But it wasn't. It was good
> ole ASCII - only Quoted Printable encoded.

Good f*cked ASCII is Unicode too.
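
For what it's worth, here is a minimal sketch of decoding such a header.
It assumes Python 3's email package; the sample message and the
decode_subject() helper are made up for illustration and are not from the
original poster's code. It shows that a "=?us-ascii?Q?...?=" subject
decodes to ordinary text, and that the result is the same bytes whether
you call it ASCII or UTF-8.

```python
import email
from email.header import decode_header

# Hypothetical sample message with an RFC 2047 "Q"-encoded subject.
raw = (
    "From: someone@example.invalid\n"
    "Subject: =?us-ascii?Q?Hello=2C_world=21?=\n"
    "\n"
    "body\n"
)

msg = email.message_from_string(raw)


def decode_subject(value):
    """Join the decoded parts of a header value into one text string."""
    parts = []
    for chunk, charset in decode_header(value):
        # Encoded words come back as bytes plus a charset name;
        # unencoded parts may come back as str with charset None.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


subject = decode_subject(msg["Subject"])
print(subject)

# Every character is inside the ASCII range, so the ASCII and UTF-8
# encodings of the decoded text are byte-for-byte identical.
print(subject.encode("ascii") == subject.encode("utf-8"))
```

The same helper should work unchanged when the charset is something like
iso-8859-1 or utf-8, since the charset name reported by decode_header() is
passed straight to bytes.decode().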

Gabriel Genellina
