To unicode or not to unicode

Thorsten Kampe thorsten at thorstenkampe.de
Sat Feb 21 19:20:12 CET 2009


* Ross Ridge (Sat, 21 Feb 2009 12:22:36 -0500)
> "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >I don't think that was the complaint. Instead, the complaint was
> >that the OP's original message did not have a Content-type header,
> >and that it was thus impossible to tell what the byte in front of
> >"Wiki" meant. To properly post either MICRO SIGN or GREEK SMALL LETTER
> >MU in a usenet or email message, you really must use MIME. (As both
> >your article and Thorsten's did, by choosing UTF-8)
> 
> MIME only applies Internet e-mail messages.

No, it doesn't: "MIME's use, however, has grown beyond describing the 
content of e-mail to describing content type in general. [...]

The content types defined by MIME standards are also of importance 
outside of e-mail, such as in communication protocols like HTTP [...]"

http://en.wikipedia.org/wiki/MIME

> RFC 1036 doesn't require nor give a meaning to a Content-Type header
> in a Usenet message

Well, /maybe/ the reason for that is that RFC 1036 was written in 1987 
and the first MIME RFC in 1992...? The "Son of RFC 1036" mentions MIME 
more often than you can count.

> so there's nothing wrong with the original poster's newsreader.

If you follow RFC 1036 (which was written before anyone even thought of 
MIME) then all content has to be ASCII. The OP used non-ASCII letters.

It's all about declaring your charset. In Python as well as in your 
newsreader. If you don't declare your charset it's ASCII for you - in 
Python as well as in your newsreader.
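
A rough sketch of what that means in practice, assuming Python 3 and the 
standard library's email package. The sample bytes and the helper 
try_decode are made up for illustration; they are not taken from the 
OP's message. An undeclared byte can only be guessed at, while a declared 
charset can be read back unambiguously:

```python
from email.mime.text import MIMEText

# Hypothetical sample: one undeclared non-ASCII byte followed by ASCII text.
raw = b"\xb5Wiki"


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Without a declaration the reader has to assume or guess.
as_ascii = try_decode(raw, "ascii")       # None: 0xB5 is not an ASCII byte
as_latin1 = try_decode(raw, "latin-1")    # MICRO SIGN (U+00B5) + "Wiki"
as_greek = try_decode(raw, "iso-8859-7")  # some other character + "Wiki"

# With MIME the sender declares the charset and the reader just looks it up.
msg = MIMEText("\u00b5Wiki", "plain", "utf-8")
declared = msg.get_content_charset()
body = msg.get_payload(decode=True).decode(declared)

print(as_ascii, as_latin1 == as_greek, declared, body == "\u00b5Wiki")
```

The same byte decodes to different characters under different 8-bit 
charsets, which is why a message without a Content-Type header cannot 
be interpreted reliably.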

Thorsten


