[Python-Dev] unicode/string asymmetries

Martin v. Loewis martin@v.loewis.de
Thu, 10 Jan 2002 21:27:46 +0100


> windows "ansi" is an alias for the encoding you get from
> 
>     import locale
>     language, encoding = locale.getdefaultlocale()
> 
> for people in western europe/north america

Isn't that also known as "mbcs" in Python? And it is different from
"oem", which is not exposed to Python, right?
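
A rough way to check both points on a given machine is sketched below. It is written for a current Python 3 interpreter rather than the one discussed in this thread, and the helper names ansi_encoding_name and has_mbcs are made up for illustration. It assumes that locale.getpreferredencoding(False) reports the "ansi" code page on Windows, which may not hold if UTF-8 mode is enabled.

```python
import codecs
import locale


def ansi_encoding_name():
    """Return the name of the locale's preferred encoding.

    Assumption: on Windows this is the "ansi" code page (for example
    "cp1252"); on other platforms it is whatever the locale specifies.
    """
    return locale.getpreferredencoding(False)


def has_mbcs():
    """Report whether this interpreter provides the "mbcs" codec."""
    try:
        codecs.lookup("mbcs")
    except LookupError:
        return False
    return True


print("preferred encoding:", ansi_encoding_name())
print("mbcs codec available:", has_mbcs())
```

Where "mbcs" is available, encoding the same text with "mbcs" and with the reported code page name should give the same bytes if the two really are aliases.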

> "cp1252", which is a microsoft version of latin-1:
> 
>     http://www.microsoft.com/typography/unicode/1252.htm
> 
> (characters 0x80-0x9f aren't part of iso-8859-1, aka latin-1)

Strictly speaking, the code points 0x80-0x9f *are* assigned in latin-1,
to control characters, so it is these assignments that differ in
CP 1252, which uses most of that range for printable characters.
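
The difference can be seen by decoding the same bytes with both codecs. This is only a sketch for a current Python 3 interpreter; the helper name compare and the sample bytes are chosen for illustration, and a result of None means the "cp1252" codec has no mapping for that byte.

```python
import unicodedata

# The 32 byte values under discussion.
HIGH_RANGE = range(0x80, 0xA0)


def compare(byte_value):
    """Decode one byte with "latin-1" and with "cp1252".

    Returns (latin1_char, cp1252_char); cp1252_char is None when the
    cp1252 codec refuses to decode the byte.
    """
    raw = bytes([byte_value])
    latin1_char = raw.decode("latin-1")
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        cp1252_char = None
    return latin1_char, cp1252_char


for value in (0x80, 0x85, 0x99):
    latin1_char, cp1252_char = compare(value)
    # Unicode category "Cc" means "control character".
    print(hex(value), unicodedata.category(latin1_char), repr(cp1252_char))
```

For every byte in HIGH_RANGE the latin-1 result falls in category "Cc", while cp1252 maps, for instance, 0x80 to the euro sign.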

Regards,
Martin