[Python-Dev] unicode/string asymmetries

Fredrik Lundh fredrik@pythonware.com
Thu, 10 Jan 2002 17:38:27 +0100


thomas wrote:
> I have a string variable containing some non-ascii characters (from
> a characterset which was previously called 'ansi' instead of 'oem'
> on windows).

short answer: "iso-8859-1" should work

:::

longer answer:

windows "ansi" is an alias for the encoding you get from

    import locale
    language, encoding = locale.getdefaultlocale()

for people in western europe/north america, that's usually
"cp1252", which is a microsoft version of latin-1:

    http://www.microsoft.com/typography/unicode/1252.htm

(characters 0x80-0x9f aren't part of iso-8859-1, aka latin-1;
cp1252 uses most of that range for printable characters such
as the euro sign and curly quotes)
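
to make the difference concrete, here is a minimal sketch. it assumes
a current python 3 interpreter (not the python of this thread), and
the sample bytes are made up for illustration:

```python
# hypothetical sample: "café", a space, byte 0x80, a space, "5"
raw = b"caf\xe9 \x80 5"

# decode the same bytes with both encodings
as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("cp1252")

# bytes 0xa0-0xff map to the same characters in both encodings
assert as_latin1[3] == as_cp1252[3] == "\u00e9"   # e with acute accent

# bytes 0x80-0x9f are where the two differ
print(repr(as_latin1[5]))   # a C1 control character
print(repr(as_cp1252[5]))   # the euro sign
```

so if the data never uses 0x80-0x9f, either codec gives the same
result; if it does (euro sign, curly quotes), "cp1252" is the safer
choice for text that came from windows.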

cheers /F