[Python-3000] locale-aware strings ?

"Martin v. Löwis" martin at v.loewis.de
Wed Sep 13 06:51:30 CEST 2006


Brian Quinlan wrote:
> As a user, I don't have any expectations regarding non-ASCII text files.
> 
> I'm using a US-English version of Windows XP (very common) and I haven't 
> changed the default encoding (very common). Python claims that my system 
> encoding is CP437 (from sys.stdin/stdout.encoding).

You are misinterpreting the data you see. Python makes no claims about
your system encoding in sys.stdout.encoding. Instead, it makes a claim
about your terminal's encoding, and that is indeed CP437 (just do
"type foo.txt" with a document that contains non-ASCII characters,
and watch the characters in the terminal look different from the
ones in notepad).

It is an unfortunate fact that Windows has *two* system encodings: one
used for "Windows", and one used for the "OEM". The terminal uses the
OEM code page (by default, unless you run chcp.exe).
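To illustrate the difference between the two code pages, here is a
minimal sketch. It assumes a US-English system, where the "Windows"
code page is CP1252 and the OEM code page is CP437; the sample byte
is made up for the illustration.

```python
# Hypothetical sample: a single byte 0xE9 as it might appear in a text file
# written by a Windows application such as notepad.
raw = b"\xe9"

# Assumption: US-English defaults (CP1252 for "Windows", CP437 for "OEM").
windows_view = raw.decode("cp1252")  # how a Windows GUI program reads the byte
oem_view = raw.decode("cp437")       # how the console ("type foo.txt") reads it

print(repr(windows_view))
print(repr(oem_view))
```

The same byte comes out as a different character under each code page,
which is why the terminal and notepad disagree about the same file.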

> I can assure you
> that most of the documents that I work with are not in CP437 - they are 
> a combination of ASCII, ISO8859-1, and UTF-8. I would also guess that 
> this is true of many Windows XP (US-English) users. So, for me and users 
> like me, Python is going to silently misinterpret my data.

No. It will use a different API to determine the system encoding, and
it will guess correctly.
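
The message does not name the API in question. As a rough sketch of the
distinction only, and assuming a current Python where
locale.getpreferredencoding() reports the locale's preferred encoding,
the two values can be compared like this (describe_encodings is a
hypothetical helper name, not something from Python itself):

```python
import locale
import sys


def describe_encodings():
    """Return the terminal encoding and the locale's preferred encoding."""
    # sys.stdout.encoding describes the stream attached to the terminal;
    # it may be None or missing when stdout has been replaced.
    terminal = getattr(sys.stdout, "encoding", None)
    # Passing False avoids changing the process locale as a side effect.
    preferred = locale.getpreferredencoding(False)
    return {"terminal": terminal, "preferred": preferred}


print(describe_encodings())
```

On a Windows console the two values may differ, the first reflecting the
OEM code page and the second the "Windows" one.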

Regards,
Martin

