[Python-3000] locale-aware strings ?
"Martin v. Löwis"
martin at v.loewis.de
Wed Sep 13 06:51:30 CEST 2006
Brian Quinlan wrote:
> As a user, I don't have any expectations regarding non-ASCII text files.
> I'm using a US-English version of Windows XP (very common) and I haven't
> changed the default encoding (very common). Python claims that my system
> encoding is CP437 (from sys.stdin/stdout.encoding).
You are misinterpreting the data you see. Python makes no claims about
your system encoding in sys.stdout.encoding. Instead, it makes a claim
about your terminal's encoding, and that is indeed CP437 (just do
"type foo.txt" with a document that contains non-ASCII characters,
and watch the characters in the terminal look different from the
ones in Notepad).
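To make the "type foo.txt" experiment concrete, here is a minimal sketch of
what happens to a single byte. It assumes the terminal (OEM) code page is
cp437 and the "Windows" code page is cp1252, which is typical for a
US-English installation but is an assumption here, not something read from
a real system:

```python
# One byte, two interpretations. The code page names below are assumptions
# (typical US-English Windows defaults), not values queried from the OS.
OEM_CODE_PAGE = "cp437"    # assumed terminal (OEM) code page
ANSI_CODE_PAGE = "cp1252"  # assumed "Windows" code page

# The byte an editor using the Windows code page would store for e-acute.
raw = "\u00e9".encode(ANSI_CODE_PAGE)

as_ansi = raw.decode(ANSI_CODE_PAGE)  # what such an editor displays
as_oem = raw.decode(OEM_CODE_PAGE)    # what the terminal displays

print(repr(raw), repr(as_ansi), repr(as_oem))
```

The byte on disk is the same in both cases; only the code page used to
interpret it differs.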
It is an unfortunate fact that Windows has *two* system encodings: one
used for "Windows", and one used for the "OEM". The terminal uses the
OEM code page (by default, unless you run chcp.exe).
> I can assure you
> that most of the documents that I work with are not in CP437 - they are
> a combination of ASCII, ISO8859-1, and UTF-8. I would also guess that
> this is true of many Windows XP (US-English) users. So, for me and users
> like me, Python is going to silently misinterpret my data.
No. It will use a different API to determine the system encoding, and
it will guess correctly.
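The API is not named above. As a rough sketch of the distinction, in
current Python versions the two values can be compared as follows;
locale.getpreferredencoding() is used here as one plausible way to ask for
the system encoding, and describe_encodings is a hypothetical helper name.
The values printed depend on the platform and Python version:

```python
import locale
import sys


def describe_encodings():
    """Return the stdout encoding and the locale's preferred encoding."""
    return {
        # Encoding of the stream attached to stdout (the terminal, when
        # running interactively); None if the stream has no such attribute.
        "terminal": getattr(sys.stdout, "encoding", None),
        # Encoding the locale settings suggest for text data; False means
        # "do not call setlocale() as a side effect".
        "preferred": locale.getpreferredencoding(False),
    }


info = describe_encodings()
for name, value in info.items():
    print(name, value)
```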