[Python-Dev] Support of UTF-16 and UTF-32 source encodings

Random832 random832 at fastmail.com
Sat Nov 14 21:10:02 EST 2015


Glenn Linderman <v+python at g.nevcal.com> writes:
> On 11/14/2015 5:37 PM, Chris Angelico wrote:
> > Thanks. Is "ANSI" always an eight-bit ASCII-compatible encoding?
>
> I wouldn't trust an answer to this question that didn't come from
> someone that used Windows with Chinese, Japanese, or Korean, as their
> default language for the install. So I don't have a trustworthy answer
> to give.

AFAIK (I haven't actually used it as a default language, but I do know
some details of their encodings), there are two main "issues" with the
Windows CJK encodings regarding ASCII compatibility:

- There is a different symbol (a currency symbol) at 0x5c. Sort of. Most
  Unicode translations of it will treat it as a backslash, and users do
  expect it to work for things like \n, path separators, etc., but it
  displays as ¥ or ₩.

- Double-byte characters can have ASCII bytes as their *second* byte.
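
A quick way to see both points, as a rough sketch rather than anything
authoritative: it assumes Python's "cp932" codec is a fair stand-in for
the Japanese Windows ANSI code page, and the variable names are just
for illustration. The character used is KATAKANA LETTER SO (U+30BD),
whose trail byte happens to be 0x5c.

```python
# Assumption: Python's "cp932" codec approximates the Japanese Windows
# "ANSI" code page; the names below are illustrative only.

# A lone 0x5c byte decodes to U+005C (backslash), even though Japanese
# fonts typically draw that code point as a yen sign.
lone = b"\x5c".decode("cp932")

# KATAKANA LETTER SO (U+30BD) is a double-byte character in cp932.
encoded = "\u30bd".encode("cp932")

# Its second (trail) byte is the ASCII value 0x5c.
has_ascii_trail = 0x5C in encoded[1:]

# A byte-level split on backslash therefore cuts the character in half.
naive_parts = encoded.split(b"\\")

print(repr(lone), encoded, has_ascii_trail, naive_parts)
```

So anything that scans the raw bytes for a backslash (escape handling,
path splitting) can misfire on such text unless it decodes first.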


