[Python-Dev] PEP 393 Summer of Code Project

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 24 19:54:06 CEST 2011


>> Eg, display of characters in the interpreter.
> 
> I don't know why you say it's "done in terms of UTF-16", then. Unicode
> strings are simply encoded to whatever character set is detected as the
> terminal's character set.

I think what he means (and what I meant when I said something similar) is this:
I/O will consider surrogate pairs in the representation when converting
to the output encoding. This is actually relevant only for UTF-8 (I
think), which converts surrogate pairs "correctly", i.e. it encodes each
pair as a single four-byte sequence rather than as two separate
three-byte sequences. This can be taken as proof that Python 3.2 is
"UTF-16 aware" (in some places, but not in others).

With Python's I/O architecture, it is of course not *actually* the I/O
which considers UTF-16, but the codec.
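
To illustrate what "considering surrogate pairs" in the codec amounts to,
here is a minimal sketch. It is not CPython's actual UTF-8 encoder; the
function name utf16_units_to_utf8 and the list-of-integers input are
assumptions made purely for illustration. It takes 16-bit code units (as
a narrow build would store them), joins each high/low surrogate pair into
one code point, and only then produces UTF-8.

```python
def utf16_units_to_utf8(units):
    """Encode 16-bit code units to UTF-8, joining surrogate pairs first.

    Hypothetical helper for illustration only; not the CPython codec.
    """
    out = bytearray()
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine the pair into a single supplementary code point.
            cp = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP character, or a lone surrogate (rejected by the strict
            # encode call below with UnicodeEncodeError).
            cp = unit
            i += 1
        out += chr(cp).encode("utf-8")
    return bytes(out)


# U+1F600 is stored as the pair D83D DE00 in UTF-16.
encoded = utf16_units_to_utf8([0xD83D, 0xDE00])
print(encoded)
```

The pair comes out as one four-byte sequence, the same bytes that
"\U0001F600".encode("utf-8") produces, which is the "correct" conversion
referred to above.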

Regards,
Martin

