24 Aug
2011
5:54 p.m.
E.g., display of characters in the interpreter.
I don't know why you say it's "done in terms of UTF-16", then. Unicode strings are simply encoded to whatever character set is detected as the terminal's character set.
I think what he means (and what I meant when I said something similar): I/O will consider surrogate pairs in the representation when converting to the output encoding. This is actually relevant only for UTF-8 (I think), which converts surrogate pairs "correctly". This can be taken as proof that Python 3.2 is "UTF-16 aware" (in some places, but not in others).

With Python's I/O architecture, it is of course not *actually* the I/O layer which considers UTF-16, but the codec.

Regards,
Martin
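
To illustrate what converting a surrogate pair "correctly" means, here is a minimal sketch. It does not reproduce the 3.2 narrow-build internals; it only does the UTF-16 arithmetic by hand and compares two possible UTF-8 outputs. The helper names `to_surrogate_pair` and `from_surrogate_pair` are made up for this example, and the `surrogatepass` error handler is used only to show what encoding each surrogate separately would produce.

```python
def to_surrogate_pair(cp):
    """Split a non-BMP code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Recombine a high/low surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


cp = 0x1F600  # an arbitrary non-BMP code point chosen for the example
high, low = to_surrogate_pair(cp)

# "Correct" conversion: the pair is treated as one code point,
# which UTF-8 encodes as a single four-byte sequence.
combined = chr(from_surrogate_pair(high, low)).encode("utf-8")

# Naive conversion: each surrogate is encoded on its own,
# giving two three-byte sequences (not valid UTF-8).
separate = (chr(high) + chr(low)).encode("utf-8", "surrogatepass")

print(hex(high), hex(low))
print(combined, len(combined))
print(separate, len(separate))
```

The point of the comparison is that a codec which is aware of the UTF-16 representation emits the four-byte form, which is the behaviour described above for the UTF-8 codec.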