[Python-Dev] len(chr(i)) = 2?
Stephen J. Turnbull
stephen at xemacs.org
Mon Nov 22 11:47:09 CET 2010
"Martin v. Löwis" writes:
> More interestingly (and to the subject) is chr: how did you arrive
> at C9 banning Python3's definition of chr? This chr function puts
> the code sequence into well-formed UTF-16; that's the whole point of
> UTF-16.
No, it doesn't, in the specific case of surrogate code points. In
3.1.2 from MacPorts on an iBook G4 and from Gentoo on AMD64,
chr(0xd800) returns "\ud800".
I don't know if that's by design (e.g., so that it can be used in the
implementation of the surrogateescape error handler) or a correctable
oversight, but it's not conformant.
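
For anyone who wants to check this locally, here is a minimal sketch. It assumes a current CPython 3 interpreter; the helper name encodes_strictly is made up for illustration, and older or narrow builds may behave differently. It shows that chr() hands back a lone surrogate, that the strict UTF-8 codec refuses to encode it, and how surrogateescape relies on such code points.

```python
# Sketch only: behaviour as seen on current CPython 3; other builds may differ.

s = chr(0xD800)  # a lone high surrogate code point, not a surrogate pair
is_lone_surrogate = len(s) == 1 and 0xD800 <= ord(s) <= 0xDFFF


def encodes_strictly(text, encoding):
    """Hypothetical helper: True if text encodes with the default 'strict' handler."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# The surrogateescape error handler maps an undecodable byte to a lone
# low surrogate, and maps it back to the same byte on encoding.
escaped = b"\xff".decode("utf-8", "surrogateescape")
round_trip = escaped.encode("utf-8", "surrogateescape")

print(repr(s), is_lone_surrogate, encodes_strictly(s, "utf-8"))
print(repr(escaped), round_trip)
```

The string exists as a str object, but it cannot be serialized by a strict codec, which is the sense in which it is not well-formed.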