[Python-Dev] len(chr(i)) = 2?

Stephen J. Turnbull stephen at xemacs.org
Mon Nov 22 11:47:09 CET 2010


"Martin v. Löwis" writes:

 > More interestingly (and to the subject) is chr: how did you arrive
 > at C9 banning Python3's definition of chr? This chr function puts
 > the code sequence into well-formed UTF-16; that's the whole point of
 > UTF-16.

No, it doesn't, in the specific case of surrogate code points.  In
3.1.2 from MacPorts on an iBook G4 and from Gentoo on AMD64,
chr(0xd800) returns "\ud800".
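
For anyone who wants to reproduce this, here is a minimal sketch. The helper name is_surrogate is made up for illustration, and the strict-encoder check assumes a recent CPython 3 release; older versions may behave differently there.

```python
def is_surrogate(ch):
    """Return True if ch is a single code point in the surrogate range."""
    return len(ch) == 1 and 0xD800 <= ord(ch) <= 0xDFFF

# chr() accepts a surrogate code point without complaint.
s = chr(0xD800)
lone = is_surrogate(s)  # a lone high surrogate, not a surrogate pair

# Assumption: recent CPython 3 releases refuse to encode a lone
# surrogate with the strict UTF-16 encoder.
try:
    s.encode("utf-16-le")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

print(repr(s), lone, strict_encode_ok)
```

The point of the sketch is only that the result is one unpaired surrogate code unit, which a well-formed UTF-16 sequence cannot contain.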

I don't know if that's by design (e.g., so that it can be used in the
implementation of the surrogateescape error handler) or a correctable
oversight, but either way it's not conformant.
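
For context, a rough sketch of how the surrogateescape handler uses lone surrogates, as I understand it: an undecodable byte is mapped to a code point in the range U+DC80..U+DCFF, and the same handler maps it back on encoding. The variable names are mine, and this shows only the observable behavior, not the implementation.

```python
raw = b"\x80"  # a byte that is not valid UTF-8 on its own

# Decoding with surrogateescape smuggles the byte through as a lone
# low surrogate instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original byte.
round_trip = text.encode("utf-8", errors="surrogateescape")

print(repr(text), text == chr(0xDC80), round_trip == raw)
```

Note that this relies on str being able to hold an unpaired surrogate at all, which is the same property that lets chr(0xd800) succeed.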



