[Python-Dev] len(chr(i)) = 2?
"Martin v. Löwis"
martin at v.loewis.de
Mon Nov 22 09:20:59 CET 2010
> Unicode 5.0, Chapter 3, verse C9:
>
> When a process generates a code unit sequence which purports to be
> in a Unicode character encoding form, it shall not emit ill-formed
> code sequences.
> > > A Unicode-conforming Python implementation would error at the
> > > chr() call, or perhaps would not provide surrogateescape error
> > > handlers.
> >
> > Chapter and verse?
>
> Chapter 3, verse C9 again.
I agree that the surrogateescape error handler is non-conforming, but,
as you say, it doesn't claim to, either (would your concern about utf-8
being misleading here been resolved if the thing had been called
"utf-8b"?)
More interestingly (and to the subject) is chr: how did you arrive
at C9 banning Python3's definition of chr? This chr function puts
the code sequence into well-formed UTF-16; that's the whole point of
UTF-16.
Regards,
Martin
More information about the Python-Dev
mailing list