[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Sat Apr 25 14:44:49 CEST 2009


> If the bytes are mapped to single half surrogate codes instead of the
> normal pairs (high+low), then I can see that decoding could never be
> ambiguous and encoding could produce the original bytes.

I was confused by Markus Kuhn's original UTF-8b specification. I have
now changed the PEP to avoid using PUA characters at all.
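
A minimal sketch of the round trip described in the quoted text, assuming
the scheme as it was later exposed in Python 3.1 and newer as the
"surrogateescape" error handler. The sample byte string is made up for
illustration.

```python
# Bytes that are not valid UTF-8: 0xff and 0xfe never occur in UTF-8.
# (Hypothetical sample data, chosen only for illustration.)
raw = b"abc\xff\xfe.txt"

# Decoding: each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns U+DCNN back into the byte 0xNN,
# so the original bytes are recovered exactly.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses lone surrogates, so escaped bytes cannot be
# mistaken for ordinary, validly encoded characters.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

Because lone surrogates cannot result from decoding well-formed UTF-8,
the escaped bytes never collide with real characters.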

Regards,
Martin

