PEP 393 vs UTF-8 Everywhere

Marko Rauhamaa marko at
Sun Jan 22 03:13:10 EST 2017

eryk sun <eryksun at>:

> On Sat, Jan 21, 2017 at 8:21 PM, Pete Forman <petef4+usenet at> wrote:
>> Marko Rauhamaa <marko at> writes:
>>>> py> low = '\uDC37'
>>> That should raise a SyntaxError exception.
>> Quite. [...]
> CPython allows surrogate codes for use with the "surrogateescape" and
> "surrogatepass" error handlers, which are used for POSIX and Windows
> file-system encoding, respectively.
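
A minimal sketch of the "surrogateescape" round trip described above,
assuming a made-up file name containing one byte that is not valid
UTF-8 (the name and the byte value are illustrative only):

```python
# A byte string that is not valid UTF-8: 0xB7 is a stray continuation byte.
raw = b"caf\xb7.txt"

# "surrogateescape" maps each undecodable byte 0xNN (0x80-0xFF) to the
# lone low surrogate U+DCNN instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the bytes survive the round trip unchanged.
restored = name.encode("utf-8", errors="surrogateescape")

print(ascii(name))
print(restored == raw)
```

This is what lets str file names on POSIX carry arbitrary bytes, at the
price of str objects that contain lone surrogates.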

Yes, but at the cost of violating the Unicode standard, which leads to
strings that cannot be encoded or printed, among other problems. In my
opinion, Python should have "stayed pure" instead of playing cheap
tricks with surrogates.
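
As a hedged illustration of the consequence, the sketch below takes the
lone surrogate from the quoted example and shows that the strict UTF-8
codec rejects it, while "surrogatepass" lets it through. The variable
names other than low are hypothetical:

```python
low = "\udc37"  # lone low surrogate, as in the quoted example

# The literal is accepted, but the strict UTF-8 codec refuses to
# encode it, which is why such a string cannot be written out as UTF-8.
try:
    low.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# "surrogatepass" encodes the surrogate code point as if it were an
# ordinary character, producing bytes that strict UTF-8 decoders reject.
passed = low.encode("utf-8", errors="surrogatepass")

print(encodable)
print(passed)
```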

(Of course, Unicode itself is a mess, but that's another story.)

