[Python-ideas] Processing surrogates in
Stephen J. Turnbull
stephen at xemacs.org
Tue May 5 12:46:41 CEST 2015
Andrew Barnert writes:
> (I'm not sure if we actually have a UCS-2 codec, but if not, it's
> trivial to write--it's just UTF-16 without surrogates.)
The PEP 393 machinery knows when astral characters are introduced
because it has to widen the representation. That might be a more
convenient place to raise an exception on non-BMP characters.
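
To make the two ideas above concrete, here is a minimal sketch, not
anything that exists in the stdlib. encode_ucs2 and bytes_per_char are
hypothetical names invented for illustration. The first treats UCS-2 as
"UTF-16 without surrogates" by rejecting any code point above U+FFFF
before delegating to the existing utf-16-le codec. The second only
observes the PEP 393 widening from Python code via sys.getsizeof, which
is a CPython implementation detail; the real check suggested above would
have to live in the C-level string machinery.

```python
import sys


def encode_ucs2(text):
    """Hypothetical helper: encode text as little-endian UCS-2.

    Assumption: UCS-2 is treated as UTF-16 restricted to the BMP, so any
    code point above U+FFFF is an error instead of a surrogate pair.
    """
    for index, char in enumerate(text):
        if ord(char) > 0xFFFF:
            raise UnicodeEncodeError(
                "ucs-2", text, index, index + 1, "non-BMP character"
            )
    # Every remaining code point fits in one 16-bit unit, so the UTF-16
    # codec emits no surrogate pairs here.
    return text.encode("utf-16-le")


def bytes_per_char(text):
    """Hypothetical probe: estimate CPython's per-character storage width.

    Relies on sys.getsizeof, so it is CPython-specific. Under PEP 393 the
    result is 1, 2, or 4 depending on the widest character in the string.
    """
    if not text:
        raise ValueError("need a non-empty string")
    doubled = text * 2
    return (sys.getsizeof(doubled) - sys.getsizeof(text)) // len(text)
```

With this sketch, bytes_per_char goes from 1 or 2 to 4 as soon as an
astral character appears in the string, which is the widening step
referred to above, and encode_ucs2 raises UnicodeEncodeError on exactly
those strings.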