[docs] [issue10546] UTF-16-LE and UTF-16-BE support non-BMP characters

Alexander Belopolsky report at bugs.python.org
Wed Dec 8 22:48:17 CET 2010

Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

If Victor says so ...

Someone needs to check that it works on a UCS4 build, but on a narrow build I don't think UTF-16-XX encodings need to do anything special - they just encode the surrogates as ordinary code units.

>>> '\U00010000'.encode('UTF-16-BE').decode('UTF-16-BE') == '\U00010000'
True
>>> '\U00010000'.encode('UTF-16-LE').decode('UTF-16-LE') == '\U00010000'
True
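
To make the "surrogates as ordinary code units" point concrete, here is a minimal sketch, assuming Python 3, that splits the encoded bytes into 16-bit units and compares them with the surrogate pair computed by hand. The helper names utf16_units and surrogate_pair are made up for this illustration and are not part of the codecs API.

```python
import struct

def utf16_units(text, byteorder="big"):
    """Return the 16-bit code units produced by UTF-16-BE or UTF-16-LE.

    Hypothetical helper for illustration only.
    """
    codec = "utf-16-be" if byteorder == "big" else "utf-16-le"
    data = text.encode(codec)
    # One unsigned 16-bit value per two bytes, in the chosen byte order.
    fmt = (">" if byteorder == "big" else "<") + "H" * (len(data) // 2)
    return list(struct.unpack(fmt, data))

def surrogate_pair(code_point):
    """Compute the (high, low) surrogates for a non-BMP code point."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

ch = "\U00010000"
be_units = utf16_units(ch, "big")
le_units = utf16_units(ch, "little")

# Both byte orders carry the same two code units; only the byte
# layout within each unit differs.
print([hex(u) for u in be_units])
print([hex(u) for u in le_units])
print(be_units == list(surrogate_pair(ord(ch))))
```

The two encodings differ only in how each 16-bit unit is laid out in bytes, which is consistent with the round trip above succeeding for both.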


Python tracker <report at bugs.python.org>
