[docs] [issue19539] The 'raw_unicode_escape' codec buggy + not appropriate for Python 3.x

Martin Panter report at bugs.python.org
Mon Dec 22 07:01:56 CET 2014


Martin Panter added the comment:

Re-reading the suggested description, it struck me that, for encoding, this codec is redundant with Latin-1 combined with the “backslashreplace” error handler:

>>> import sys
>>> test = "".join(map(chr, range(sys.maxunicode + 1)))
>>> test.encode("raw-unicode-escape") == test.encode("latin-1", "backslashreplace")
True
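
For a quicker look at what the two approaches actually produce, here is a minimal sketch on a four-character sample. The sample string and the variable names are my own choices for illustration, not anything taken from the documentation:

```python
# Illustrative sample (an assumption, not from the docs): one ASCII character,
# one Latin-1 character, one BMP character and one non-BMP character.
sample = "a\xe9\u20ac\U0001f600"

raw = sample.encode("raw-unicode-escape")
replaced = sample.encode("latin-1", "backslashreplace")

print(raw)              # bytes produced by the codec
print(replaced)         # bytes produced by Latin-1 plus the error handler
print(raw == replaced)  # whether the two agree on this sample
```

Characters that fit in Latin-1 are written as single bytes, and everything else falls back to a \uXXXX or \UXXXXXXXX escape in both cases.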

Decoding, on the other hand, seems similar to “unicode_escape”, except that only the \uXXXX and \UXXXXXXXX escapes seem to be supported.
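
A rough sketch of that difference, decoding the same bytes with both codecs. The input bytes are an arbitrary example of mine, chosen to contain one escape of each kind:

```python
# Arbitrary example input (an assumption): backslash-n, backslash-x41 and
# backslash-u0041, each separated by a space.
data = b"\\n \\x41 \\u0041"

full = data.decode("unicode_escape")      # handles \n, \xXX and \uXXXX
raw = data.decode("raw_unicode_escape")   # handles only the \uXXXX form here

print(repr(full))
print(repr(raw))
```

With “raw_unicode_escape” the \n and \x41 sequences come back as literal backslash text, while \u0041 is still turned into “A”.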

Maybe there should be a warning that backslashes are not escaped:

>>> "\\u005C".encode("raw-unicode-escape").decode("raw-unicode-escape")
'\\'
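
The same point as a small round-trip check. The helper name round_trips is hypothetical, written only for this illustration:

```python
def round_trips(text, codec="raw-unicode-escape"):
    """Hypothetical helper: does encoding then decoding give the text back?"""
    return text.encode(codec).decode(codec) == text

# Text without backslashes survives the round trip.
print(round_trips("caf\xe9 \u20ac"))

# Literal backslash-u text is not escaped on encoding, so decoding
# interprets it as an escape and the original text is lost.
print(round_trips("\\u005C"))

# For comparison, "unicode-escape" doubles the backslash when encoding.
print(round_trips("\\u005C", "unicode-escape"))
```

So the round trip only breaks when the input already contains something that looks like a \uXXXX or \UXXXXXXXX escape.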

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19539>
_______________________________________

