[New-bugs-announce] [issue27364] Deprecate invalid unicode escape sequences

Emanuel Barry report at bugs.python.org
Tue Jun 21 16:34:20 EDT 2016


New submission from Emanuel Barry:

The attached patch deprecates invalid escape sequences in unicode strings. The point is to prevent issues such as #27356 (and possibly other similar ones) in the future.

Without the patch:

>>> "hello \world"
'hello \\world'

With the patch:

>>> "hello \world"
DeprecationWarning: invalid escape sequence 'w'
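
For illustration, here is a minimal sketch of how one might check a literal for this warning and compare the two usual fixes (doubling the backslash, or using a raw string). It assumes an interpreter that includes this change. The helper name escape_warnings is hypothetical and not part of the patch, and the exact warning category and message text may differ between releases, so the sketch inspects neither the message nor a single fixed category.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the warnings emitted by the compiler.

    Hypothetical helper for illustration only; not part of the patch.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even those hidden by default filters.
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return caught


# The source snippets are raw strings so the backslashes reach
# compile() untouched.
bad = escape_warnings(r'"hello \world"')        # invalid escape \w
escaped = escape_warnings(r'"hello \\world"')   # backslash doubled
raw = escape_warnings(r'r"hello \world"')       # raw string literal

print(len(bad) > 0, len(escaped), len(raw))
```

Both fixes produce the same string value as the original literal, so existing code can be updated without changing behaviour.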

I'll need some help (patch isn't mergeable yet):

test_doctest fails on my machine with the patch (and -W), and I don't know how to fix it. test_ast fails an assertion (!PyErr_Occurred() in PyObject_Call in abstract.c) when -W is on, and I also don't know how to fix it (I don't even know what causes it).

Of course, I went ahead and fixed all instances of invalid escape sequences in the stdlib (that I could find) so that no DeprecationWarning is encountered.

Lastly, I thought about also doing this for bytes, but I ran into issues with some invalid escapes such as \u, and _codecs.escape_decode would trigger the warning when passed br"\8" (for example). Ultimately, I decided to leave bytes alone for now, since they're mostly on the lower-level side of things. If there's interest, I can add that back.

----------
components: Interpreter Core, Library (Lib), Unicode
files: deprecate_invalid_unicode_escapes.patch
keywords: patch
messages: 269022
nosy: ebarry, ezio.melotti, haypo
priority: normal
severity: normal
stage: patch review
status: open
title: Deprecate invalid unicode escape sequences
type: behavior
versions: Python 3.6
Added file: http://bugs.python.org/file43499/deprecate_invalid_unicode_escapes.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27364>
_______________________________________

