[issue17777] Unrecognized string literal escape sequences give SyntaxErrors

New submission from Reynir Reynisson: Strings like "\u" trigger a SyntaxError. According to the language reference, "all unrecognized escape sequences are left in the string unchanged"[0]. The string "\u" clearly doesn't match any of the escape sequences (in particular \uxxxx). This may be intentional, but it is not clear from the language reference that this is the case. If it is intentional, it should probably be stated more explicitly in the language reference. I think this may be confusing for new users, since the syntax error may lead them to believe the interpreter raises a syntax error for all unrecognized escape sequences.

[0]: http://docs.python.org/3/reference/lexical_analysis.html#literals

----------
assignee: docs@python
components: Documentation, Unicode
messages: 187173
nosy: docs@python, ezio.melotti, reynir
priority: normal
severity: normal
status: open
title: Unrecognized string literal escape sequences give SyntaxErrors
type: behavior
versions: Python 3.3
_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue17777>
_______________________________________
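The difference described in the report can be reproduced with a short sketch. The helper name try_literal is hypothetical, and the sketch assumes a Python 3 interpreter in which an unrecognized escape such as "\q" is still accepted (recent versions emit a warning for it, which is silenced here); the exact error text may vary between versions.

```python
import warnings


def try_literal(source):
    """Compile and evaluate the source text of a string literal.

    Returns the resulting string, or the SyntaxError if compilation fails.
    (Hypothetical helper, used only for this illustration.)
    """
    try:
        with warnings.catch_warnings():
            # Newer Python versions warn about unrecognized escapes like "\q".
            warnings.simplefilter("ignore")
            return eval(compile(source, "<literal>", "eval"))
    except SyntaxError as exc:
        return exc


# "\q" is not an escape sequence at all: the backslash is left in the string.
unknown = try_literal(r'"\q"')

# "\u" starts a \uxxxx escape but is missing its hex digits: compilation fails.
truncated = try_literal(r'"\u"')

print(repr(unknown))
print(type(truncated).__name__, truncated.msg)
```

The first literal evaluates to a two-character string, while the second never produces a string at all.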

R. David Murray added the comment: It is a recognized escape sequence, but the syntax of the escape sequence is wrong, thus the syntax error. An "escape sequence" is a backslash character followed by a letter. Perhaps that is the bit that needs to be clarified in the docs?

----------
nosy: +r.david.murray

Reynir Reynisson added the comment: Thank you for the quick reply. Yes, something along those lines would help. Maybe adding "The escape sequence \x expects exactly two hex digits" would make it even clearer.
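The proposed wording for \x can be checked with a small sketch, assuming Python 3 string literals. The helper literal_value is a hypothetical name introduced only for this illustration.

```python
def literal_value(source):
    """Evaluate the source text of a string literal.

    Returns its value, or the SyntaxError raised while compiling it.
    (Hypothetical helper, used only for this illustration.)
    """
    try:
        return eval(compile(source, "<literal>", "eval"))
    except SyntaxError as exc:
        return exc


# Exactly two hex digits: a complete \xhh escape.
two_digits = literal_value(r'"\x41"')

# Three hex digits: the escape consumes the first two, the third is literal text.
three_digits = literal_value(r'"\x414"')

# One hex digit: the escape is recognized but incomplete, so compilation fails.
one_digit = literal_value(r'"\x4"')

print(repr(two_digits), repr(three_digits), type(one_digit).__name__)
```

The one-digit case fails in the same way as "\u" in the original report: the escape is recognized, but its form is incomplete.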

Changes by Ezio Melotti <ezio.melotti@gmail.com>:

----------
keywords: +easy
stage: -> needs patch
type: behavior -> enhancement
versions: +Python 2.7, Python 3.4

Mark Egan-Fuller added the comment: Python correctly throws a unicode error here, directing the user towards the fact that this is an issue specifically with the unicode escaping.
The documentation also states that "Any Unicode character can be encoded this way. Exactly eight hex digits are required."[0]. Propose closing this as Won't Fix.

[0]: http://docs.python.org/3/reference/lexical_analysis.html#literals

----------
nosy: +markeganfuller, tim.golden
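The "unicode error" mentioned above appears to originate in the decoding of escape sequences. As a rough illustration (not necessarily the exact code path the compiler takes), the standard unicode_escape codec reports the same kind of problem when given a truncated \u escape. The sketch also shows that \u takes four hex digits while \U takes eight; the variable names are chosen only for this example.

```python
# Complete escapes: \u takes four hex digits, \U takes eight.
four_digits = b"\\u00e9".decode("unicode_escape")
eight_digits = b"\\U000000e9".decode("unicode_escape")

# A truncated \u escape is rejected by the codec with a UnicodeDecodeError.
try:
    b"\\u".decode("unicode_escape")
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason

print(repr(four_digits), repr(eight_digits))
print(reason)
```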

Tim Golden added the comment: Closing as "Works for me" in the absence of any clear proposal for docs improvement.

----------
resolution: -> works for me
stage: needs patch -> committed/rejected
status: open -> closed

participants (5)
- Ezio Melotti
- Mark Egan-Fuller
- R. David Murray
- Reynir Reynisson
- Tim Golden