On 8/9/2019 9:08 AM, Nick Coghlan wrote:
On Sat, 10 Aug 2019 at 01:44, Guido van Rossum <guido@python.org> wrote:
This discussion looks like there's no end in sight. Maybe the Steering Council should take a vote?

I find the "Our deprecation warnings were even less visible than normal" argument for extending the deprecation period compelling.
I also think the UX of the warning itself could be reviewed to provide a more explicit nudge towards using raw strings when folks want to allow arbitrary embedded backslashes. Consider:
SyntaxWarning: invalid escape sequence \,
vs something like:
SyntaxWarning: invalid escape sequence \, (Note: adding the raw string literal prefix, r, will accept all non-trailing backslashes)
After all, the habit we're trying to encourage is "If I want to include backslashes without escaping them all, I should use a raw string", not "I should memorize the entire set of valid escape sequences" or even "I should always escape backslashes".
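To make the comparison concrete, here is a minimal sketch that compiles the same literal with and without the raw prefix and records what the compiler reports. The helper name `escape_warnings` and the `<example>` filename are illustrative only. The warning category depends on the interpreter version (DeprecationWarning in some releases, SyntaxWarning in others), so the sketch prints whichever one it gets rather than assuming one.

```python
import warnings

def escape_warnings(source):
    """Compile `source` and return the warnings the compiler emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        compile(source, "<example>", "exec")
    return caught

# Source text for: s = "a\,b"   (plain literal with an unrecognized escape)
plain_source = 's = "a\\,b"'
# Source text for: s = r"a\,b"  (the same literal with the raw prefix)
raw_source = 's = r"a\\,b"'

plain_caught = escape_warnings(plain_source)
raw_caught = escape_warnings(raw_source)

for w in plain_caught:
    print(w.category.__name__, w.message)
print("warnings for the raw literal:", len(raw_caught))
```

On the interpreters that only warn about this, the plain literal should report an "invalid escape sequence" warning and the raw literal should report none.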
Cheers, Nick.
The reason I never use raw strings is in the documentation: \ still has a special meaning in them. The first several times I felt the need for raw strings, it was for directory names that needed to end with \ and couldn't. The relevant passages are quoted below. Also relevant to the discussion is the "benefit" of leaving the backslash in the result of an illegal escape, which no one has mentioned in this huge thread.
Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., /the backslash is left in the result/. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals.
Changed in version 3.6: Unrecognized escape sequences produce a DeprecationWarning. In some future version of Python they will be a SyntaxError.
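A small sketch of the behavior quoted above, assuming an interpreter that still only warns about unrecognized escapes. The literals are passed as source text to `eval` so that the warning can be silenced locally; `literal_value` is a hypothetical helper name, not anything from the documentation.

```python
import warnings

def literal_value(source):
    """Evaluate a literal given as source text, silencing escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(source)

# Source text "\d": unrecognized escape, so the backslash stays in the result.
kept = literal_value('"\\d"')
# Source text "\n": recognized escape, so it collapses to one newline character.
newline = literal_value('"\\n"')
# Source text b"\u0041": \u is only a str escape, so bytes keeps all six bytes.
bytes_value = literal_value('b"\\u0041"')
# Source text "\u0041": in a str literal the same escape becomes "A".
str_value = literal_value('"\\u0041"')

print(len(kept), len(newline), len(bytes_value), str_value)  # prints: 2 1 6 A
```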
Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, |r"\""| is a valid string literal consisting of two characters: a backslash and a double quote; |r"\"| is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, /a raw literal cannot end in a single backslash/ (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, /not/ as a line continuation.
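The raw-literal rules quoted above can be checked with a short sketch. The `compiles` helper and the `C:\Users\demo` path are made up for illustration, and the final workaround (appending an escaped backslash through adjacent-literal concatenation) is one common idiom, not something the documentation excerpt prescribes.

```python
def compiles(source):
    """Return True if `source` is a syntactically valid expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# r"\"" is valid: the backslash protects the quote but stays in the result.
escaped_quote = r"\""
print(len(escaped_quote))  # prints: 2

# Source text r"\" : the lone backslash escapes the closing quote.
lone_backslash_ok = compiles('r"\\"')
# Source text r"\\" : an even number of trailing backslashes is accepted.
double_backslash_ok = compiles('r"\\\\"')
print(lone_backslash_ok, double_backslash_ok)  # prints: False True

# Backslash-newline inside a raw literal keeps both characters.
raw_continued = r"a\
b"
# The same text in a plain literal is a line continuation and vanishes.
plain_continued = "a\
b"
print(len(raw_continued), len(plain_continued))  # prints: 4 2

# One way to get a directory name ending in a backslash: adjacent literals
# are concatenated, so a plain "\\" can supply the trailing character.
directory = r"C:\Users\demo" "\\"
print(directory)  # prints: C:\Users\demo\
```

This is the directory-name case described earlier in the message: the raw literal alone cannot end in the backslash, so the last character has to come from somewhere else.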