[Python-ideas] Make non-meaningful backslashes illegal in string literals

MRAB python at mrabarnett.plus.com
Fri Aug 7 18:43:45 CEST 2015


On 2015-08-07 06:12, Steven D'Aprano wrote:
> On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote:
>> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote:
>> > Because strings containing \{ are currently valid
>>
>> Which raises the question of why.
>
> Because \C is currently valid, for all values of C. The idea is that if
> you typo an escape, say \d for \f, you get an obvious backslash in your
> string which is easy to spot.
>
> Personally, I think that's a mistake. It leads to errors like this:
>
> filename = 'C:\some\path\something.txt'
>
> silently doing the wrong thing. If we're going to change the way escapes
> work, it's time to deprecate the misfeature that \C is a literal
> backslash followed by C. Outside of raw strings, a backslash should
> *only* be allowed in an escape sequence.
>
> Deprecating invalid escape sequences would then open the door to adding
> new, useful escapes.
>
>
>> (and as long as we're talking about
>> things to deprecate in string literals, how about \v?)
>
> Why would you want to deprecate a useful and long-standing escape
> sequence? Admittedly \v isn't as common as \t or \n, but it still has
> its uses, and is a standard escape familiar to anyone who uses C, C++,
> C#, Octave, Haskell, JavaScript, etc.
>
> If we're going to make major changes to the way escapes work, I'd rather
> add new escapes, not take them away:
>
>
> \e escape \x1B, as supported by gcc and clang;
>
> the escaping rules from Haskell:
>
> http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html
>
> \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX)
>
> \U+xxxx Unicode code point U+xxxx (with four to six hex digits)
>
> It's much nicer to be able to write Unicode code points that (apart from
> the backslash) look like the standard Unicode notation U+0000 to
> U+10FFFF, rather than needing to pad to a full eight digits as the
> \U00xxxxxx syntax requires.
>
Some other languages, such as Perl, have \x{...}, so that would be
\x{10FFFF}.
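
For illustration only, here is a rough sketch of how a braced \x{...}
escape could be expanded. Python itself has no such escape; the helper
name expand_braced_hex, the one-to-six hex digit limit, and the choice
to work on raw strings are all my own assumptions, not a proposed
implementation. It also does not try to handle a doubled backslash in
front of the x.

```python
import re

# Hypothetical pattern: a backslash, "x", then 1-6 hex digits in braces.
_BRACED_HEX = re.compile(r"\\x\{([0-9A-Fa-f]{1,6})\}")


def expand_braced_hex(text):
    """Replace each braced hex escape in text with the code point it names."""
    def _replace(match):
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            # Six hex digits can exceed the Unicode range, so check it.
            raise ValueError("code point out of range: " + match.group(0))
        return chr(code_point)

    return _BRACED_HEX.sub(_replace, text)


# Raw strings keep the backslash, so the helper sees the escape text.
print(expand_braced_hex(r"\x{10FFFF}") == "\U0010FFFF")  # True
print(expand_braced_hex(r"caf\x{E9}") == "caf\u00e9")    # True
```

The point is just that the braces make the length explicit, so no
padding to eight digits is needed as with \U0010FFFF today.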


