On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832@fastmail.us wrote:
> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote:
> > Because strings containing \{ are currently valid
> 
> Which raises the question of why.
Because \C is currently valid, for all values of C. The idea is that if 
you typo an escape, say \d for \f, you get an obvious backslash in your 
string which is easy to spot.

Personally, I think that's a mistake. It leads to errors like this:

    filename = 'C:\some\path\something.txt'

silently doing the wrong thing. If we're going to change the way escapes 
work, it's time to deprecate the misfeature that \C is a literal 
backslash followed by C. Outside of raw strings, a backslash should 
*only* be allowed in an escape sequence.

Deprecating invalid escape sequences would then open the door to adding 
new, useful escapes.
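To make the trap concrete, here is a minimal sketch (the file names are 
invented for illustration; the behaviour shown is how Python currently 
handles unrecognised escapes):

    # \s and \p are not recognised escapes, so the backslashes
    # survive and this path happens to work:
    ok = 'C:\some\path\something.txt'
    print(ok)      # C:\some\path\something.txt

    # ...but \t and \n *are* recognised, so a near-identical path
    # is mangled with no warning and no error:
    bad = 'C:\temp\new\something.txt'
    print(bad)     # 'C:' + TAB + 'emp' + NEWLINE + 'ew\something.txt'

    # Raw strings side-step the problem entirely:
    raw = r'C:\temp\new\something.txt'
    print(raw)     # C:\temp\new\something.txt

The second string is corrupted when the literal is compiled, so no 
amount of runtime checking will catch it.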
> (and as long as we're talking about things to deprecate in string 
> literals, how about \v?)
Why would you want to deprecate a useful and long-standing escape 
sequence? Admittedly \v isn't as common as \t or \n, but it still has 
its uses, and is a standard escape familiar to anyone who uses C, C++, 
C#, Octave, Haskell, JavaScript, etc.

If we're going to make major changes to the way escapes work, I'd 
rather add new escapes, not take them away:

\e        escape \x1B, as supported by gcc and clang;

the escaping rules from Haskell:
http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules....

\P        platform-specific newline (e.g. \r\n on Windows, \n on POSIX)

\U+xxxx   Unicode code point U+xxxx (with four to six hex digits)

It's much nicer to be able to write Unicode code points that (apart 
from the backslash) look like the standard Unicode notation U+0000 to 
U+10FFFF, rather than needing to pad to a full eight digits as the 
\U00xxxxxx syntax requires.
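To sketch what these could look like in practice: \e and \P already 
have spellings today (\x1b and os.linesep), and the expand_uplus helper 
below is purely hypothetical, standing in for compiler support that 
doesn't exist:

    import os
    import re

    ESC = '\x1b'              # the character a proposed \e would denote
    print(repr(os.linesep))   # the newline a proposed \P would denote:
                              # '\r\n' on Windows, '\n' on POSIX

    # Today, \U demands exactly eight hex digits:
    print('\U0001F600')       # U+1F600, zero-padded to eight digits

    # A purely hypothetical preprocessor for the proposed \U+xxxx
    # spelling (four to six hex digits, like standard Unicode notation):
    def expand_uplus(s):
        return re.sub(r'\\U\+([0-9A-Fa-f]{4,6})',
                      lambda m: chr(int(m.group(1), 16)),
                      s)

    print(expand_uplus(r'\U+1F600'))   # same character, five digits

-- 
Steve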