[Python-ideas] Make undefined escape sequences have SyntaxWarnings

Greg Ewing greg.ewing at canterbury.ac.nz
Thu Oct 11 07:34:32 CEST 2012


Steven D'Aprano wrote:
> If you escape a character, you should get
> something. If it's a special character, you get the special meaning.
> If it's not, escaping should be transparent: escaping something that
> doesn't need escaping is a null op

I think that calling "\n", "\t" etc. "escape sequences" is a misnomer
that is causing confusion in this discussion.

The term "escape" in this context means to prevent something from
having a special meaning that it would otherwise have. But the
backslash in these is being used to *give* a special meaning to
the following character.

In Python string literals, the only true escape sequences associated
with the backslash are '\\', "\'" and '\"'.
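A minimal sketch of that distinction, with nothing assumed beyond standard Python string literals (the dictionary names are only for illustration):

```python
# True escapes: the backslash removes the special meaning of the next
# character, so the result is that character itself.
true_escapes = {
    "backslash": '\\',
    "single quote": '\'',
    "double quote": '\"',
}

# Prefix use: the backslash gives the next character a meaning it
# would not otherwise have.
prefixed = {
    "newline": '\n',   # not the letter n
    "tab": '\t',       # not the letter t
}

# Every value is a single character; only the first group yields the
# character that was written after the backslash.
for name, value in {**true_escapes, **prefixed}.items():
    print(f"{name}: length {len(value)}, code point {ord(value)}")
```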

So the backslash is a bit schizophrenic -- sometimes it's an escape
character, sometimes it's a prefix that imparts a special meaning.

This means that "\c" where c is not special in any way is somewhat
ambiguous. Are you redundantly escaping something that doesn't
need it, are you asking for a special meaning that doesn't exist
(which is probably a mistake), or do you just want a literal
backslash?

Python guesses that you want a literal backslash. This seems to be
motivated by the desire to minimise the need for backslash doubling.
That sounds fine in theory, but I don't think it helps much in
practice. I for one don't trust myself to keep the entire set of
special characters in my head, including all the rarely-used ones,
so I end up doubling every backslash anyway.
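To illustrate the guess, here is a rough sketch that evaluates string literals from source text. The helper evaluate_literal and the constant BACKSLASH are hypothetical names made up for this example. The source text is assembled with chr(92) so that the example itself contains no unrecognized escape, and warnings are silenced because some Python versions warn about such literals.

```python
import ast
import warnings

BACKSLASH = chr(92)  # hypothetical helper constant: one backslash character


def evaluate_literal(body):
    """Evaluate a double-quoted string literal whose source text is `body`."""
    source = '"' + body + '"'
    with warnings.catch_warnings():
        # Some Python versions warn about unrecognized escapes; ignore
        # that here so only the resulting value is shown.
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


unknown = evaluate_literal(BACKSLASH + "c")      # source text: "\c"
doubled = evaluate_literal(BACKSLASH * 2 + "c")  # source text: "\\c"
known = evaluate_literal(BACKSLASH + "n")        # source text: "\n"

# The unrecognized sequence keeps its backslash, so it evaluates to the
# same two-character string as the doubled form.
print(unknown == doubled, len(unknown), len(known))
```

The unrecognized sequence and the doubled backslash give the same value, which is why doubling every backslash is always safe.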

Given that, I wouldn't have minded at all if Python had refused
to guess in this case, and raised a compile-time error. That would
have left the way open for extending the set of special chars in
the future.
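For comparison, a sketch of what refusing to guess could look like. It assumes a CPython recent enough to emit a warning for unrecognized escapes at compile time (later releases do; interpreters from the time of this thread do not), and it escalates that warning to an error. The function name rejected_when_strict is made up for this example.

```python
import warnings

BACKSLASH = chr(92)  # hypothetical helper constant: one backslash character


def rejected_when_strict(body):
    """Return True if compiling the literal "<body>" fails once warnings are errors."""
    source = '"' + body + '"'
    with warnings.catch_warnings():
        # Assumption: the interpreter warns about unrecognized escapes.
        # Turning warnings into errors makes the compile step fail.
        warnings.simplefilter("error")
        try:
            compile(source, "<sketch>", "eval")
        except (SyntaxError, Warning):
            return True
    return False


print(rejected_when_strict(BACKSLASH + "c"))  # unrecognized sequence
print(rejected_when_strict(BACKSLASH + "n"))  # recognized sequence
```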

> Adding a new escape sequence is almost as big a step as adding a new
> built-in or new syntax. I see that as a good thing, it discourages too
> many requests for new escape sequences.

I don't see that it makes much difference. We get plenty of requests for
new syntax of all kinds, and we seem to have enough sense to reject
them unless they're backed by extremely good arguments. There's no
reason requests for new special chars should be treated any differently.

-- 
Greg


