[Python-ideas] Make undefined escape sequences have SyntaxWarnings

Steven D'Aprano steve at pearwood.info
Thu Oct 11 04:58:33 CEST 2012


On 11/10/12 13:24, Mike Graham wrote:
> On Wed, Oct 10, 2012 at 8:08 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> On 11/10/12 07:08, Mike Graham wrote:
>>>
>>> On Wed, Oct 10, 2012 at 3:46 PM, Antoine Pitrou <solipsis at pitrou.net>
>>> wrote:
>>>>
>>>> On Wed, 10 Oct 2012 15:36:08 -0400
>>>> Mike Graham <mikegraham at gmail.com> wrote:
>>
>>
>>>>> The literal "\c" should be an error
>>
>>
>> Who says so? My bash shell disagrees with you:
>
> Frankly, I don't look to bash for sensible language design advice.

Pity, because in this case I think bash is actually more sensible than
either Python or Java. If you escape a character, you should get
something. If it's a special character, you get the special meaning.
If it's not, escaping should be transparent; escaping something that
doesn't need escaping is a no-op:

py> from urllib import quote_plus
py> quote_plus('abc')
'abc'
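
For readers on Python 3, where quote_plus lives in urllib.parse rather
than urllib, the session above can be sketched roughly as follows. It
only illustrates the analogy (escaping is transparent for characters that
need no escaping); the second input string is an assumption added for
contrast and is not part of the original example:

```python
from urllib.parse import quote_plus

# Characters that need no escaping pass through unchanged.
plain = quote_plus('abc')

# Characters with a special meaning in a URL do get escaped
# (illustrative input, not from the original session).
special = quote_plus('a b&c')

print(plain)
print(special)
```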


If we were designing Python from scratch, I'd prefer '\D' -> 'D'. But
we're not, so I'm happy with the current behaviour, and don't agree that
it should be an error or that it needs warning about.


> I think concepts like "In the face of ambiguity, refuse the temptation
> to guess" guide how we should see the decision here.

Where is the ambiguity? Is there ever a context where \D could mean
two different things and it isn't clear which one?

"In the face of ambiguity..." does not mean "refuse to decide on
language behaviour". Everything is ambiguous until you decide what
something will mean. It's only when you have two possible meanings
and no clear, obvious way to determine which one applies that the
ambiguity koan applies.


> "Backslash is
> for escape sequences except when it's not" seemed like an
> obviously-misfortunate thing to me.

No. In cooked strings, backslash-C is always an escape sequence, for
any character (or hex/oct code) C. But some escape sequences resolve
to a single char (\n -> newline) and some resolve to a pair of chars
(\D -> backslash D). In Haskell, \& resolves to the empty string.
It's still an escape sequence.
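
A minimal sketch of the one-char versus two-char distinction, assuming a
Python 3 interpreter that still accepts unrecognised escapes. Newer
releases emit a warning for them, and a future release may reject them
outright, so the hypothetical helper cook() evaluates the literal's source
text with warnings silenced:

```python
import warnings


def cook(literal_source):
    """Evaluate the source text of a string literal (hypothetical helper).

    Warnings are silenced because newer Pythons warn about
    unrecognised escape sequences in cooked strings.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(literal_source)


# A recognised escape resolves to a single character.
newline = cook(r"'\n'")

# An unrecognised escape resolves to two characters: backslash, then D.
unknown = cook(r"'\D'")

print(len(newline))
print(len(unknown), unknown == "\\D")
```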



[...]
> I think four string escapes have been added since versions of Python I
> was aware of. Writing code like "ab\c" seems seedy in light of that

Adding a new escape sequence is almost as big a step as adding a new
built-in or new syntax. I see that as a good thing: it discourages too
many requests for new escape sequences.


-- 
Steven


