[Python-ideas] PEP 8: raw strings & regular expressions

Andrew Barnert abarnert at yahoo.com
Thu Oct 22 06:00:20 EDT 2015


On Oct 22, 2015, at 00:44, Terry Reedy <tjreedy at udel.edu> wrote:
> 
> On 10/21/2015 10:24 PM, Yury Selivanov wrote:
> 
>> I think that it might be a good idea to state the following in PEP 8:
>> 
>> - use r'...' strings for raw strings that describe regular expressions;
>> these strings might be highlighted specially in some editors.
>> 
>> - use R'...' strings for raw strings; editors *should not* highlight
>> any escaped characters in them.
>> 
>> What do you think?
> 
> I think it a bad idea.  For beginners on Windows, r'windows\path\file.py' might be more common than r're'.  

It's also worth noting that an awful lot of code that uses Windows pathnames is either beginner code, local scripts, or closed-source commercial code, none of which shows up in a typical code search. So such a search is probably going to vastly underrepresent how often raw strings hold Windows pathnames.

(Of course that same fact means it may be perfectly reasonable for GitHub to assume raw strings are regexps rather than Windows pathnames, even if it isn't reasonable for Python itself, or general-purpose tools like IDLE…)
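To make the ambiguity concrete, here is a minimal sketch. The path is the one from Terry's example; the regex and the helper compiles_as_regex are made up for illustration. Both literals use the same r prefix and both are plain str objects, so nothing in the language tells a tool which one is meant to be a regular expression.

```python
import re

# Same r'' prefix, two very different intents.
path = r'windows\path\file.py'   # a Windows pathname (from the quoted example)
pattern = r'\d+\.py'             # a regular expression (hypothetical)

# A raw literal is only a spelling; the resulting objects are ordinary strings.
assert type(path) is str and type(pattern) is str
assert path == 'windows\\path\\file.py'


def compiles_as_regex(text):
    """Hypothetical helper: report whether re accepts text as a pattern."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False


# The pathname is not even a valid pattern ('\p' is an unknown escape),
# so highlighting it as a regex would be misleading.
print(compiles_as_regex(pattern), compiles_as_regex(path))
```

A tool that wants to highlight regexps therefore has to guess from context, such as whether the string is passed to re.compile.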

> I have never seen R used.
> 
> If you wanted to promote the use of the currently rare R for REs, and have editors specially mark raw literals with this special prefix, I would not mind.

That doesn't sound as bad.

But I still don't like it. Where else does Python provide two equivalent ways to do something, specifically to support external semantic connotations? It's like having <> and != both mean the same thing to support people coming up with some language-external difference between the spellings.

(Yes, I realize there are a few cases like this—e.g., someone could use the fact that int and 'int' annotate the same type to give them different connotations—but those are accidental effects of some other language feature; PEP 8 certainly isn't going to suggest using 'int' to mean one thing and int another.)
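A small sketch of the two equivalences mentioned here, assuming a current Python 3. The function f is hypothetical and exists only to carry the annotations. The r and R prefixes yield identical strings, and int and 'int' resolve to the same type, so any difference in meaning would live entirely outside the language.

```python
import typing

# The prefix case is not significant: both literals are the same string.
lower = r'\d+\s*'
upper = R'\d+\s*'
assert lower == upper


# Hypothetical function: one parameter annotated with the type itself,
# the other with a string naming it.
def f(a: int, b: 'int') -> None:
    pass


# get_type_hints() evaluates string annotations, so both resolve to int.
hints = typing.get_type_hints(f)
assert hints['a'] is int and hints['b'] is int
```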

If we really want there to be a difference, we should have a regex literal syntax—maybe an x or s prefix or something—in place of re.compile(r'…'). 

