[Python-ideas] PEP 8: raw strings & regular expressions

Yury Selivanov yselivanov.ml at gmail.com
Wed Oct 21 22:53:05 EDT 2015


On 2015-10-21 10:44 PM, Ben Finney wrote:
>> In the process, we had to make a decision on how to highlight raw
>> string literals -- r''. Many existing highlighters assume that all raw
>> strings are regexps, and highlight them as such, i.e. '\s' and '\n'
>> will be highlighted.
> That is evidently a simple mistake. Merely knowing that a token is a raw
> string does not justify the assumption that the string is a regular
> expression, or a filesystem entry name, or a line in a network protocol,
> or anything except plain text.

I agree 100%.

But: GitHub, GitLab, Atom, Sublime Text, and many other tools
assume that raw strings (with lowercase r) are regexps.  If you
don't highlight them as such, people think that it's a bug.
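
To make the ambiguity concrete, here is a minimal sketch (the variable
names and values are made up for illustration).  Both literals are
plain str objects; nothing in the literal itself says which one is a
pattern:

```python
import re

# Hypothetical values, chosen only to illustrate the point.
windows_path = r'C:\src\new'   # a filesystem path that happens to contain \s and \n
whitespace_re = r'\s+\n'       # actually intended as a regular expression

# Raw strings keep backslashes literally; both values are ordinary str objects.
assert windows_path == 'C:\\src\\new'
assert type(windows_path) is type(whitespace_re) is str

# Only the call site decides that a string is used as a pattern.
parts = re.split(whitespace_re, 'a  \nb')
print(parts)
```

A highlighter that treats every r'' literal as a regexp would color
the '\s' and '\n' in the path exactly as it does in the pattern.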

Since I wanted MagicPython to be a drop-in replacement for
standard highlighters, I simply *could not* change this
behavior.  It's already a standard of sorts, whether we
like it or not.

> Perhaps some more explicit context could be used to signal what the
> intent of a raw string is, but you'd need to find a strong consensus
> that programmers actually intend that. “It's a raw string” doesn't
> justify any of those assumptions.
>

If we want to design some special marker for highlighters to
hint what language is in the string, I'd strongly suggest that
it should be before the string literal.  For instance, it
*won't* be possible for most highlighters to detect this:

    my_re = '''...
    ...''' # regex
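
A rough sketch of why the trailing form is awkward, using the stdlib
tokenize module as a stand-in for a highlighter.  This is only an
analogy: editor grammars typically work line by line rather than with
a full tokenizer, and the sample source below is made up.  The point
is that the comment is only seen after the whole string has already
been consumed:

```python
import io
import tokenize

# Hypothetical sample: a multi-line string followed by a trailing "# regex" comment.
SOURCE = "my_re = '''abc\ndef''' # regex\n"

def token_order(source):
    """Return (token name, start line, end line) for STRING and COMMENT tokens, in order."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.STRING, tokenize.COMMENT):
            result.append((tokenize.tok_name[tok.type], tok.start[0], tok.end[0]))
    return result

order = token_order(SOURCE)
print(order)
```

The STRING token starts on line 1 and ends on line 2, and the COMMENT
token only shows up after it.  A marker placed before the literal
would be available before the string body has to be colored.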

Yury

