[Python-Dev] Raw string syntax inconsistency

"Martin v. Löwis" martin at v.loewis.de
Mon Jun 18 08:06:34 CEST 2012


> But the whole point of the reintroduction of u"..." is to support code
> that isn't run through 2to3. Frankly, I don't care how it's done, but
> I'd say it's important not to silently have different behavior for the
> same notation in the two versions. If that means we have to add an extra
> step to the compiler to reject r"\u03b3", so be it.

It's actually ur"\u03b3" that will be rejected, and that falls out
easily because the parser simply won't accept the ur prefix. The 2.x
r"\u03b3" denotes a 6-character (byte) string, and it continues to be
understood as a 6-character string (now Unicode) in 3.3.
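
A minimal sketch of the distinction above, assuming Python 3.3 or later.
The helper name parses() is hypothetical and only exists for this
illustration; the ur-prefixed literal is passed through compile() as a
string so that the example file itself remains valid syntax.

```python
# Sketch for Python 3.3+; parses() is a hypothetical helper for illustration.
def parses(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# A raw string keeps the backslash, 'u', and four hex digits
# as six separate characters.
raw = r"\u03b3"
print(len(raw))     # 6

# Without the raw prefix, the same escape denotes one character.
cooked = u"\u03b3"
print(len(cooked))  # 1

# The combined ur prefix is not accepted by the Python 3 parser,
# while the plain u prefix is.
print(parses('ur"\\u03b3"'))  # False
print(parses('u"\\u03b3"'))   # True
```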

> Hm. I still encounter enough environments that don't know how to display
> such characters that I would prefer to have a rock solid \u escape
> mechanism.

If you want to use \u escapes under the revised PEP 414, you will have
to avoid making the strings raw, and just use a plain u prefix. In other
words, you need to double all backslashes that you want to stand on
their own, and then use \u escapes to denote non-typable characters.
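
As a rough illustration of that advice, the following sketch uses a
made-up Windows-style path (the value is an assumption, not something
from this thread). The literal backslashes are doubled and the Greek
letter is written as a \u escape, so the same source gives the same
text in 2.x and in 3.3.

```python
# Hypothetical example value: a Windows-style path ending in a Greek gamma.
# Each literal backslash is doubled; \u03b3 is interpreted as an escape
# because the string is not raw.
path = u"C:\\temp\\\u03b3"

print(len(path))          # 9: C : \ t e m p \ and one gamma
print(path.count("\\"))   # 2 literal backslashes
```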


> Yeah, but if you do this and it breaks you likely won't notice until way
> late in your QA cycle, when it may be tough to track down the origin.
> I'd rather make ru"\u03b3" a syntax error if we can't give it the same
> meaning as in Python 2.

That's exactly the proposal; see

http://bugs.python.org/issue15096
http://bugs.python.org/file26036/issue15096-1.patch

Regards,
Martin

