Mailman 3 Raw strings ending with a backslash - Python-Dev


Raw strings ending with a backslash


Steven D'Aprano

May 28, 2022

9:22 a.m.

Now that we have a new parser for CPython, can we fix the old gotcha that raw strings cannot end in a backslash?

It's an FAQ and has come up again on the bug tracker.

https://docs.python.org/3/faq/design.html#id26

https://github.com/python/cpython/issues/93314

-- Steve
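For readers who have not run into the gotcha, here is a minimal sketch of the behaviour being asked about. The helper `parses` and the sample paths are illustrative assumptions, not taken from the thread; the two workarounds at the end are the ones suggested in the design FAQ linked above.

```python
def parses(source: str) -> bool:
    """Return True if `source` compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The candidate source text is held in ordinary (non-raw) strings, so each
# doubled backslash below stands for one backslash in the code being tested.
odd_source = "r'C:\\'"      # the text  r'C:\'   (one trailing backslash)
even_source = "r'C:\\\\'"   # the text  r'C:\\'  (two trailing backslashes)

odd_ok = parses(odd_source)     # backslash-quote does not close the literal
even_ok = parses(even_source)   # an even run of backslashes leaves the quote free

# Two workarounds from the design FAQ:
joined = r"C:\dir" "\\"         # adjacent string literals are concatenated
sliced = r"C:\dir\ "[:-1]       # add a trailing space, then slice it off

print(odd_ok, even_ok, joined == sliced)
```

Note that the raw string with two trailing backslashes keeps both of them, so doubling the backslash is not a way to get a single one.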


Serhiy Storchaka

May 2022

11:13 a.m.

On 28.05.22 12:22, Steven D'Aprano wrote:
> Now that we have a new parser for CPython, can we fix the old gotcha that raw strings cannot end in a backslash?
>
> It's an FAQ and has come up again on the bug tracker.
>
> https://docs.python.org/3/faq/design.html#id26
>
> https://github.com/python/cpython/issues/93314

I do not think that we can allow this, and it is not related to the parser.

A few years ago I experimented with such a change: https://github.com/python/cpython/pull/15217

You can see that it breaks even some stdlib code, and it will definitely break many third-party packages and examples. Technically we can do this, but the benefit is small in comparison with the cost.
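To make the compatibility concern concrete, here is a small sketch of the kind of existing code that depends on the current rule. The regex below is a hypothetical example, not one taken from the linked PR or from the stdlib. Today the backslash keeps the following quote from ending the literal (and stays in the value); if the backslash stopped protecting the quote, the literal would end right after it and the rest of the line would be a syntax error.

```python
import re

# A raw-string regex that relies on backslash-quote NOT ending the literal.
# The value keeps both characters, so it is the five characters  [ \ ' " ]
quote_class = r'[\'"]'

# The character class matches either kind of quote character.
found = re.findall(quote_class, 'it\'s "ok"')

print(len(quote_class))
print(found)
```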

Damian Shaw

11:57 a.m.

That PR seems to make \' and \" not special in general, right?

I think this is a more limited proposal: only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".

In which case there should be no backwards compatibility issue.

Damian

_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WWWRFQK4... Code of Conduct: http://python.org/psf/codeofconduct/

Barney Gale

12:08 p.m.

Personally I'd expect these two lines to do the same thing, whatever that thing is:

    path = 'C:\'
    path = ('C:\')

Barney

Serhiy Storchaka

12:17 p.m.

On 28.05.22 14:57, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general, right?
>
> I think this is a more limited proposal: only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".
>
> In which case there should be no backwards compatibility issue.

How do you know that it is at the end of a string?

MRAB

3:03 p.m.

On 2022-05-28 13:17, Serhiy Storchaka wrote:
> How do you know that it is at the end of a string?

It would also affect triple-quoted strings.

Here's an idea: prefix rr ("really raw") that would treat all backslashes literally.

MRAB

7:01 p.m.

New subject: [OT] Re: Raw strings ending with a backslash

On 2022-05-28 16:03, MRAB wrote:
> Here's an idea: prefix rr ("really raw") that would treat all backslashes literally.

Here's something I've just realised.

Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/. Why?

Chris Angelico

7:17 p.m.

New subject: [OT] Re: Raw strings ending with a backslash

On Sun, 29 May 2022 at 05:05, MRAB <python@mrabarnett.plus.com> wrote:
> Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.
>
> Why?

Technically they're not, but there are aliases. Kinda like threading.currentThread().

ChrisA

Guido van Rossum

7:49 p.m.

New subject: [OT] Re: Raw strings ending with a backslash

On Sat, May 28, 2022 at 12:11 MRAB wrote:
> Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.
>
> Why?

IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it’s too late to change.

--Guido

--
--Guido (mobile)
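The case-insensitivity under discussion can be checked directly. This is a short sketch with arbitrary sample values, not code from the thread. One caveat: the `0l`/`0L` suffix Guido mentions comes from C; current Python 3 has no integer suffix, so only the prefixes, hex digits, and exponent marker are shown.

```python
# String prefixes accept either case, alone or combined.
raw_same = R"\d+" == r"\d+"
bytes_same = B"abc" == b"abc"
fstring_same = F"{1 + 1}" == f"{1 + 1}"
mixed_same = Rb"\n" == bR"\n" == rb"\n"

# Numeric literals are likewise case-insensitive in their markers and digits.
hex_same = 0XA == 0xa == 10
exp_same = 1E3 == 1e3

print(raw_same, bytes_same, fstring_same, mixed_same, hex_same, exp_same)
```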

Gregory P. Smith

8:19 p.m.

New subject: [OT] Re: Raw strings ending with a backslash

On Sat, May 28, 2022 at 12:55 PM Guido van Rossum <guido@python.org> wrote:
> IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it’s too late to change.

Given that 99.99% of code uses lower case string prefixes we *could* change it; it'd just take a longer deprecation cycle. You'd probably want a few releases where the upper case prefixes become an error in files without a `from __future__ import case_sensitive_quote_prefixes`, rather than jumping straight from a parse-time DeprecationWarning to repurposing the uppercase to have a new meaning.

The inertia behind doing that over the course of 5+ years is high, implying that we'd need a compelling reason to orchestrate it. None has sprung up.

-gps

Damian Shaw

3:03 p.m.

My understanding was that this was part of the question being asked: is it possible to know that with the new PEG parser?

On Sat, May 28, 2022 at 1:25 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
> How do you know that it is at the end of a string?

Serhiy Storchaka

6:16 p.m.

On 28.05.22 18:03, Damian Shaw wrote:
> My understanding was that this was part of the question being asked: is it possible to know that with the new PEG parser?

You first need to define what the end of a string is. And I think it is not relevant to the grammar parser.

Eric V. Smith

12:24 p.m.

On 5/28/2022 7:57 AM, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general, right?
>
> I think this is a more limited proposal: only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".

How would you know where the end of a string is? I think this is one of those things that's easy for a human to look at and figure out the intent, but not so easy for the lexer, without some heuristics and backtracking.

If the trailing single quote is removed below, it changes from "backslash in the middle of a string" to "backslash at the end of a string, followed by an arbitrary expression".

    r'\' + "foo"'

Eric
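Eric's example `r'\' + "foo"'` can be checked against the current tokenizer. The sketch below uses the standard `tokenize` module; the helper names `string_tokens` and `parses` are assumptions made for illustration. It shows that today the whole line is one string literal, and that dropping the final quote leaves an unterminated literal rather than a one-character string plus an expression.

```python
import io
import tokenize


def string_tokens(source: str) -> list[str]:
    """Return the text of every STRING token found in `source`."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.STRING]


def parses(source: str) -> bool:
    """Return True if `source` compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Held in a non-raw string: the doubled backslash is one backslash in the source.
full = "r'\\' + \"foo\"'"    # the text  r'\' + "foo"'
truncated = full[:-1]        # the text  r'\' + "foo"

tokens = string_tokens(full)     # a single STRING token spanning the whole line
value = eval(full)               # the ten characters  \' + "foo"
truncated_ok = parses(truncated)

print(tokens == [full], len(value), truncated_ok)
```

Under the "only at the end of a string" proposal, the tokenizer would have to decide between these two readings of the same prefix, which is the ambiguity Eric and Serhiy are pointing at.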

Steven D'Aprano

9:47 a.m.

Thank you to everyone who responded. It is now clear to me that this genuinely is a feature, not a bug or limitation of the parser or lexer, and that there is code relying on that behaviour, including in the stdlib, so we shouldn't change it even if we could.

-- Steve


participants (9)

Barney Gale
Chris Angelico
Damian Shaw
Eric V. Smith
Gregory P. Smith
Guido van Rossum
MRAB
Serhiy Storchaka
Steven D'Aprano
