Raw strings ending with a backslash
Now that we have a new parser for CPython, can we fix the old gotcha that raw strings cannot end in a backslash?

It's an FAQ and has come up again on the bug tracker.

https://docs.python.org/3/faq/design.html#id26
https://github.com/python/cpython/issues/93314

-- Steve
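For readers who have not hit the gotcha, here is a minimal sketch of it. The variable names and the example path are illustrative assumptions, not taken from the bug report; the two workarounds are common idioms, not the only options.

```python
# A raw string literal cannot end in an odd number of backslashes: the last
# backslash keeps the closing quote as part of the string, so the literal
# is never terminated.
source = "path = r'C:\\'"  # the Python source text:  path = r'C:\'

try:
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    error = exc  # the exact message differs between Python versions
else:
    error = None

# Two common workarounds (the path is a made-up example):
with_concat = r"C:\Users" + "\\"    # append a normally escaped backslash
with_slice = r"C:\Users\ "[:-1]     # add a trailing space, then slice it off
```

Both workarounds produce the same string, ending in a single backslash.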
28.05.22 12:22, Steven D'Aprano wrote:
> Now that we have a new parser for CPython, can we fix the old gotcha that raw strings cannot end in a backslash?
>
> It's an FAQ and has come up again on the bug tracker.
I do not think that we can allow this, and it is not related to the parser.

A few years ago I experimented with such a change: https://github.com/python/cpython/pull/15217

You can see that it breaks even some stdlib code, and it will definitely break many third-party packages and examples. Technically we can do this, but the benefit is small in comparison with the cost.
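To make the compatibility concern concrete, here is a small sketch of the kind of code that depends on today's behaviour. The regular expression is a hypothetical example, not taken from the stdlib or from the PR.

```python
import re

# In a raw string, a backslash before a quote keeps the literal going,
# and BOTH characters end up in the resulting string.
escaped_quote = r'\''
assert escaped_quote == "\\'" and len(escaped_quote) == 2

# Hypothetical pattern relying on that rule: a character class that
# matches either kind of quote, written inside a single-quoted raw string.
quote_chars = re.compile(r'[\'"]')
found = quote_chars.findall('say "hi" or \'bye\'')
```

If the backslash stopped protecting the quote, the literal `r'[\'"]'` would end right after the backslash and the remainder of the line would no longer parse.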
That PR seems to make \' and \" not special in general, right?

I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".

In which case there should be no backwards compatibility issue.

Damian
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WWWRFQK4...
Code of Conduct: http://python.org/psf/codeofconduct/
Personally I'd expect these two lines to do the same thing, whatever that thing is:

    path = 'C:\'
    path = ('C:\')

Barney
28.05.22 14:57, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general, right?
>
> I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".
>
> In which case there should be no backwards compatibility issue.
How do you know that it is at the end of a string?
On 2022-05-28 13:17, Serhiy Storchaka wrote:
> 28.05.22 14:57, Damian Shaw wrote:
>> That PR seems to make \' and \" not special in general, right?
>>
>> I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".
>>
>> In which case there should be no backwards compatibility issue.
>
> How do you know that it is at the end of a string?
It would also affect triple-quoted strings. Here's an idea: prefix rr ("really raw") that would treat all backslashes literally.
On 2022-05-28 16:03, MRAB wrote:
> It would also affect triple-quoted strings.
>
> Here's an idea: prefix rr ("really raw") that would treat all backslashes literally.

Here's something I've just realised.

Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.

Why?
On Sun, 29 May 2022 at 05:05, MRAB <python@mrabarnett.plus.com> wrote:
> Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.
>
> Why?

Technically they're not, but there are aliases. Kinda like threading.currentThread().

ChrisA
On Sat, May 28, 2022 at 12:11 MRAB wrote:
> Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.
>
> Why?
IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it's too late to change.

--Guido (mobile)
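The case-insensitivity described above is easy to check in current Python 3. This is only a sketch of a few representative pairs; the `0l`/`0L` long suffix mentioned above is not valid in Python 3, so it is left out.

```python
# Each pair spells the same literal with an upper-case and a lower-case
# prefix (or marker); in current Python 3 both spellings are accepted.
string_pairs = [
    (R"\n", r"\n"),           # raw string
    (B"abc", b"abc"),         # bytes
    (Rb"\d", rb"\d"),         # raw bytes, mixed case
    (F"{1 + 1}", f"{1 + 1}"), # f-string
]

number_pairs = [
    (0XA, 0xa),   # hex prefix and hex digits
    (0B11, 0b11), # binary prefix
    (1E3, 1e3),   # exponent marker
    (1J, 1j),     # imaginary suffix
]

all_strings_equal = all(a == b for a, b in string_pairs)
all_numbers_equal = all(a == b for a, b in number_pairs)
```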
On Sat, May 28, 2022 at 12:55 PM Guido van Rossum <guido@python.org> wrote:
> On Sat, May 28, 2022 at 12:11 MRAB wrote:
>> Names in Python are case-sensitive, yet the string prefixes are case-/insensitive/.
>>
>> Why?
>
> IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it's too late to change.
Given that 99.99% of code uses lower case string prefixes we *could* change it; it'd just take a longer deprecation cycle. You'd probably want a few releases where the upper case prefixes become an error in files without a `from __future__ import case_sensitive_quote_prefixes`, rather than jumping straight from a parse-time DeprecationWarning to repurposing the uppercase to have a new meaning. The inertia behind doing that over the course of 5+ years is high, implying that we'd need a compelling reason to orchestrate it. None has sprung up.

-gps
My understanding was that this was part of the question being asked: is it possible to know that with the new PEG parser?

On Sat, May 28, 2022 at 1:25 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
> How do you know that it is at the end of a string?
On 5/28/2022 7:57 AM, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general, right?
>
> I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".
How would you know where the end of a string is? I think this is one of those things that's easy for a human to look at and figure out the intent, but not so easy for the lexer without some heuristics and backtracking. If the trailing single quote is removed below, it changes from "backslash in the middle of a string" to "backslash at the end of a string, followed by an arbitrary expression".

    r'\' + "foo"'

Eric
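The ambiguity in the example above can be checked with the standard library. A rough sketch, with variable names chosen for illustration: today the whole line is one string literal, and dropping the final quote makes it an unterminated one.

```python
import ast
import io
import tokenize

# The line from the example above, held as source text:  r'\' + "foo"'
source = r"""r'\' + "foo"'"""

# Under today's rules the whole line is ONE string literal.
string_tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.STRING
]
value = ast.literal_eval(source)  # backslash, quote, then: + "foo"

# Drop the trailing single quote and the literal is never closed.
try:
    compile(source[:-1], "<example>", "eval")
except SyntaxError as exc:
    truncated_error = exc
else:
    truncated_error = None
```

A rule that let the backslash end the string here would have to read the same prefix, `r'\'`, as a complete literal in one case and as the start of a longer literal in the other.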
Thank you to everyone who responded. It is now clear to me that this genuinely is a feature, not a bug or limitation of the parser or lexer, and that there is code relying on that behaviour, including in the stdlib, so we shouldn't change it even if we could.

-- Steve
participants (9)
- Barney Gale
- Chris Angelico
- Damian Shaw
- Eric V. Smith
- Gregory P. Smith
- Guido van Rossum
- MRAB
- Serhiy Storchaka
- Steven D'Aprano