Dart-like multi-line string indentation
Hey List,

this is my very first approach to suggesting a Python improvement I think is worth discussing.

At some point, maybe with Dart 2.0 or a little earlier, Dart started supporting multiline strings with "proper" indentation. (I tried, but I can't find the corresponding docs at the moment, probably due to the rather large changes related to Dart 2.0 and outdated docs.)

What I have in mind is probably best described with an example:

print("""
    I am a
    multiline
    String.
    """)

The closing quotes define the "margin indentation" - so in this example every line would get reduced by its leading 4 spaces, resulting in a "clean" and unindented string.

Anyway, Dart or not, it doesn't matter - I like the idea and I think Python 3.x could benefit from it. If that's possible at all :)

I could also imagine that this "indentation cleanup" is only applied if the closing quotes are on their own line? That might be too complicated though; I can't estimate or understand this...

thx for reading,
Marius
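[Editorial note: the proposed rule can be sketched as an ordinary function operating on the literal's raw text. `margin_dedent` is a hypothetical name, and the behavior is only one reading of the proposal: strip the closing-quote line's indentation from every line, and do nothing when the closing quotes share a line with text.]

```python
def margin_dedent(text):
    """Sketch of the proposed rule: the whitespace before the closing
    quotes (the last, whitespace-only segment of the literal) is the
    margin, and it is stripped from every preceding line."""
    lines = text.split("\n")
    margin = lines[-1]
    if margin.strip():
        # Closing quotes are not on their own line: leave the text alone.
        return text
    stripped = [line[len(margin):] if line.startswith(margin) else line
                for line in lines[:-1]]
    return "\n".join(stripped)

# The literal from the example above, as the compiler would see it:
raw = "\n    I am a\n    multiline\n    String.\n    "
# Prints a leading blank line, then the three dedented lines.
print(margin_dedent(raw))
```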
I have to admit, regardless of how practical this is, it would surely get rid of a ton of textwrap.dedent calls all over the place...

On March 31, 2018 9:50:43 AM Marius Räsener <m.raesener@gmail.com> wrote:

--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
So yes, currently you just do:

import textwrap
print(textwrap.dedent("""
I am
A Line
"""))

So you'd want a string literal?

print(d"""
I am
A Line
""")

On Sat, Mar 31, 2018 at 17:06, Ryan Gonzalez <rymg19@gmail.com> wrote:
I can currently write:

from textwrap import dedent as d
print(d("""
    I am
    A Line
    """))

It doesn't feel like these hypothetical d-strings are worth new syntax.

On Sat, Mar 31, 2018, 11:49 AM Robert Vanden Eynde <robertve92@gmail.com> wrote:
Hey David,

hm, that's actually a nice way to solve this too, I guess, besides the additional import and "string literal".

But as I answered to Robert before (I got the reply-to wrong; I corrected it just now so the mailing list has the answer, too), I don't have a string literal in mind for this.

I don't see a reason why this couldn't be the default behavior for all string literals.

Again, the idea is just to use the closing quotes to determine the indentation length...

2018-03-31 18:51 GMT+02:00 David Mertz <mertz@gnosis.cx>:
It would radically change the meaning of every existing multi-line string. That is an enormous backwards-compatibility break.

It might work as a __future__ import, though.

On Sat, Mar 31, 2018, 13:03 Marius Räsener <m.raesener@gmail.com> wrote:
Oh, ok... yeah, I didn't think of that. Except I'd assume that so far multiline strings either go through textwrap or are "don't care"? Maybe?

But sure, with that in mind it gets more tricky.

Todd <toddrjen@gmail.com> wrote on Sat, 31 March 2018 at 19:49:
On 3/31/2018 2:14 PM, Marius Räsener wrote:
Oh, ok... yeah, I didn't think of that. Except I'd assume that so far multiline strings either go through textwrap or are "don't care"? Maybe?
For docstrings, I don't care, as a docstring consumer like help() can reformat the docstring with indents and dedents. For instance
>>> def f():
...     def g():
...         """returnx
...
... more doc
... """
...     print(g.__doc__)
...     help(g)
...
>>> f()
returnx

more doc

Help on function g in module __main__:

g()
    returnx

    more doc

For other situations, parse-time string concatenation often suffices, as I showed in my response to the original post. This example from idlelib.config shows the increased flexibility it allows. It has 1-line padding above and 1-space padding to the left to look better when displayed in a popup box.

warning = ('\n Warning: config.py - IdleConf.GetOption -\n'
           ' problem retrieving configuration option %r\n'
           ' from section %r.\n'
           ' returning default value: %r' %
           (option, section, default))

With no padding, I would not argue with someone who prefers textwrap.dedent, but dedent cannot add the leading space.

For literals with really long lines, where the physical indent would push line lengths over 80, I remove physical indents.

class TestClass(unittest.TestCase):
    def test_outputter(self):
        expected = '''\
First line of a really, really, ............................, long line.
Short line.
Summary line that utilizes most of the room alloted, with no waste.
'''
        self.assertEqual(outputter('test'), expected)

--
Terry Jan Reedy
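[Editorial note: Terry's point that dedent cannot add the one-space left padding can be worked around with textwrap.indent. A minimal sketch with made-up warning text, not the actual idlelib.config code:]

```python
import textwrap

raw = """
    Warning: config.py - IdleConf.GetOption -
     problem retrieving configuration option 'x'
"""
# dedent strips the common 4-space margin; indent then adds the
# one-space left padding back to every non-blank line.
padded = textwrap.indent(textwrap.dedent(raw), " ")
print(padded)
```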
Hey Robert,

not really, I don't think another string literal would be nice. Also, to correct your example, it would have to look something like:

print(d"""
    I am
    a Line
    """)

The idea is to use the closing quotes to detect the indentation length, so to speak...

2018-03-31 17:48 GMT+02:00 Robert Vanden Eynde <robertve92@gmail.com>:
On 3/31/2018 10:50 AM, Marius Räsener wrote:
What I have in mind is probably best described with an example:

print("""
    I am a
    multiline
    String.
    """)

The closing quotes define the "margin indentation" - so in this example every line would get reduced by its leading 4 spaces, resulting in a "clean" and unindented string.
Adding additional default processing to multiline strings is not possible within back-compatibility constraints. It is also not necessary. The current

print("I am a\n"
      "multiline\n"
      "String.\n")

does exactly the same thing as the proposal in 2 fewer lines and is more flexible, as one can add an initial \n or an extra \n in the middle or omit the final \n. (For the example, print("I am a\nmultiline\nString\n") also works in 1 line, but does not represent the general case of multiple lone lines.)

---

In 3.6, we introduced a new prefix, 'f', so there was no back compatibility issue. There was, however, a combinatorial explosion issue, as 'F' was also added (a mistake, I now think), with no order requirement (possibly another mistake). Hence

stringprefix ::= "r" | "u" | "R" | "U"

grew to

stringprefix ::= "r" | "u" | "R" | "U" | "f" | "F" | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"

New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline strings only, would multiply the number of alternatives by about 5 and would require another rewrite of all code (Python or not) that parses Python code (such as syntax colorizers).

--
Terry Jan Reedy
New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline strings only, would multiply the number of alternatives by about 5 and would require another rewrite of all code (Python or not) that parses Python code (such as in syntax colorizers).
I think you're exaggerating the difficulty somewhat. Multiplying the number of alternatives by 5 is not the same thing as increasing the complexity of code to parse it by 5. A new string prefix, 'd' say, would seem to be the best way of meeting the OP's requirement. That said, I'm about +0 on such a proposal, given that there are already reasonable ways of doing it. Regards Rob Cliffe
On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas wrote:
I think you're exaggerating the difficulty somewhat. Multiplying the number of alternatives by 5 is not the same thing as increasing the complexity of code to parse it by 5.
Terry didn't say that it would increase the complexity of the code by a factor of five. He said it would multiply the number of alternatives by "about 5". There would be a significant increase in the complexity of the code too, but I wouldn't want to guess how much.

Starting with r and f prefixes, in both upper and lower case, we have:

  4 single letter prefixes (plus 2 more, u and U, that don't combine with others)
  8 double letter prefixes

making 14 in total. Adding one more prefix, d|D, increases it to:

   6 single letter prefixes (plus 2 more, u and U)
  24 double letter prefixes
  48 triple letter prefixes

making 80 prefixes in total. Terry actually underestimated the explosion in prefixes: it is closer to six times more than five (but who is counting? apart from me *wink*)

[Aside: if we add a fourth, the total becomes 634 prefixes.]

--
Steve
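[Editorial note: Steven's arithmetic checks out mechanically. A quick sketch, where the argument holds the case-insensitive prefix letters and 'x' merely stands in for a hypothetical fourth prefix:]

```python
from itertools import permutations

def count_prefixes(letters):
    """Count the distinct string prefixes built from `letters`, where
    each letter appears at most once, in any order and either case,
    plus the 2 standalone prefixes u and U that combine with nothing."""
    total = 0
    for n in range(1, len(letters) + 1):
        for order in permutations(letters, n):
            total += 2 ** n  # each position may be upper or lower case
    return total + 2

print(count_prefixes("rf"))    # 14
print(count_prefixes("rfd"))   # 80
print(count_prefixes("rfdx"))  # 634
```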
Ok, I see this is nothing for any 3.x release. I imagine this now as either "clean" for users with a compatibility break, or just leaving things as they are. So, if at all, maybe something for Python 4 :)

Coincidentally, I watched Armin Ronacher's talk yesterday about treating compatibility as the holy cow - an interesting watch... https://www.youtube.com/watch?v=xkcNoqHgNs8&feature=youtu.be&t=2890

Steven D'Aprano <steve@pearwood.info> wrote on Sun, 1 Apr 2018 at 03:49:
On Sun, Apr 01, 2018 at 07:39:31AM +0000, Marius Räsener wrote:
Ok I see this is nothing for any 3.x release. [...] So, if at all, maybe something for Python 4 :)
No, that's the wrong conclusion to draw. There are four options:

(1) Change the behaviour of triple-quoted strings, immediately as of 3.8. This is out. It will be out in 3.9, and 4.0.

(2) Change the behaviour of triple-quoted strings using a warning period and a __future__ import. This would probably take a minimum of three releases, but it could start in 3.8. However, anyone arguing in favour of this would have to make a VERY good case for it.

(3) Leave the behaviour of triple-quoted strings alone, but introduce new behaviour via a method, or a new prefix. Again, this could start as early as 3.8 if someone makes a strong case for it.

(4) The status quo: nothing changes.

Python 4 will not be special like Python 3 was. Any new features in Python 4 that break backwards compatibility will still be required to go through a transition period, involving warnings and/or __future__ imports. Python will possibly never again go through a major break like Python 2 to 3, and if it does, it may not be until Python 5 or 6. So if you think that waiting a few years means we will be free to make this change: no, option (1) will still be out, even in Python 4.

Personally, I find the situation with triple-quoted strings and indentation to be a regular low-level annoyance, and I'd like to see a nice solution sooner rather than later. Thank you for raising this issue again, even if nothing comes from it.

--
Steve
On 3/31/2018 9:48 PM, Steven D'Aprano wrote:
Not that it really matters, but there's some code I use whenever I feel like playing with adding string prefixes. It usually encourages me to not do that! Lib/tokenize.py:_all_string_prefixes exists just for calculating string prefixes. Since it's not what is actually used by the tokenizer, I don't claim it's perfect (but I don't know of any errors in it).

According to it, and ignoring the empty string, there are currently 24 prefixes:

{'B', 'BR', 'Br', 'F', 'FR', 'Fr', 'R', 'RB', 'RF', 'Rb', 'Rf', 'U',
 'b', 'bR', 'br', 'f', 'fR', 'fr', 'r', 'rB', 'rF', 'rb', 'rf', 'u'}

And if you add 'd', and it can't combine with 'b' or 'u', I count 90:

{'rdf', 'FR', 'dRF', 'rD', 'FrD', 'DFr', 'frd', 'RDf', 'u', 'DF', 'd',
 'Frd', 'frD', 'dFr', 'rDF', 'fD', 'rB', 'dFR', 'FD', 'dr', 'Fr', 'DfR',
 'fdR', 'Rb', 'dfr', 'rdF', 'rf', 'Drf', 'R', 'RB', 'BR', 'FdR', 'bR',
 'DFR', 'RdF', 'dF', 'F', 'fd', 'Br', 'Dfr', 'Dr', 'r', 'rfd', 'RFd',
 'Fdr', 'dfR', 'rb', 'fDr', 'rFD', 'fRd', 'Rfd', 'RDF', 'rFd', 'Rdf',
 'rF', 'FDr', 'drF', 'dR', 'D', 'br', 'fr', 'drf', 'DrF', 'rd', 'DRF',
 'DR', 'RFD', 'Rf', 'fR', 'RfD', 'Df', 'rDf', 'U', 'f', 'df', 'DRf',
 'fdr', 'B', 'FRD', 'RF', 'Fd', 'Rd', 'fRD', 'FRd', 'b', 'dRf', 'FDR',
 'RD', 'fDR', 'rfD'}

I guess it's debatable whether you want to count prefixes that contain 'b' as string prefixes or not, but the tokenizer thinks they are. If you leave them out, you come up with the 14 and 80 that Steven mentions.

I agree with Terry that adding 'F' was a mistake. But since the upper case versions of 'r' and 'b' already existed, it was included.

Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't recall if that was deliberate or not. And 'ru' isn't valid in either version.

Eric
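[Editorial note: the accepted prefixes can also be probed from the parser itself rather than read out of tokenize internals. A sketch that eval()s each candidate spelling in front of an empty literal; only the b/r/u/f letters Eric considers are probed, and the count matches his 24:]

```python
from itertools import permutations

def valid_prefixes():
    """Ask the running interpreter which string-prefix spellings it
    accepts, by eval()ing each candidate before an empty literal."""
    found = set()
    for n in range(1, 4):  # no current prefix is longer than 2 letters
        for combo in permutations("bBrRuUfF", n):
            prefix = "".join(combo)
            try:
                eval(prefix + "''")
            except SyntaxError:
                pass  # e.g. 'ur', 'bf', 'bB' are rejected
            else:
                found.add(prefix)
    return found

prefixes = valid_prefixes()
print(len(prefixes))   # 24 on current CPython
print(sorted(prefixes))
```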
On Sun, Apr 1, 2018 at 7:11 PM, Eric V. Smith <eric@trueblade.com> wrote:
Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't recall if that was deliberate or not. And 'ru' isn't valid in either version.
I believe it was. The 'ur' string literal in Py2 was a bizarre hybrid of raw-but-allowing-Unicode-escapes, which makes no sense in the Py3 world.

$ python3
Python 3.8.0a0 (heads/literal_eval-exception:ddcb2eb331, Feb 21 2018, 04:32:23)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u"\\ \u005c \\")
\ \ \
>>> print(r"\\ \u005c \\")
\\ \u005c \\

$ python2
Python 2.7.13 (default, Nov 24 2017, 17:33:09)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u"\\ \u005c \\")
\ \ \
>>> print(r"\\ \u005c \\")
\\ \u005c \\
>>> print(ur"\\ \u005c \\")
\\ \ \\
In Py3, a normal Unicode literal (with or without the 'u' prefix, which has no meaning) will interpret "\u005c" as a backslash. A raw literal will treat it as a backslash followed by five other characters.

In Py2, the same semantics hold for normal Unicode literals, and for bytes literals, the "\u" escape code has no meaning (and is parsed as "\\u"). But in a raw Unicode literal, the backslashes are treated literally... unless they're starting a "\u" sequence, in which case they're parsed.

So if you use a raw string literal in Python 3 to store a Windows path name, you're fine as long as it doesn't end with a backslash (a wart in the design that probably can't be done any other way). But in Py2, a raw Unicode literal will occasionally misbehave in the *exact* *same* *way* as a non-raw string literal will - complete with it being data-dependent.

Since the entire point of the Py3 u"..." prefix is compatibility with Py2, the semantics have to be retained. There's no point supporting ur"..." in Py3 if it's not going to produce the same result as in Py2.

ChrisA
On 1 April 2018 at 19:24, Chris Angelico <rosuav@gmail.com> wrote:
Since the entire point of the Py3 u"..." prefix is compatibility with Py2, the semantics have to be retained. There's no point supporting ur"..." in Py3 if it's not going to produce the same result as in Py2.
Right, "ur" strings were originally taken out in Python 3.0, and then we made the decision *not* to add them back when PEP 414 restored other uses of the "u" prefix: https://www.python.org/dev/peps/pep-0414/#exclusion-of-raw-unicode-literals Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 3/31/2018 9:48 PM, Steven D'Aprano wrote:
On Sun, Apr 01, 2018 at 02:20:16AM +0100, Rob Cliffe via Python-ideas wrote:
New unordered 'd' and 'D' prefixes, for 'dedent', applied to multiline strings only, would multiply the number of alternatives by about 5 and would require another rewrite of all code (Python or not) that parses Python code (such as in syntax colorizers).
I think you're exaggerating the difficulty somewhat. Multiplying the number of alternatives by 5 is not the same thing as increasing the complexity of code to parse it by 5.
Terry didn't say that it would increase the complexity of the code by a factor of five. He said it would multiply the number of alternatives by "about 5". There would be a significant increase in the complexity of the code too, but I wouldn't want to guess how much.
Starting with r and f prefixes, in both upper and lower case, we have:
4 single letter prefixes (plus 2 more, u and U, that don't combine with others) 8 double letter prefixes
making 14 in total. Adding one more prefix, d|D, increases it to:
6 single letter prefixes (plus 2 more, u and U) 24 double letter prefixes 48 triple letter prefixes
making 80 prefixes in total. Terry actually underestimated the explosion in prefixes: it is closer to six times more than five (but who is counting? apart from me *wink*)
[Aside: if we add a fourth, the total becomes 634 prefixes.]
Not that it really matters, but there's some code I use whenever I feel like playing with adding string prefixes. It usually encourages me to not do that!
Lib/tokenize.py:_all_string_prefixes exists just for calculating string prefixes. Since it's not what is actually used by the tokenizer, I don't claim it's perfect (but I don't know of any errors in it).
According to it, and ignoring the empty string, there are currently 24 prefixes: {'B', 'BR', 'Br', 'F', 'FR', 'Fr', 'R', 'RB', 'RF', 'Rb', 'Rf', 'U', 'b', 'bR', 'br', 'f', 'fR', 'fr', 'r', 'rB', 'rF', 'rb', 'rf', 'u'}
And if you add 'd', and it can't combine with 'b' or 'u', I count 90: {'rdf', 'FR', 'dRF', 'rD', 'FrD', 'DFr', 'frd', 'RDf', 'u', 'DF', 'd', 'Frd', 'frD', 'dFr', 'rDF', 'fD', 'rB', 'dFR', 'FD', 'dr', 'Fr', 'DfR', 'fdR', 'Rb', 'dfr', 'rdF', 'rf', 'Drf', 'R', 'RB', 'BR', 'FdR', 'bR', 'DFR', 'RdF', 'dF', 'F', 'fd', 'Br', 'Dfr', 'Dr', 'r', 'rfd', 'RFd', 'Fdr', 'dfR', 'rb', 'fDr', 'rFD', 'fRd', 'Rfd', 'RDF', 'rFd', 'Rdf', 'rF', 'FDr', 'drF', 'dR', 'D', 'br', 'fr', 'drf', 'DrF', 'rd', 'DRF', 'DR', 'RFD', 'Rf', 'fR', 'RfD', 'Df', 'rDf', 'U', 'f', 'df', 'DRf', 'fdr', 'B', 'FRD', 'RF', 'Fd', 'Rd', 'fRD', 'FRd', 'b', 'dRf', 'FDR', 'RD', 'fDR', 'rfD'}
I guess it's debatable if you want to count prefixes that contain 'b' as string prefixes or not, but the tokenizer thinks they are. If you leave them out, you come up with the 14 and 80 that Steven mentions.
I agree with Terry that adding 'F' was a mistake. But since the upper case versions of 'r', and 'b' already existed, it was included.
Interestingly, in 2.7 'ur' is a valid prefix, but not in 3.6. I don't recall if that was deliberate or not. And 'ru' isn't valid in either version.
Eric

On 4/1/18 5:11 AM, Eric V. Smith wrote:

One comment about the 'combinatorial explosion' is that it sort of assumes that each individual combination case needs to be handled with distinct code. My guess is that virtually all of the actual implementation of these prefixes can be handled by setting a flag for the presence of that prefix, and at the parsing of each character you need to just check a flag or two to figure out how to process it. You might get a bit more complication in determining if a given combination is valid, but if that gets too complicated it is likely an indication of an inconsistency in the language definition.

-- Richard Damon
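The flag-based approach Richard describes might be sketched as follows (the function name and the exact validity rules are illustrative, not CPython's actual tokenizer code):

```python
def parse_prefix(prefix: str) -> dict:
    """Record each prefix letter as a flag; one routine then consults the
    flags instead of dispatching on whole-prefix combinations."""
    names = {"r": "raw", "b": "bytes", "f": "fstring", "u": "unicode"}
    flags = dict.fromkeys(names.values(), False)
    for ch in prefix.lower():
        name = names.get(ch)
        if name is None or flags[name]:  # unknown or repeated letter
            raise SyntaxError("invalid string prefix: %r" % prefix)
        flags[name] = True
    # Validity rules (Python 3): u stands alone; b and f don't combine.
    if flags["unicode"] and len(prefix) > 1:
        raise SyntaxError("invalid string prefix: %r" % prefix)
    if flags["bytes"] and flags["fstring"]:
        raise SyntaxError("invalid string prefix: %r" % prefix)
    return flags
```

Order and case then fall out for free: parse_prefix("Rb") and parse_prefix("bR") produce the same flags, so no combination ever needs its own case.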
On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
One comment about the 'combinatorial explosion' is that it sort of assumes that each individual combination case needs to be handled with distinct code.
No -- as I said in an earlier post, Terry and I (and Eric) are talking about the explosion in number of prefixes, not the complexity of the code. You are right that many of the prefixes can be handled by the same code:

rfd rfD rFd rFD rdf rdF rDf rDF
Rfd RfD RFd RFD Rdf RdF RDf RDF
frd frD fRd fRD fdr fdR fDr fDR
Frd FrD FRd FRD Fdr FdR FDr FDR
drf drF dRf dRF dfr dfR dFr dFR
Drf DrF DRf DRF Dfr DfR DFr DFR
# why did we support all these combinations? who uses them?

presumably will all be handled by the same "raw dedent f-string" code. But the parser still has to handle all those cases, and so does the person reading the code.

And that *mental complexity* is (in my opinion) the biggest issue with adding a new d-prefix, and why I would rather make it a method. Another big advantage of a method is that we can apply it to non-literals too.

The number of code paths increases too, but not anywhere near as fast:

# existing
- regular ("cooked") triple-quoted string;
- raw string;
- f-string;
- raw f-string

# proposed additions
- dedent string
- raw dedent string
- dedent f-string
- raw dedent f-string

so roughly doubling the number of cases. I doubt that will double the code complexity, but it will complicate it somewhat.

Apart from parsing, the actual complexity of the code will probably be similar whether it is a method or a prefix. After all, whichever we do, we still need built-in dedent code.

-- Steve
On Sun, Apr 1, 2018 at 10:36 PM, Steven D'Aprano <steve@pearwood.info> wrote:
And that *mental complexity* is (in my opinion) the biggest issue with adding a new d-prefix, and why I would rather make it a method.
Another big advantage of a method is that we can apply it to non-literals too.
I'd like to expand on this a bit more.

Current string prefix letters are:

* u/b: completely change the object you're creating
* f: change it from a literal to a kind of expression
* r: change the interpretation of backslashes
* triple quotes: change the interpretation of newlines

All of these are significant *to the parser*. You absolutely cannot do any of these with methods (well, maybe you could have u/b done by having a literal for one of them, and the other is an encode or decode operation, but that's a pretty poor hack). But dedenting a string doesn't change the way the source code is interpreted. So it's free to be a method - which is far easier to add to the language. All you need is ".dedent()" to be syntactically parsed as a method call (which it already is), and every tool that processes Python code will correctly interpret this.

So here's what, IMO, Marius can push for:

1) A method on Unicode strings which does the same as textwrap.dedent()
2) A peephole optimization wherein certain methods on literals get executed at compile time.

The latter optimization would also apply to cases such as " spam ".strip() - as long as all it does is return another constant value, it can be done at compile time. Semantically, though, the part that matters is simply the new method. (Sadly, this can't be applied to Decimal("1.234"), as that's not a method and could be shadowed/mocked.)

While I wouldn't use that method much myself, I think it's a Good Thing for features like that to be methods rather than functions stashed away in a module. (How do you know to look in "textwrap" for a line-by-line version of x.strip() ??) So I would be +1 on both the enhancements I mentioned above, and a solid -1 on this becoming a new form of literal.

ChrisA
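For comparison, here is today's spelling of the proposed method (`str.dedent` does not exist; `textwrap.dedent` is the current equivalent):

```python
import textwrap

def spam():
    # Today: the dedent logic lives in a module-level function.
    text = textwrap.dedent("""\
        some text
        another line
        and a third
        """)
    return text

# Under the proposal this would become a method call on the literal,
# e.g. text = """...""".dedent(), which a peephole optimizer could fold
# at compile time since the receiver is a constant.
print(spam())
```

Note that textwrap.dedent also normalizes whitespace-only lines (such as the indented final line before the closing quotes), which is why the result ends cleanly with a newline.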
On 4/1/2018 8:55 AM, Chris Angelico wrote:
On Sun, Apr 1, 2018 at 10:36 PM, Steven D'Aprano <steve@pearwood.info> wrote:
And that *mental complexity* is (in my opinion) the biggest issue with adding a new d-prefix, and why I would rather make it a method.
Another big advantage of a method is that we can apply it to non-literals too.
I'd like to expand on this a bit more.
Current string prefix letters are:
* u/b: completely change the object you're creating * f: change it from a literal to a kind of expression * r: change the interpretation of backslashes * triple quotes: change the interpretation of newlines
All of these are significant *to the parser*. You absolutely cannot do any of these with methods (well, maybe you could have u/b done by having a literal for one of them, and the other is an encode or decode operation, but that's a pretty poor hack).
The one place where a dedented string would come in handy, and where it would need to be recognized by the parser (and could not be the result of a function or method) is a docstring. Well, I guess you could have the parser "know" about certain string methods, but that seems horrible. Eric
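Eric's docstring point can be checked directly: the compiler only records a bare string literal as a docstring, so wrapping it in any call (method or function) turns it into a plain expression statement:

```python
import textwrap

def plain():
    """
        an indented docstring
    """

def wrapped():
    textwrap.dedent("""
        an indented docstring
    """)  # an expression, not a literal: no docstring is recorded

print(plain.__doc__ is None)    # False
print(wrapped.__doc__ is None)  # True
```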
On 4/1/2018 8:36 AM, Steven D'Aprano wrote:
On Sun, Apr 01, 2018 at 08:08:41AM -0400, Richard Damon wrote:
One comment about the 'combinatorial explosion' is that it sort of assumes that each individual combination case needs to be handled with distinct code.
No -- as I said in an earlier post, Terry and I (and Eric) are talking about the explosion in number of prefixes, not the complexity of the code.
You are right that many of the prefixes can be handled by the same code:
rfd rfD rFd rFD rdf rdF rDf rDF Rfd RfD RFd RFD Rdf RdF RDf RDF frd frD fRd fRD fdr fdR fDr fDR Frd FrD FRd FRD Fdr FdR FDr FDR drf drF dRf dRF dfr dfR dFr dFR Drf DrF DRf DRF Dfr DfR DFr DFR # why did we support all these combinations? who uses them?
presumably will all handled by the same "raw dedent f-string" code. But the parser still has to handle all those cases, and so does the person reading the code.
IDLE's colorizer does its parsing with a giant regex. The new prefix combinations would nearly double the number of alternatives in the regex. I am sure that this would mean more nodes in the compiled finite-state machine. Even though the non-re code of the colorizer would not change, I am pretty sure that this would mean that coloring takes longer. Since the colorizer is called with each keystroke*, and since other events can be handled between keystrokes#, colorizing time *could* become an issue, especially on older or slower machines than mine. Noticeable delays between keystroke and character appearance on screen are a real drag. * Type 'i', 'i' appears 'normal'; type 'n', 'in' is colored 'keyword'; type 't', 'int' is colored 'builtin'; type 'o', 'into' becomes 'normal' again. # One can edit while a program is running in a separate process and outputting to the shell window. -- Terry Jan Reedy
Hey again,

Thanks all for the active discussion. Since I'm the OP, I want to make clear that I didn't have a `d` string prefix in mind. The idea was to support this as the default behavior; if any extra effort is needed to use it, I don't see a real advantage, nor would I consider it 'solved'.

So I'm aware that there probably won't be a majority for what would be considered a breaking change - still, I want to emphasize that I wouldn't want yet another string prefix. I think that would be really bad. Actually, I'd rather see Python develop backwards and remove string prefixes rather than gain even more... so maybe just `r` and `b`?

Anyways, I think I've made my point clear.

Terry Reedy <tjreedy@udel.edu> wrote on Sun, 1 Apr 2018 at 21:12:
On 2018-04-01 20:10, Terry Reedy wrote:
IDLE's colorizer does its parsing with a giant regex. The new prefix combinations would nearly double the number of alternatives in the regex. I am sure that this would mean more nodes in the compiled finite-state machine. Even though the non-re code of the colorizer would not change, I am pretty sure that this would mean that coloring takes longer. Since the colorizer is called with each keystroke*, and since other events can be handled between keystrokes#, colorizing time *could* become an issue, especially on older or slower machines than mine. Noticeable delays between keystroke and character appearance on screen are a real drag.
In Python 3.7 that part is now:

    stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"

(which looks slightly wrong to me!)
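One way to see what looks off: tested on its own, the fragment lists the single-letter alternatives before the two-letter ones, and regex alternation takes the leftmost match. (In the full colorizer regex the quote pattern that follows can force backtracking, so this is only suggestive.)

```python
import re

stringprefix = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?")

print(stringprefix.match("fr").group())  # 'f' - the 'fr' alternative is shadowed
print(stringprefix.match("rb").group())  # 'r' - likewise for 'rb'
```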
On 4/1/18 8:36 AM, Steven D'Aprano wrote:
I think you miss my point that we shouldn't be parsing by each combination of prefixes (even collapsing equivalent ones), but instead have each prefix adjust the rules that a single parsing routine uses. Mentally, you should be doing the same.

I think that having the grammar try to exhaustively list the prefixes is awkward, and it would be better served by a simpler production that allows an arbitrary combination of the prefixes, combined with a rule properly limiting the combinations of letters allowed, something like: at most one of a given letter (case insensitive), at most one of b, u, and f, at most one of r and u (for Python 3), then followed, as currently, by a description of what each letter does. This removes the combinatorial explosion that is already starting with the addition of f.

-- Richard Damon
On 2018-04-01 05:36, Steven D'Aprano wrote:
And that *mental complexity* is (in my opinion) the biggest issue with adding a new d-prefix, and why I would rather make it a method.
That doesn't seem a very reasonable argument to me. That is like saying that a person reading code has to mentally slog through the cognitive burden of understanding "all the combinations" of "a + b + c", "a + b - c", "a * b + c", "a - b * c", etc. We don't. We know what the operators mean and we build up our understanding of expressions by combining them.

Similarly, these string prefixes can mostly be thought of as independent flags. You don't parse each combination separately; you learn what each flag means and then build up your understanding of a prefix by combining your understanding of the flags. (This is also glossing over the fact that many of the combinations you list differ only in case, which to my mind adds no extra cognitive load whatsoever.)

-- Brendan Barnwell

"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 4/1/18 4:31 PM, Brendan Barnwell wrote:
That doesn't seem a very reasonable argument to me. That is like saying that a person reading code has to mentally slog through the cognitive burden of understanding "all the combinations" of "a + b + c", "a + b - c", "a * b + c", "a - b * c", etc. We don't. We know what the operators mean and we build up our understanding of expressions by combining them. Similarly, these string prefixes can mostly be thought of as indepedent flags. You don't parse each combination separately; you learn what each flag means and then build up your understanding of a prefix by combining your understanding of the flags. (This is also glossing over the fact that many of the combinations you list differ only in case, which to my mind adds no extra cognitive load whatsoever.)
Actually, ALL the variations listed were the exact same prefix (dfr): its 6 possible orderings, each in 8 case variations. Which just shows why you don't want to try to exhaustively list prefixes.

-- Richard Damon
On 2018-04-01 05:36, Steven D'Aprano wrote:
You are right that many of the prefixes can be handled by the same code:
rfd rfD rFd rFD rdf rdF rDf rDF Rfd RfD RFd RFD Rdf RdF RDf RDF frd frD fRd fRD fdr fdR fDr fDR Frd FrD FRd FRD Fdr FdR FDr FDR drf drF dRf dRF dfr dfR dFr dFR Drf DrF DRf DRF Dfr DfR DFr DFR # why did we support all these combinations? who uses them?
In almost twenty years of using Python, I've not seen capital string prefixes in real code, ever. Sounds like a great candidate for deprecation? -Mike
On 02/04/2018 07:09, Mike Miller wrote:
In almost twenty years of using Python, I've not seen capital string prefixes in real code, ever. Sounds like a great candidate for deprecation?
+1

It's not like migrating would be hard: a replace is enough to fix the rare projects doing that. And even if they missed the warning, it's a syntax error anyway, so you will get the error as soon as you try to run the program, not at a later point at runtime.

What about doing a poll, then suggesting a warning in 3.8, with removal in 4.0?
On 2018-04-02 10:53, Michel Desmoulin wrote:
Also, Python is case-sensitive elsewhere, so why not here too? OTOH, it's not like it's causing a problem.
On 2018-04-02 11:40, MRAB wrote:
OTOH, it's not like it's causing a problem.
Well, not a big one, but there are arguments for keeping a language as simple as possible. Also every time an idea comes up for a string prefix, the combinatorial issue comes up again. If we could factor out an unnecessary 2x it might help there.
On 01/04/2018 02:48, Steven D'Aprano wrote:
Can I suggest, rather than another string prefix that would require the user to add the d flag to every string they use, we consider a file-scope dedent_multiline or auto_dedent import, possibly from __future__ or textwrap, that automatically applies the dedent function to all multiline strings in the file?

This would reflect that, typically, a specific developer tends to want either all or no multi-line text strings dedented. It should have minimal impact on the language, and operate at compile time, so it would be low overhead and avoid cluttering strings up.

-- Steve (Gadget) Barnes

Any opinions in this message are my personal opinions and do not reflect those of my employer.
On Mon, Apr 02, 2018 at 12:08:47PM +0000, Steve Barnes wrote:
This would reflect that, typically, a specific developer tends to want either all or no multi-line text strings dedented.
I don't know how you come to that conclusion. I certainly would not want "All or Nothing" when it comes to dedenting triple-quoted strings. -- Steve
On 2 April 2018 at 23:06, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Apr 02, 2018 at 12:08:47PM +0000, Steve Barnes wrote:
This would reflect that, typically, a specific developer tends to want either all or no multi-line text strings dedented.
I don't know how you come to that conclusion.
I certainly would not want "All or Nothing" when it comes to dedenting triple-quoted strings.
If we did flip the default with a "from __future__ import auto_dedent" though, there would be an opportunity to consider the available approaches for *adding* indentation after the fact, such as:

    indented = textwrap.indent(text, " " * 8)

or:

    indent = " " * 8
    indented = "\n".join((indent + line if line else line)
                         for line in text.splitlines())

Adding indentation is generally easier than removing it, since you can operate on each line in isolation, rather than having to work out the common prefix.

To allow exact recreation of the current indented multi-line string behaviour, we could offer an `__indent__` constant, which the compiler replaced with the leading indent of the current code block (Note: not necessarily the indent level of the current line).

So where today we have:

* leading indent by default
* "textwrap.dedent(text)" to strip the common leading whitespace

In an auto-dedent world, we'd have:

* the current block indent level stripped from each line after the first in multi-line strings by default
* add it back by doing "textwrap.indent(text, __indent__)" in the same code block

I mostly find the current behaviour irritating, and work around it by way of module level constants, but even so, I'm still not sure it qualifies as being annoying enough to be worth the hassle of changing it.

One relevant point though is that passing an already dedented string through textwrap.dedent() will be a no-op, so the compatibility cases to worry about will be those where *all* of the leading whitespace in a multiline string is significant, including the common prefix arising from the code block indentation.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
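The two spellings Nick shows are almost interchangeable; a quick check confirms it, with one wrinkle: str.splitlines drops the trailing newline that textwrap.indent preserves.

```python
import textwrap

text = "some text\n\nanother line\n"
indent = " " * 8

a = textwrap.indent(text, indent)
b = "\n".join((indent + line if line else line)
              for line in text.splitlines())

# Both leave the empty line unindented (textwrap.indent's default predicate
# skips whitespace-only lines), but only textwrap.indent keeps the final "\n".
print(a == b + "\n")  # True
```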
On Sat, Mar 31, 2018 at 04:50:03PM +0200, Marius Räsener wrote: [...]
What I have in mind is probably best described with an Example:
    print("""
        I am a
        multiline
        String.
        """)
the closing quote defines the "margin indentation" - so in this example all lines would get reduced by their leading 4 spaces, resulting in a "clean" and unindented string.
Backwards compatibility rules this out. I have many strings that intentionally include indents and you will annoy me no end if you make those indents disappear :-)

But having said that, for every one of those, I have a lot more where the indents are annoying. I either outdent the lines:

    def spam():
        text = """some text
    another line
    and a third one
    """
        print(text)

or I use implicit string concatenation:

    def spam():
        text = ("some text\n"
                "another line\n"
                "and a third\n")
        print(text)

neither of which I'm really happy with.

The ideal solution would:

- require only a single pair of starting/ending string delimiters;
- allow string literals to be indented to the current block, for the visual look and to make it more convenient with editors which automatically indent;
- evaluate without the indents;
- with no runtime cost.

One solution is to add yet another string prefix, let's say d for dedent, but as Terry and others point out, that leads to a combinatorial explosion with f-strings and r-strings already existing.

Another possibility is to make dedent a string method:

    def spam():
        text = """\
            some text
            another line
            and a third
            """.dedent()
        print(text)

and avoid the import of textwrap. However, that also imposes a runtime cost, which could be expensive if you are careless:

    for x in seq:
        for y in another_seq:
            process("""\
                some large indented string
                """.dedent())

(Note: the same applies to using textwrap.dedent.)

But we could avoid that runtime cost if the peephole optimizer performed the dedent at compile time:

    <triple-quoted string literal>.dedent()

could be optimized at compile-time, like other constant-folding.
Out of all the options, including the status quo, the one I dislike the least is the last one:

- make dedent a string method;
- recommend (but don't require) that implementations perform the dedent of string literals at compile time (failure to do so is a quality of implementation issue, not a bug);
- textwrap.dedent then becomes a thin wrapper around the string method.

Thoughts?

--
Steve
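Steven's str.dedent() spelling can be approximated today with textwrap.dedent; a minimal sketch (the function and its text are illustrative only):

```python
import textwrap

def spam():
    # The backslash after the opening quotes suppresses the leading blank
    # line, so every content line shares the same block indent, which
    # textwrap.dedent then strips.
    text = textwrap.dedent("""\
        some text
        another line
        and a third
        """)
    return text

print(spam())
```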
Was: "Dart (Swift) like multi line strings indentation"

This discussion petered out, but I liked the idea, as it alleviates something occasionally annoying.

I'm supportive of the d'' prefix; perhaps the capital prefixes can be deprecated to avoid issues? If not, a sometimes-optimized (or C-accelerated) str.dedent() is acceptable too.

Anyone still interested in this?

-Mike

On 3/31/18 5:43 PM, Steven D'Aprano wrote:
The ideal solution would:
- require only a single pair of starting/ending string delimiters;
- allow string literals to be indented to the current block, for the visual look and to make it more convenient with editors which automatically indent;
- evaluate without the indents;
- with no runtime cost.
One solution is to add yet another string prefix, let's say d for dedent, but as Terry and others point out, that leads to a combinatorial explosion with f-strings and r-strings already existing.
Another possibility is to make dedent a string method:
    def spam():
        text = """\
            some text
            another line
            and a third
            """.dedent()
        print(text)
and avoid the import of textwrap. However, that also imposes a runtime cost, which could be expensive if you are careless:
    for x in seq:
        for y in another_seq:
            process("""\
                some large indented string
                """.dedent())
(Note: the same applies to using textwrap.dedent.)
But we could avoid that runtime cost if the peephole optimizer performed the dedent at compile time:
    <triple-quoted string literal>.dedent()

could be optimized at compile-time, like other constant-folding.
Out of all the options, including the status quo, the one I dislike the least is the last one:

- make dedent a string method;
- recommend (but don't require) that implementations perform the dedent of string literals at compile time (failure to do so is a quality of implementation issue, not a bug);
- textwrap.dedent then becomes a thin wrapper around the string method.
On 4/1/18 4:41 AM, Michel Desmoulin wrote:
A "d" prefix to do textwrap.dedent is something I wished for a long time.
It's like the "f" one: we already can do it, but hell is it convenient to have a shortcut.

This is especially true if, like me, you take a lot of care in the error messages you give to the user. I write a LOT of them, very long, very descriptive, and I have to either import textwrap or play the concatenation game.
Having a str.dedent() method would be nice, but the d prefix has the huge advantage of being able to dedent on parsing, and hence be more performant.
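Michel's long-error-message use case looks roughly like this today (the function and message text here are made up, shown only to illustrate the textwrap import he wants to avoid):

```python
import textwrap

def require_positive(value):
    if value <= 0:
        # Dedent keeps the source block-indented while the user sees a
        # clean, flush-left message.
        raise ValueError(textwrap.dedent("""\
            Expected a positive value, got {value!r}.

            Hint: check the upstream configuration before retrying.
            """).format(value=value))
    return value
```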
On Thu, Feb 07, 2019 at 10:13:29AM -0800, Mike Miller wrote:
Was: "Dart (Swift) like multi line strings indentation" [...] Anyone still interested in this?
I am, but it will surely need a PEP. I'm not interested enough to write the PEP itself but I'm more than happy to tear it to bits^W^W^W er I mean offer constructive criticism and/or support. -- Steven
I particularly like the str.dedent() idea. Adding yet another string prefix adds more complexity to the language, which I'm generally not in favor of. On 2/7/19, Mike Miller <python-ideas@mgmiller.net> wrote:
Was: "Dart (Swift) like multi line strings indentation"
This discussion petered-out but I liked the idea, as it alleviates something
occasionally annoying.
Am supportive of the d'' prefix, perhaps the capital prefixes can be deprecated to avoid issues? If not, a sometimes-optimized (or C-accelerated) str.dedent() is acceptable too.
Anyone still interested in this?
-Mike
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Paul Ferrell pflarr@gmail.com
/me is also strongly in favor of this. +1. Even taking into consideration the added complexity.

On Fri, 8 Feb 2019 at 13:26, Paul Ferrell <pflarr@gmail.com> wrote:
I particularly like the str.dedent() idea. Adding yet another string prefix adds more complexity to the language, which I'm generally not in favor of.
-- Paul Ferrell pflarr@gmail.com
On Thu, 7 Feb 2019 at 18:21, Mike Miller <python-ideas@mgmiller.net> wrote:
Anyone still interested in this?
It feels like a nice idea to me, when reading the proposals. However, in all of the code I've ever written in Python (and that's quite a lot...) I've never actually had a case where this feature would have made a significant difference to the code I wrote. Maybe that says more about my coding style than about the usefulness of the feature, but personally I don't think it's worth it.

Paul

PS Although I could probably have said something similar about f-strings, and now I use them all the time. Language design is hard ;-)
On Sat, Feb 9, 2019 at 3:19 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Thu, 7 Feb 2019 at 18:21, Mike Miller <python-ideas@mgmiller.net> wrote:
Anyone still interested in this?
It feels like a nice idea to me, when reading the proposals. However, in all of the code I've ever written in Python (and that's quite a lot...) I've never actually had a case where this feature would have made a significant difference to the code I wrote. Maybe that says more about my coding style than about the usefulness of the feature, but personally I don't think it's worth it.
Paul
PS Although I could probably have said something similar about f-strings, and now I use them all the time. Language design is hard ;-)
Yeah, no kidding :-)

If someone wants to push this further, I'm happy to assist with the mechanics of writing up a PEP. From my memory, the leading proposals were:

1) Creating a new type of string literal which compiles to a dedented form of multiline string
2a) Adding a str.dedent() method
2b) Creating a constant-folding peephole optimization for methods on immutable literals

Either way, the definition of "dedent" would be identical to textwrap.dedent(), meaning that if 2a were to happen, that function could simply "return text.dedent()".

Who wants to champion this proposal?

ChrisA
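For reference, the textwrap.dedent() definition Chris points at boils down to stripping the longest common leading whitespace of the non-blank lines. A rough pure-Python sketch of that core (textwrap.dedent remains the authoritative behaviour, e.g. for how whitespace-only lines are normalized):

```python
import os
import textwrap

def dedent(text):
    # Strip the longest common leading-whitespace prefix of the
    # non-blank lines; blank lines are left untouched in this sketch.
    lines = text.split("\n")
    non_blank = [line for line in lines if line.strip()]
    if not non_blank:
        return text
    prefixes = [line[: len(line) - len(line.lstrip())] for line in non_blank]
    margin = os.path.commonprefix(prefixes)
    return "\n".join(
        line[len(margin):] if line.strip() else line for line in lines
    )

sample = "    some text\n    another line\n"
```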
Thanks all, I'm willing to start work on a PEP, perhaps next week. Unless Marius would prefer to do it. One fly in the ointment is that I don't feel strongly about the choice of solution 1, 2, or last-minute entry. -Mike
On Sat, Feb 9, 2019 at 6:08 AM Mike Miller <python-ideas@mgmiller.net> wrote:
Thanks all,
I'm willing to start work on a PEP, perhaps next week. Unless Marius would prefer to do it.
One fly in the ointment is that I don't feel strongly about the choice of solution 1, 2, or last-minute entry.
That's not a problem. You can write up the arguments for and against each side fairly, and let them convince you. Early phases of PEPs don't have to have all the specifics set in concrete.

Marius or Mike, whoever's going to do this: you'll want to start by reading through PEP 12, taking a copy of that file as a template.

https://www.python.org/dev/peps/pep-0012/
https://raw.githubusercontent.com/python/peps/master/pep-0012.rst

Feel free to reach out to me or any of the other PEP editors if you need a hand with the mechanics.

ChrisA
not that anyone asked, but I'd only support:
2a) Adding a str.dedent() method
and maybe:
2b) Creating a constant-folding peephole optimization for methods on immutable literals
and frankly, it's a much lighter lift to get approved than:

1) Creating a new type of string literal which compiles to a dedented form of multiline string

Also -- more useful -- dedent() is helpful for non-literals as well. Other than docstrings, the case for literals is pretty small, and docstrings are already auto-cleaned up.
in all of the code I've ever written in Python (and that's quite a lot...) I've never actually had a case where this feature would have made a significant difference to the code I wrote.
exactly -- I do have one case -- but that's for an instructional exercise where we are keeping things simple -- in operational code that string would have been in a file, or database, or ....

-CHB

--
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
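The docstring clean-up Christopher mentions is not applied to __doc__ itself; tools such as help() run docstrings through inspect.cleandoc (per PEP 257). A quick illustration (the example function is made up):

```python
import inspect

def spam():
    """Summary line.

        Indented detail line.
    """

# __doc__ keeps the raw source indentation...
raw = spam.__doc__
# ...but inspect.cleandoc strips it the way help() does.
cleaned = inspect.cleandoc(raw)
print(cleaned)
```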
On 3/31/18 5:43 PM, Steven D'Aprano wrote:
But we could avoid that runtime cost if the peephole optimizer performed the dedent at compile time:
    <triple-quoted string literal>.dedent()

could be optimized at compile-time, like other constant-folding.
There are a lot of expressions of the form constexpr.method(constexpr) that could be safely evaluated at compile time and currently aren't.

It also occurred to me that if there were a set.frozen method returning a frozenset, then (with compiler support) you could write frozenset literals without new syntax.

And if dict.get and dict.__getitem__ (including the operator syntax for the latter) made compile-time (frozen)dict literals the way "in {...}" already makes frozenset literals, it would give you a functional O(1) switch statement for some simple cases.

--
Ben
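The '"in {...}" already makes frozenset literals' behaviour Ben refers to is observable in CPython today: the compiler folds a literal set on the right of "in" into a frozenset constant, so no set is rebuilt per call. A small check (dis.dis would show the frozenset in a LOAD_CONST):

```python
import dis

def has_vowel_start(ch):
    # In CPython, this literal set is folded into a frozenset constant
    # stored in the function's code object.
    return ch in {"a", "e", "i", "o", "u"}

dis.dis(has_vowel_start)

assert any(
    isinstance(const, frozenset)
    for const in has_vowel_start.__code__.co_consts
)
```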
A "d" prefix to do textwrap.dedent is something I wished for a long time.

It's like the "f" one: we already can do it, but hell is it convenient to have a shortcut.

This is especially true if, like me, you take a lot of care in the error messages you give to the user. I write a LOT of them, very long, very descriptive, and I have to either import textwrap or play the concatenation game.

Having a str.dedent() method would be nice, but the d prefix has the huge advantage of being able to dedent on parsing, and hence be more performant.

On 31/03/2018 at 16:50, Marius Räsener wrote:
Hey List,
this is my very first approach to suggest a Python improvement I'd think worth discussing.
At some point, maybe with Dart 2.0 or a little earlier, Dart started supporting multiline strings with "proper" indentation (I tried, but I can't find the relevant docs at the moment, probably due to the rather large changes related to Dart 2.0 and outdated docs).
What I have in mind is probably best described with an Example:
    print("""
        I am a
        multiline
        String.
        """)
the closing quote defines the "margin indentation" - so in this example all lines would get reduced by their leading 4 spaces, resulting in a "clean" and unindented string.
anyways, if dart or not, doesn't matter - I like the Idea and I think python3.x could benefit from it. If that's possible at all :)
I could also imagine that this "indentation cleanup" is only applied if the closing quotes are on their own line? Might be too complicated though, I can't estimate or assess this...
thx for reading, Marius
participants (23)

- Ben Rudiak-Gould
- Brendan Barnwell
- Chris Angelico
- Christopher Barker
- David Mertz
- Eric V. Smith
- francismb
- Joao S. O. Bueno
- Marius Räsener
- Michel Desmoulin
- Mike Miller
- MRAB
- Nick Coghlan
- Paul Ferrell
- Paul Moore
- Richard Damon
- Rob Cliffe
- Robert Vanden Eynde
- Ryan Gonzalez
- Steve Barnes
- Steven D'Aprano
- Terry Reedy
- Todd