in Lua 5.2+, there's this string escape that allows you to put "whitespace" (in particular, including newlines) in a string literal, by skipping them entirely. now, unlike lua's long strings, python *does* have escapes in long strings. however, sometimes you have help text:
    "switches to toml config format. the old 'repos' " #cont
    "table is preserved as 'repos_old'"
and... well yeah you can see what I did to make it work. if I used a long string here, I'd get a newline and a bunch of indentation in the middle. so I propose a \z string escape which lets me write the above as shown below:
    """switches to toml config format. the old 'repos' \z
    table is preserved as 'repos_old'"""
(side note to avoid unnecessary comments: this help text refers to database tables and migrations. it's not about lua tables.)
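The problem described above can be reproduced in a few lines. This is a minimal sketch, not code from the original post; the two function names are made up for illustration.

```python
def help_with_long_string():
    # A triple-quoted literal keeps the newline and the indentation
    # of the continuation line inside the string.
    return """switches to toml config format. the old 'repos'
        table is preserved as 'repos_old'"""


def help_with_adjacent_literals():
    # Adjacent string literals are joined at compile time, so the
    # source can be wrapped without changing the resulting string.
    return ("switches to toml config format. the old 'repos' "
            "table is preserved as 'repos_old'")


print(repr(help_with_long_string()))
print(repr(help_with_adjacent_literals()))
```

The first result contains a newline followed by eight spaces in the middle of the sentence; the second is a single unbroken line.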
Soni L. writes:
so I propose a \z string escape which lets me write the above as shown below:
    """switches to toml config format. the old 'repos' \z
    table is preserved as 'repos_old'"""
We already have that, if you don't care about left-alignment:
    >>> """123456789\
    ... abcdefghi"""
    '123456789abcdefghi'
So '\z' would only give us removal of indentation whitespace from the string. Too much magic, and it has the same problem that trailing '\' has: most people will have to intentionally look for it to see it (though it's easier to spot than trailing '\' for me).
That's a -1 for me.
Steve
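For readers who have not used the trailing backslash inside a string literal, the following sketch shows what it does and where the left-alignment caveat comes from. The function names are illustrative only.

```python
def left_aligned():
    # Backslash-newline is dropped from the literal, so the two source
    # lines join seamlessly, but only if the second line starts at column 0.
    return """123456789\
abcdefghi"""


def indented():
    # If the continuation line is indented to match the code, that
    # indentation becomes part of the string.
    return """123456789\
    abcdefghi"""


print(repr(left_aligned()))
print(repr(indented()))
```

The second function is the case the proposed '\z' is meant to address: the newline is gone, but the four spaces of indentation remain in the string.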
Hello,
On Tue, 16 Jun 2020 14:37:39 +0900 "Stephen J. Turnbull" <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Soni L. writes:
so I propose a \z string escape which lets me write the above as shown below:
    """switches to toml config format. the old 'repos' \z
    table is preserved as 'repos_old'"""
We already have that, if you don't care about left-alignment:
    >>> """123456789\
    ... abcdefghi"""
    '123456789abcdefghi'
And if you care about left-alignment, but don't care about extra function call, we have https://docs.python.org/3/library/textwrap.html#textwrap.dedent
That said, Java (version 13) puts Python to shame with its multi-line strings, which work "as expected" re: leading indentation out of the box: https://openjdk.java.net/jeps/355 .
So, I wouldn't "boo, hiss" someone proposing something like:
    s = _"""
        Just imagine, this works like you would expect!
        """
But my response would be my usual - what Python actually needs is macro/AST preprocessing capability, e.g. support for handling '<any_char>"""' strings in user-defined manner. But we can ship some predefined macros, sure. E.g. '_' as a string prefix (like above) would run a string thru (analog of) textwrap.dedent().
--
Best regards,
Paul
mailto:pmiscml@gmail.com
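As a point of comparison, this is roughly what the existing textwrap.dedent() route looks like today. It is a small sketch; the function name `usage` and the exact line breaks of the text are assumptions made for the example.

```python
import textwrap


def usage():
    # The leading backslash suppresses the first newline; dedent() then
    # removes the whitespace common to the start of every line.
    return textwrap.dedent("""\
        Just imagine, this
        works like you would expect!
        """)


print(usage())
```

The cost is one function call at run time each time `usage()` is invoked, which is the "extra function call" mentioned above.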
On 16.06.20 08:40, Paul Sokolovsky wrote:
Hello,
On Tue, 16 Jun 2020 14:37:39 +0900 "Stephen J. Turnbull" <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Soni L. writes:
so I propose a \z string escape which lets me write the above as shown below:
    """switches to toml config format. the old 'repos' \z
    table is preserved as 'repos_old'"""
We already have that, if you don't care about left-alignment:
    >>> """123456789\
    ... abcdefghi"""
    '123456789abcdefghi'
And if you care about left-alignment, but don't care about extra function call, we have https://docs.python.org/3/library/textwrap.html#textwrap.dedent
That said, Java (version 13) puts Python to shame with its multi-line strings, which work "as expected" re: leading indentation out of the box: https://openjdk.java.net/jeps/355 .
So, I wouldn't "boo, hiss" someone proposing something like:
    s = _"""
        Just imagine, this works like you would expect!
        """
But my response would be my usual - what Python actually needs is macro/AST preprocessing capability, e.g. support for handling '<any_char>"""' strings in user-defined manner. But we can ship some predefined macros, sure. E.g. '_' as a string prefix (like above) would run a string thru (analog of) textwrap.dedent().
Alternatively the compiler could run such pure functions if their arguments consist solely of literals (i.e. perform advanced constant folding). This means the following
    msg = textwrap.dedent('''abc
        def
        ghi''')
would be converted already at compile-time. Surely it would be convenient to maintain a list of compatible functions, limited to the builtins and stdlib.
Another common use case is `str.split`, e.g. `colors = "red green blue".split()`. This could also be converted at compile-time.
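The idea of folding such calls ahead of time can be prototyped outside the compiler with the standard `ast` module. The following is only a sketch under stated assumptions: the class `FoldLiteralCalls` and the helpers are hypothetical, only two call shapes are recognised, and it simply trusts that the name `textwrap` refers to the standard-library module.

```python
import ast
import textwrap


def _is_str_literal(node):
    return isinstance(node, ast.Constant) and isinstance(node.value, str)


class FoldLiteralCalls(ast.NodeTransformer):
    """Hypothetical pass: pre-compute a few pure calls on string literals."""

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        if not isinstance(func, ast.Attribute) or node.keywords:
            return node

        # textwrap.dedent("<literal>"), assuming the name `textwrap`
        # really refers to the standard-library module.
        if (isinstance(func.value, ast.Name)
                and func.value.id == "textwrap"
                and func.attr == "dedent"
                and len(node.args) == 1
                and _is_str_literal(node.args[0])):
            folded = ast.Constant(textwrap.dedent(node.args[0].value))
            return ast.copy_location(folded, node)

        # "<literal>".split() with no arguments.
        if (_is_str_literal(func.value)
                and func.attr == "split"
                and not node.args):
            words = [ast.Constant(w) for w in func.value.value.split()]
            folded = ast.List(elts=words, ctx=ast.Load())
            return ast.copy_location(folded, node)

        return node


def compile_folded(source):
    tree = FoldLiteralCalls().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, "<folded>", "exec")


SOURCE = '''
colors = "red green blue".split()
msg = textwrap.dedent("""
    abc
    def
    """)
'''

namespace = {}
# SOURCE never imports textwrap: both calls are gone before execution.
exec(compile_folded(SOURCE), namespace)
print(namespace["colors"], repr(namespace["msg"]))
```

A real implementation would need a reliable way to know that a name has not been rebound, which is the hard part that the "list of compatible functions" alone does not solve.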
On 2020-06-16 2:37 a.m., Stephen J. Turnbull wrote:
Soni L. writes:
so I propose a \z string escape which lets me write the above as shown below:
    """switches to toml config format. the old 'repos' \z
    table is preserved as 'repos_old'"""
We already have that, if you don't care about left-alignment:
    >>> """123456789\
    ... abcdefghi"""
    '123456789abcdefghi'
So '\z' would only give us removal of indentation whitespace from the string. Too much magic, and it has the same problem that trailing '\' has: most people will have to intentionally look for it to see it (though it's easier to spot than trailing '\' for me). That's a -1 for me.
Explicit is better than implicit. and \z is more explicit than #cont
Steve
Soni L. writes:
Explicit is better than implicit.
and \z is more explicit than #cont
It's not more explicit. Line continuation in a string literal is perfectly explicit and easy to explain *exactly*: "omit the '\' and the following newline from the string being constructed". You could argue that '\z' is more easily visible ("readable" if you like), but not that it's more explicit.
The simplest form of '\z' is "almost easy" to explain exactly ("omit the following newline and all leading whitespace on the next line"). Complication enters when we ask, "but what happens if the character following '\z' is not a newline?"
But more generally, string dedenting is hard to explain with much precision, let alone exactly, except in (pseudo)code. '\z' would be even more complicated than textwrap.dedent, because it would apply line by line. Would the indentation that is stripped be string-wide (so that '\z' could work like RFC 822 folding with the intended whitespace at the *left* margin, where it's easy to see), or would each '\z' just strip all the indentation on the next line? Maybe '\z' should replace all the surrounding whitespace with a single space?
I kinda like the string-wide idea, but
    s = textwrap.dedent("""\
        This\
         is\
         a\
         silly\
         example\
         of\
         a\
         dedented\
         and\
         folded\
         string.\
        """)
    assert s == "This is a silly example of a dedented and unfolded string."
works for me, specifically in real examples where *occasionally* for some reason I want to fold a long line. That is, although the trailing '\' at the right margin is not so easy to see, the extra space at the left margin of the next line is easy to see (at least with a reasonable font).
Of course,
    s = "This" " is" " a" ...
works even better when what you want is an unfolded one-line string.
Steve
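One detail worth checking when combining the two techniques: backslash-newline is resolved when the literal is parsed, before textwrap.dedent() ever sees the string. The sketch below assumes continuation lines indented one space deeper than the first line; with a different layout the spacing in the result changes accordingly.

```python
import textwrap

# Backslash-newline is removed when the literal is parsed, so dedent()
# receives a single line and only strips that line's leading margin.
folded = textwrap.dedent("""\
    This\
     is\
     folded.""")

# Adjacent literals give full control over the spaces between words.
joined = ("This"
          " is"
          " folded.")

print(repr(folded))
print(repr(joined))
```

Under that layout the indentation of each continuation line stays inside the string, so `folded` has runs of five spaces between words, while `joined` is single-spaced.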
On 2020-06-17 7:44 a.m., Stephen J. Turnbull wrote:
Soni L. writes:
Explicit is better than implicit.
and \z is more explicit than #cont
It's not more explicit. Line continuation in a string literal is perfectly explicit and easy to explain *exactly*: "omit the '\' and the following newline from the string being constructed". You could argue that '\z' is more easily visible ("readable" if you like), but not that it's more explicit.
The simplest form of '\z' is "almost easy" to explain exactly ("omit the following newline and all leading whitespace on the next line"). Complication enters when we ask, "but what happens if the character following '\z' is not a newline?"
But more generally, string dedenting is hard to explain with much precision, let alone exactly, except in (pseudo)code. '\z' would be even more complicated than textwrap.dedent, because it would apply line by line. Would the indentation that is stripped be string-wide (so that '\z' could work like RFC 822 folding with the intended whitespace at the *left* margin, where it's easy to see), or would each '\z' just strip all the indentation on the next line? Maybe '\z' should replace all the surrounding whitespace with a single space?
Read the Lua docs. It makes it very simple:
"The escape sequence '\z' skips the following span of white-space characters, including line breaks; it is particularly useful to break and indent a long literal string into multiple lines without adding the newlines and spaces into the string contents."
Note that, in Lua, you can "break" the \z by inserting an \x20. This is because it only cares about literal whitespace.
    "foo\z \x20bar"
    foo bar
which is particularly useful for inserting indents into the string while also using \z.
Oh, also, it matches zero-or-more, altho I guess an error would also be compliant with what's written in the manual. (And yes, Lua does leave stuff like this "undefined" all the time. We might want to explicitly specify "[...] span of zero-or-more [...]" tho.)
This also gives us two ways of doing indented strings (in Lua):
    local ugly = "foo\n \z
                  bar\n \z
                  baz"
    local nicer = "foo\n\z
                   \x20 bar\n\z
                   \x20 baz"
Obviously textwrap.dedent is nicer but again I don't propose \z for doing this, but rather for breaking up too long strings into multiple lines. The only point is to break up too long strings into multiple lines without accidentally turning them into (part of) a tuple. Which is why I put that #cont there.
Also, that example with #cont? It's inside a tuple. Someone looking at it would see the tuple and the newline and the "missing" comma and would put a comma there so I need to put #cont there and having \z would be a better way to solve that.
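The Lua rule quoted above (skip the following span of zero or more whitespace characters, including line breaks) can be approximated in today's Python with a regular expression over a raw string. This is a hypothetical helper for experimentation, not an existing feature: `apply_z` is an invented name, no other escapes are processed because the input is raw, and an escaped backslash before `z` is not treated specially.

```python
import re

# Hypothetical helper, not a real Python feature: emulate Lua's \z on a
# raw string by deleting each \z together with the whitespace after it.
_Z_ESCAPE = re.compile(r"\\z\s*")


def apply_z(raw_text):
    return _Z_ESCAPE.sub("", raw_text)


HELP = apply_z(r"""switches to toml config format. the old 'repos' \z
                   table is preserved as 'repos_old'""")

print(HELP)
```

Because the pattern uses `\s*`, a `\z` that is not followed by any whitespace is simply removed, which matches the zero-or-more reading discussed above.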
I kinda like the string-wide idea, but
    s = textwrap.dedent("""\
        This\
         is\
         a\
         silly\
         example\
         of\
         a\
         dedented\
         and\
         folded\
         string.\
        """)
    assert s == "This is a silly example of a dedented and unfolded string."
works for me, specifically in real examples where *occasionally* for some reason I want to fold a long line. That is, although the trailing '\' at the right margin is not so easy to see, the extra space at the left margin of the next line is easy to see (at least with a reasonable font).
Of course,
s = "This" " is" " a" ...
works even better when what you want is an unfolded one-line string.
Steve
On Wed, Jun 17, 2020 at 6:09 AM Soni L. <fakedme+py@gmail.com> wrote:
This also gives us two ways of doing indented strings (in Lua):
    local ugly = "foo\n \z
                  bar\n \z
                  baz"
    local nicer = "foo\n\z
                   \x20 bar\n\z
                   \x20 baz"
I'm having a really hard time seeing why that is either more readable, or easier to type than:
    nicest = ("foo"
              " bar"
              " baz"
              )
(my preferred way these days)
or
    nicest = \
    """foo
     bar
     baz"""
-CHB
-- Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2020-06-17 12:46 p.m., Christopher Barker wrote:
On Wed, Jun 17, 2020 at 6:09 AM Soni L. <fakedme+py@gmail.com <mailto:fakedme%2Bpy@gmail.com>> wrote:
This also gives us two ways of doing indented strings (in Lua):
    local ugly = "foo\n \z
                  bar\n \z
                  baz"
    local nicer = "foo\n\z
                   \x20 bar\n\z
                   \x20 baz"
I'm having a really hard time seeing why that is either more readable, or easier to type than:
    nicest = ("foo"
              " bar"
              " baz"
              )
(my preferred way these days)
I think you missed some \n's here.
or
    nicest = \
    """foo
     bar
     baz"""
this is ugly.
what's wrong with textwrap.dedent?
however, I bring up again the original use-case which has nothing to do with textwrap.dedent, or nested indentation. But consider this artificial example:
    foo = textwrap.dedent("""
        This is the help page for foo, a command \z
        with the following subcommands:
          bar - A very useful subcommand of foo \z
                and probably the subcommand you'll \z
                be using the most.
          baz - A simple maintenance command \z
                that you may need to use sometimes.
        """)
Currently you'd have to write it as:
    foo = (
        "This is the help page for foo, a command "
        "with the following subcommands:\n"
        " bar - A very useful subcommand of foo "
        "and probably the subcommand you'll "
        "be using the most.\n"
        " baz - A simple maintenance command "
        "that you may need to use sometimes."
    )
or if you want it to look "nicer", and don't mind linters shouting at you:
    foo = (
        "This is the help page for foo, a command "
        "with the following subcommands:\n"
        " bar - A very useful subcommand of foo "
               "and probably the subcommand you'll "
               "be using the most.\n"
        " baz - A simple maintenance command "
               "that you may need to use sometimes."
    )
And I think this sucks.
On Wed, Jun 17, 2020 at 8:58 AM Soni L. <fakedme+py@gmail.com> wrote:
I'm having a really hard time seeing why that is either more readable, or easier to type than:
    nicest = ("foo"
              " bar"
              " baz"
              )
(my preferred way these days)
I think you missed some \n's here.
indeed I did:
    nicest = ("foo\n"
              " bar\n"
              " baz\n"
              )
which I agree is not as nice, but still just as good as the proposal.
    nicest = \
    """foo
     bar
     baz"""
this is ugly.
I agree, and don't usually do that -- it depends a lot on the level of indentation I'm working in -- I would only do that at the top level.
what's wrong with textwrap.dedent?
nothing -- but I left that out because it's a function call -- nothing to do with Python syntax, etc.
Though now that you mention it, I really don't like the \z idea -- I just don't see the point. But a simple way to call (and pre-process) dedent would be nice:
    nicest = d"""foo
                  bar
                  baz"""
I believe that's been proposed on this list before -- not sure if it petered out, or was rejected.
however, I bring up again the original use-case which has nothing to do with textwrap.dedent, or nested indentation. But consider this artificial example:
    foo = textwrap.dedent("""
        This is the help page for foo, a command \z
        with the following subcommands:
          bar - A very useful subcommand of foo \z
                and probably the subcommand you'll \z
                be using the most.
          baz - A simple maintenance command \z
                that you may need to use sometimes.
        """)
Currently you'd have to write it as:
    foo = (
        "This is the help page for foo, a command "
        "with the following subcommands:\n"
        " bar - A very useful subcommand of foo "
        "and probably the subcommand you'll "
        "be using the most.\n"
        " baz - A simple maintenance command "
        "that you may need to use sometimes."
    )
or if you want it to look "nicer", and don't mind linters shouting at you:
    foo = (
        "This is the help page for foo, a command "
        "with the following subcommands:\n"
        " bar - A very useful subcommand of foo "
               "and probably the subcommand you'll "
               "be using the most.\n"
        " baz - A simple maintenance command "
               "that you may need to use sometimes."
    )
And I think this sucks.
less than ideal, yes -- but please post the \z version -- is it any better?
BTW, if I didn't mind linters yelling at me (and I don't), I'd do that as:
    foo = (
        "This is the help page for foo, a command with the following subcommands:\n"
        " bar - A very useful subcommand of foo and probably the subcommand you'll be using the most.\n"
        " baz - A simple maintenance command that you may need to use sometimes."
    )
Which does not suck as much. And why that has nothing to do with textwrap.dedent() I don't know -- that's pretty much EXACTLY what textwrap.dedent() is for.
I'd also add that large blocks of text really don't belong inline as big literals -- if I had a use for that (and I do, for, e.g. help for command line programs) I'd put it somewhere else: either an external text file, or as literals at the top level of a module, where ordinary triple quoted strings are easy:
    FOO_HELP = """
    This is the help page for foo, a command
    with the following subcommands:
      bar - A very useful subcommand of foo
            and probably the subcommand you'll
            be using the most.
      baz - A simple maintenance command
            that you may need to use sometimes.
    """
which is totally readable to me.
-CHB
-- Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Jun 17, 2020 at 09:23:04AM -0700, Christopher Barker wrote:
Though now that you mention it, I really don't like the \z idea -- I just don't see the point. But a simple way to call (and pre-process) dedent would be nice:
    nicest = d"""foo
                  bar
                  baz"""
I believe that's been proposed on this list before -- not sure if it petered out, or was rejected.
There's an enhancement on b.p.o to make dedent a string method:
https://bugs.python.org/issue36906
Aside from being more convenient to use, it would then allow the peep-hole optimizer to apply the text dedent at compile-time.
At the moment \z resolves to a literal backslash followed by a z. So this is a backwards-compatibility breaking change.
    # Python 3.8
    py> print("abcd\z xy")
    <stdin>:1: SyntaxWarning: invalid escape sequence \z
    abcd\z xy
As you can tell from the SyntaxWarning, there is a plan to eventually make unrecognised escapes an error. Once that occurs, we can start proposing new escapes, but until then, I think any proposal for a new escape sequence is dead in the water.
--
Steven
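The current behaviour can be observed programmatically by compiling a snippet and recording the warning. This is a minimal sketch; depending on the Python version the recorded category is a DeprecationWarning or a SyntaxWarning, and a future version that turns the warning into an error would make `compile()` raise instead.

```python
import warnings

# Raw string, so this module itself contains no unrecognised escape.
SOURCE = r'text = "abcd\z xy"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Compiling is where the unrecognised escape is reported.
    code = compile(SOURCE, "<demo>", "exec")

namespace = {}
exec(code, namespace)

print(repr(namespace["text"]))
print([w.category.__name__ for w in caught])
```

The resulting string still contains the backslash and the `z`, which is why giving `\z` a new meaning would change the value of existing literals.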
Soni L. writes:
however, I bring up again the original use-case which has nothing to do with textwrap.dedent, or nested indentation.
Currently you'd have to write it as:
    foo = (
        "This is the help page for foo, a command "
        "with the following subcommands:\n"
        " bar - A very useful subcommand of foo "
        "and probably the subcommand you'll "
        "be using the most.\n"
        " baz - A simple maintenance command "
        "that you may need to use sometimes."
    )
Which isn't at all bad, though I would write it:
    foo = (
        "This is the help page for foo, a command"
        " with the following subcommands:"
        "\n bar - A very useful subcommand of foo"
        " and probably the subcommand you'll"
        " be using the most."
        "\n baz - A simple maintenance command"
        " that you may need to use sometimes."
    )
It's a little less beautiful, I guess, but it's the most readable and proofreadable of all the idioms to my eye. But then, I read a *lot* of RFC 822 headers. :-)
And I think this sucks.
This is not a hill I'd be willing to die on, to be honest. All of the idioms are more or less ugly, and '\z' is not an improvement on most of them.
Aside to Chris: d"" or similar to implicitly invoke textwrap.dedent has been brought up at least once in the past (sorry, no cite offhand), and it got the reply you would expect: YAGNI since textwrap.dedent already exists, and if you want a prefix that's less heavy, "from textwrap import dedent as _" (or other short identifier) is your friend.
participants (6)
- Christopher Barker
- Dominik Vilsmeier
- Paul Sokolovsky
- Soni L.
- Stephen J. Turnbull
- Steven D'Aprano