On several occasions I have run into code that will do something like the following with a multiline string:
def some_func():
x, y = process_something() val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml>
""" % (x, y)
return val
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string. I think the syntax for this could vary though. It would be something like this:
def some_func():
x, y = process_something() val = """" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """" % (x, y) return val
That way, the indentation in the function would be preserved, making everything easy to scan, and the indentation in the output would not suffer.
What do you all think?
IIRC Scala has triple-quoted strings with an added feature for this. We should ask some Scala users if they like it.
On Thu, Nov 4, 2010 at 5:21 PM, Daniel da Silva ddasilva@umd.edu wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
def some_func(): x, y = process_something()
val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> """ % (x, y)
return val
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string. I think the syntax for this could vary though. It would be something like this:
def some_func(): x, y = process_something()
val = """" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """" % (x, y)
return val
That way, the indentation in the function would be preserved, making everything easy to scan, and the indentation in the output would not suffer.
What do you all think?
Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
Daniel da Silva writes:
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
We do.
from textwrap import dedent def some_func(): x, y = process_something()
val = dedent("""\ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """) % (x, y)
return val
I don't think the function call is ugly enough to fix with syntax.
On Fri, 05 Nov 2010 10:10:14 +0900 "Stephen J. Turnbull" stephen@xemacs.org wrote:
Daniel da Silva writes:
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
We do.
from textwrap import dedent def some_func(): x, y = process_something()
val = dedent("""\ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """) % (x, y) return val
I don't think the function call is ugly enough to fix with syntax.
It's just reversing the point of view. The right thing would be to have Python code correctly aligned on indentation _by default_. And possibly provide an alternative for cases (which?) where one would want indent not taken into account.
Denis
-- -- -- -- -- -- -- vit esse estrany ☣
spir.wikidot.com
spir writes:
It's just reversing the point of view. The right thing would be to have Python code correctly aligned on indentation _by default_.
This is plausible but I don't necessarily agree that it's the right thing.
If I'm writing a program which writes structured text like XML, which is conventionally indented, I actually prefer
print("""\ <xml> A really uninteresting XML document. </xml> """)
to
print("""\ <xml> A really uninteresting XML document. </xml> """)
For one thing, in many realistic cases it's probably actually formatted like this:
print("""\ <xml> """) print("""\ A really uninteresting XML document. """) print("""\ </xml> """)
and it would be a real PITA trying to get that right.
On Fri, Nov 5, 2010 at 11:10 AM, Stephen J. Turnbull stephen@xemacs.org wrote:
> To me, this is rather ugly because it messes up the indentation of > some_func(). Suppose we could have a multiline string, that when started on > a line indented four spaces, ignores the first four spaces on each line of > the literal when creating the actual string? I don't think the function call is ugly enough to fix with syntax.
I do use the textwrap.dedent workaround myself, but I think it is sufficiently flawed for a proper fix to be worth considering:
1. It doesn't work for docstrings (as Tal pointed out) 2. It postpones until runtime an operation that could fairly easily be carried out at compile time instead
Note that a method on str objects fixes none of those problems, so isn't much of a gain from this point of view. As new string prefix would handle the task nicely, though. I personally like "d for dedent", with all d-strings (even single-quoted ones) being implicitly multiline as the colour for that particular bikeshed:
def some_func(): x, y = process_something()
val = d"\ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> ") % (x, y)
return val
I'm no more than +0 on the idea though. It strikes me as an awful lot of effort in implementing, documenting and promoting the idea for something that provides at best a minimal improvement in aesthetics and performance.
Cheers, Nick.
On 11/5/2010 10:45 AM, Nick Coghlan wrote:
I do use the textwrap.dedent workaround myself, but I think it is sufficiently flawed for a proper fix to be worth considering:
- It doesn't work for docstrings (as Tal pointed out)
This does:
def f(x): "Am I a docstring\n"\ "even though I start in pieces?\n"\ "Oh, x is a dummy param\n" pass print(f.__doc__)
# prints 3 lines, but not without '' escape at line ends
- It postpones until runtime an operation that could fairly easily be
carried out at compile time instead
Above is. Not that I like the extra fluff needed to make it work. It makes a 'd' for dedent literal prefix look inviting.
On Fri, Nov 5, 2010 at 10:37 AM, Terry Reedy tjreedy@udel.edu wrote:
On 11/5/2010 10:45 AM, Nick Coghlan wrote:
I do use the textwrap.dedent workaround myself, but I think it is sufficiently flawed for a proper fix to be worth considering:
- It doesn't work for docstrings (as Tal pointed out)
This does:
def f(x): "Am I a docstring\n"\ "even though I start in pieces?\n"\ "Oh, x is a dummy param\n" pass print(f.__doc__)
# prints 3 lines, but not without '' escape at line ends
Can the parser be changed to do automagic joining on that the same way that it auto joins ("a"
"b")? I can't imagine any scenario where anyone would rely on starting the line after a docstring with a string literal, since the only plausible run time effect it might have would be to intern some strings for later. Even "string".method() wouldn't have any effect without something like x = in front of.
-- Carl Johnson
On 11/5/10 7:48 PM, Carl M. Johnson wrote:
On Fri, Nov 5, 2010 at 10:37 AM, Terry Reedytjreedy@udel.edu wrote:
On 11/5/2010 10:45 AM, Nick Coghlan wrote:
I do use the textwrap.dedent workaround myself, but I think it is sufficiently flawed for a proper fix to be worth considering:
- It doesn't work for docstrings (as Tal pointed out)
This does:
def f(x): "Am I a docstring\n"\ "even though I start in pieces?\n"\ "Oh, x is a dummy param\n" pass print(f.__doc__)
# prints 3 lines, but not without '' escape at line ends
Can the parser be changed to do automagic joining on that the same way that it auto joins ("a"
"b")? I can't imagine any scenario where anyone would rely on starting the line after a docstring with a string literal, since the only plausible run time effect it might have would be to intern some strings for later. Even "string".method() wouldn't have any effect without something like x = in front of.
def f(x):
... ("Am I a docstring\n" ... "even though I start in pieces?\n" ... "Oh, x is a dummy param\n") ... pass ...
print(f.__doc__)
Am I a docstring even though I start in pieces? Oh, x is a dummy param
"Carl M. Johnson" cmjohnson.mailinglist@gmail.com writes:
Can the parser be changed to do automagic joining on that the same way that it auto joins ("a"
"b")?
There's no need to change the parser, when what you suggest works fine in existing Python *and* is easier to maintain without the trailing backslashes:
def f(x): ("Am I a docstring\n" "even though I start in pieces?\n" "Oh, x is a dummy param\n") pass print(f.__doc__)
But that's still crap. I think the existing situation works too: the existing docstring processors already know to handle docstrings according to URL:http://www.python.org/dev/peps/pep-0257/#id20.
Introducing more workarounds for multi-line strings, when the existing tools appear to work fine, gets a -0.5 from me.
I like the idea of have d""" as the marker, as I've often wanted this for writing out custom HTML or source code.
But, would the level of dedent be based on the level of indent of the code line containing the d""" marker? If so, would you want a parsing/syntax error if the string violated the indentation levels of the surrounding code? (the dedent would have to take place post-parsing, since the grammar knows nothing about INDENT tokens inside strings, so it's not truly a "parser" error).
The easiest would be to do the equivalent of dedent and take the longest common whitespace length. But, the purist in me would kinda hope that if, for example, the code was indented to column 8 but the text was all at column 12, that the resulting string would have 4 leading spaces on each line.
Jared
On 5 Nov 2010, at 07:45, Nick Coghlan wrote:
On Fri, Nov 5, 2010 at 11:10 AM, Stephen J. Turnbull stephen@xemacs.org wrote:
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
I don't think the function call is ugly enough to fix with syntax.
I do use the textwrap.dedent workaround myself, but I think it is sufficiently flawed for a proper fix to be worth considering:
- It doesn't work for docstrings (as Tal pointed out)
- It postpones until runtime an operation that could fairly easily be
carried out at compile time instead
Note that a method on str objects fixes none of those problems, so isn't much of a gain from this point of view. As new string prefix would handle the task nicely, though. I personally like "d for dedent", with all d-strings (even single-quoted ones) being implicitly multiline as the colour for that particular bikeshed:
def some_func(): x, y = process_something()
val = d"\
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> ") % (x, y)
return val
I'm no more than +0 on the idea though. It strikes me as an awful lot of effort in implementing, documenting and promoting the idea for something that provides at best a minimal improvement in aesthetics and performance.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On 05/11/2010 20:43, Jared Grubb wrote:
I like the idea of have d""" as the marker, as I've often wanted this for writing out custom HTML or source code.
But, would the level of dedent be based on the level of indent of the code line containing the d""" marker? If so, would you want a parsing/syntax error if the string violated the indentation levels of the surrounding code? (the dedent would have to take place post-parsing, since the grammar knows nothing about INDENT tokens inside strings, so it's not truly a "parser" error).
The easiest would be to do the equivalent of dedent and take the longest common whitespace length. But, the purist in me would kinda hope that if, for example, the code was indented to column 8 but the text was all at column 12, that the resulting string would have 4 leading spaces on each line.
[snip] You could also add an 'indent' method to strings:
text.indent(4) would indent each line by 4 spaces
text.indent(-4) would dedent each line by 4 spaces; an exception would be raised if a line started with fewer than 4 spaces, unless that line contained only spaces.
text.indent(4, '>') would indent each line by 4 * '>'.
Jared Grubb writes:
The easiest would be to do the equivalent of dedent and take the longest common whitespace length. But, the purist in me would kinda hope that if, for example, the code was indented to column 8 but the text was all at column 12, that the resulting string would have 4 leading spaces on each line.
-1
You could probably come up with rules to get this right for the parser, but how can it possibly guess exactly what any given user means by
def amethod (self): self.bmethod_has_fleece_as_white_as_snow("""\ Mary had a little lamb little lamb little lamb The lamb may be a little small But this poem is just plain lame.""")
Is that supposed to be flush left, indented 4 spaces, indented 8 spaces, or indented 12 spaces?
I think the dedent rule (longest common whitespace prefix) makes the most sense.
I would never use a quadruple quote for this if it were ever implemented. A better precedent would be a letter prefix to the quotes similar to what we do for raw strings and bytes constants today. m""" perhaps.
I'm -0.5 on this.
I don't think it is a necessary feature thanks to the textwrap.dedent. But the reason I'm not -1 is that using dedent incurs runtime cost to process the larger string constant to generate the desired one. The best way around this is to declare your large string constants at the module level where you don't need to worry about dedent at all.
-gps
On Thu, Nov 4, 2010 at 6:10 PM, Stephen J. Turnbull stephen@xemacs.orgwrote:
Daniel da Silva writes:
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started
on
a line indented four spaces, ignores the first four spaces on each line
of
the literal when creating the actual string?
We do.
from textwrap import dedent def some_func(): x, y = process_something()
val = dedent("""\ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> """) % (x, y)
return val
I don't think the function call is ugly enough to fix with syntax. _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On Nov 4, 2010, at 5:21 PM, Daniel da Silva wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
def some_func(): x, y = process_something()
val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """ % (x, y)
return val
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string. I think the syntax for this could vary though. It would be something like this:
def some_func(): x, y = process_something()
val = """" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """" % (x, y) return val
That way, the indentation in the function would be preserved, making everything easy to scan, and the indentation in the output would not suffer.
Why not use textwrap.dedent() to remove the common, leading whitespace:
return dedent(val)
Raymond
Daniel da Silva ddasilva@umd.edu writes:
On several occasions I have run into code that will do something like the following with a multiline string:
def some_func():
x, y = process_something() val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> > """ % (x, y) > > return val >
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
The standard library time machine to the rescue::
import textwrap
def some_func(): x, y = process_something()
val = textwrap.dedent(""" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> """) % (x, y)
return val
I use this technique very often in my code. I'm surprised that it doesn't have a wider share of Python programmers. Spread the knowledge!
Daniel da Silva wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
[...]
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string.
Please no. Three quotes is large enough. Also, four quotes currently is legal: it is a triple-quoted string that begins with a quotation mark. You would be changing that behaviour and likely breaking code.
I don't think we need syntax for this, but if we do, I'd prefer to add a prefix similar to the r"" or u"" syntax. Perhaps w"" to normalise whitespace?
But as I said, I don't think we need syntax for this. I'd be happy if textwrap.dedent() became a built-in string method.
def some_func(): x, y = process_something() val = """ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """.dedent() % (x, y) return val
Thanks for all of your replies. I did not know about the dedent() function. I will use it for now on. If anyone else desires to push this, feel free, but I am satisfied.
Daniel
On Thu, Nov 4, 2010 at 9:23 PM, Steven D'Aprano steve@pearwood.info wrote:
Daniel da Silva wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
[...]
To me, this is rather ugly because it messes up the indentation of
some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string.
Please no. Three quotes is large enough. Also, four quotes currently is legal: it is a triple-quoted string that begins with a quotation mark. You would be changing that behaviour and likely breaking code.
I don't think we need syntax for this, but if we do, I'd prefer to add a prefix similar to the r"" or u"" syntax. Perhaps w"" to normalise whitespace?
But as I said, I don't think we need syntax for this. I'd be happy if textwrap.dedent() became a built-in string method.
def some_func(): x, y = process_something() val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """.dedent() % (x, y) return val
-- Steven
Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On Fri, 05 Nov 2010 12:23:09 +1100 Steven D'Aprano steve@pearwood.info wrote:
Daniel da Silva wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
[...]
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string.
Please no. Three quotes is large enough. Also, four quotes currently is legal: it is a triple-quoted string that begins with a quotation mark. You would be changing that behaviour and likely breaking code.
I don't think we need syntax for this, but if we do, I'd prefer to add a prefix similar to the r"" or u"" syntax. Perhaps w"" to normalise whitespace?
But as I said, I don't think we need syntax for this. I'd be happy if textwrap.dedent() became a built-in string method.
def some_func(): x, y = process_something() val = """ <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """.dedent() % (x, y) return val
Yes, that would be a good workaround. (Does dedent() base it's behaviour on first (non-empty) line?)
Denis -- -- -- -- -- -- -- vit esse estrany ☣
spir.wikidot.com
spir writes:
(Does dedent() base it's behaviour on first (non-empty) line?)
No, it chooses the longest common whitespace prefix in the whole string.
I'm not sure what it does with blank lines. A quick test ... in Python 2.5.5, it ignores them for the purpose of determining the dedent. That is, you can have any amount of whitespace on a blank line, but it doesn't affect the dedent computed.
On Thu, 4 Nov 2010 20:21:37 -0400 Daniel da Silva ddasilva@umd.edu wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
def some_func():
x, y = process_something() val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> > """ % (x, y) > > return val >
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string. I think the syntax for this could vary though. It would be something like this:
def some_func():
x, y = process_something() val = """" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """" % (x, y) return val
That way, the indentation in the function would be preserved, making everything easy to scan, and the indentation in the output would not suffer.
What do you all think?
I'm +++ for this. Even more in python because one has no choice about indentation. But I posted the same proposal some time ago (together with a note that there is no need for """...""" around multiline strings), and it had no apparent success.
Denis -- -- -- -- -- -- -- vit esse estrany ☣
spir.wikidot.com
Daniel da Silva wrote:
On several occasions I have run into code that will do something like the following with a multiline string:
def some_func(): x, y = process_something()
val = """
<xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing>
</xml> """ % (x, y)
return val
To me, this is rather ugly because it messes up the indentation of some_func(). Suppose we could have a multiline string, that when started on a line indented four spaces, ignores the first four spaces on each line of the literal when creating the actual string?
In this example, I will use four quotes to start such a string. I think the syntax for this could vary though. It would be something like this:
def some_func(): x, y = process_something()
val = """" <xml> <myThing> <val>%s</val> <otherVal>%s</otherVal> </myThing> </xml> """" % (x, y)
return val
That way, the indentation in the function would be preserved, making everything easy to scan, and the indentation in the output would not suffer.
What do you all think?
+1 since this would mean Python would also dedent multi-line doc-strings automatically. I really have having doc-strings indented according to their indentation in the code.
- Tal