
I heard the call for the P3K PEP April deadline, so I thought I'd better get this sent off! When I was first exposed to Python, I was delighted that I could do the following:
My proposal for Python3K is to allow string concatenation via juxtaposition between string literals, string variables and expressions that evaluate to strings. Juxtaposition has some precedent in Python (the example above) and also in the awk programming language. If anyone agrees that this is a good idea, then I'd be happy to write up a PEP explaining why I think that implicit string concatenation is better than overloading the plus operator (which this proposal wouldn't deprecate) and more elegant than template strings or string interpolation.
Eoghan
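A minimal sketch of the behaviour being described, assuming it refers to Python's existing concatenation of adjacent string literals. The variable names are made up for illustration and are not from the original message:

```python
# Adjacent string literals are joined into a single string at compile time.
greeting = "Hello, " "world"

# The same juxtaposition is not accepted for names or other expressions:
# a literal followed by a variable is a syntax error in current Python,
# which is the restriction the proposal would lift.
name = "world"
try:
    compile('"Hello, " name', "<example>", "eval")
    juxtaposition_allowed = True
except SyntaxError:
    juxtaposition_allowed = False

# The explicit spelling that works for any string-valued expression.
explicit = "Hello, " + name
```

Only the literal-next-to-literal form works today; everything else needs an explicit operator or formatting call.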

Eoghan Murray wrote:
No, please! The concatenation of string literals is done in the parser. Your proposal would move that to runtime and introduce a "whitespace operator". How would you spell that? How would you overload it? etc.
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
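The point that literal concatenation happens before runtime can be checked with the standard `ast` module. This is only a sketch and assumes a modern CPython (3.8 or later, where string literals appear as `ast.Constant` nodes), which postdates this thread:

```python
import ast

# Adjacent literals: the parser hands back a single constant node.
implicit = ast.parse('"foo" "bar"', mode="eval").body

# Explicit "+": a binary-operation node whose operands are two constants.
explicit = ast.parse('"foo" + "bar"', mode="eval").body

print(type(implicit).__name__, repr(implicit.value))
print(type(explicit).__name__, type(explicit.op).__name__)
```

The first expression never involves an operator at all, which is why there is currently nothing to spell or overload.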

Collin Winter wrote:
Thinking in that direction, NO-BREAK SPACE would be a perfect choice for an operator!
Georg

Hi guys, Thanks for your replies:
On 4/11/07, Georg Brandl <g.brandl@gmx.net> wrote:
[snip]
> No, please! The concatenation of string literals is done in the parser.
> Your proposal would move that to runtime
[snip...] An implementation detail? [...snip]
> and introduce a "whitespace operator".
> How would you spell that? How would you overload it? etc.
This is exactly what I'm proposing. You could spell it __juxta__ (short for juxtaposition) or __concat__, and overload it as usual :-)
On 11/04/07, Collin Winter <collinw@gmail.com> wrote:
> A single-width whitespace operator would just be confusing since PEP
> 3117 will be using zero-width spaces for the None typedef : )
3117 looks cool, but it is in draft stages so needn't factor.
Anyone with any positive reactions?
Eoghan

On 4/11/07, Eoghan Murray <eoghan@qatano.com> wrote:
> This is exactly what I'm proposing. You could spell it __juxta__ short for juxtaposition or __concat__, and overload it as usual :-)
And if __juxta__ is not defined, it should fall back first on __call__, then __mul__, then __add__. If it binds right-to-left, you could write things like
    from math import *
    print (2 sin x + cos x)
We might as well make newlines an operator at the same time. There's precedent for this in Haskell, and good synergy--adding the STM monad to Python would solve the GIL problem. You could spell that operator __bind__ or just __>>=__, take your pick. And I think Guido already committed to ripping out the @decorator syntax in Py3k in favor of comment overloading, via __rem__().
Just kidding, of course...
> Anyone with any positive reactions?
Eoghan, thanks for taking the time to write. I don't think anyone likes the idea, though. It causes many grammatical problems: should a[0] parse as a.__getitem__(0) or a.__juxta__([0])? What about (foo)(bar)? And while "sin x" would of course mean sin.__juxta__(x), "sin -x" would parse as "sin - x", or sin.__sub__(x). A few extra + signs are a small price to pay.
-j
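The ambiguities listed above can be made concrete by asking the current parser what it does with each expression. A small sketch using the standard `ast` module; `node_kind` is a helper name invented here:

```python
import ast

def node_kind(source):
    """Return the name of the top-level AST node for an expression."""
    return type(ast.parse(source, mode="eval").body).__name__

# Each of these already has a meaning that a juxtaposition operator
# would have to compete with.
kinds = {src: node_kind(src) for src in ("a[0]", "(foo)(bar)", "sin -x", "sin(-x)")}
for src, kind in kinds.items():
    print(f"{src!r:14} -> {kind}")
```

In particular "sin -x" is already a subtraction, so a juxtaposition rule could never turn it into a call with a negated argument.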

Eoghan Murray wrote:
A rather involved "detail".
This is a joke, isn't it? You're a bit late...
Georg

On 11 Apr 2007, at 11.01, Collin Winter wrote:
I propose we use the ASCII character 0x07 (BEL) as the concatenation operator. It's invisible, so your code still looks nice and clean, but you know it's there because your text editor will beep at you every time you pass it. :) (Speaking of PEP 3117, I will fight it to the death unless the typedef for Exception is changed to Unicode character 2620 (SKULL AND CROSSBONES) or 2623 (BIOHAZARD SIGN). Brilliant choice for frozenset, though. No longer need I wonder why the Unicode Consortium saw fit to include a snowman character!)

On 11/04/07, Adam Atlas <adam@atlas.st> wrote:
LOL, I'll reply to the funniest put down! The rationale for this is that Python should have one definitive way of concatenating strings. I dislike '+' as a string concatenation operator as I think overloading the meaning of '+' for non-numbers is ugly, and I dislike '%s' string formatting as it perpetuates perhaps obscure C syntax, as well as shunting the variables to the end of the line - hard for a human to parse. Given that __juxta__ isn't going to fly, +1 for complete removal of implicit string concatenation in Py3k Eoghan

On 18 Apr 2007, at 18.43, Jan Claeys wrote:
Heh, yeah, I actually realized immediately after I sent that email that the exact same thing could be said about +. But I don't know... even if + might be confused with an arithmetic operator sometimes, it's what people are used to, and I think it makes sense intuitively. 'Plus', in a very abstract sense, suggests 'put two things together', whether with numbers or strings or anything else for which we have a concept of 'putting together'. ~ doesn't have that advantage. If a programmer coming from pretty much any language sees "foo"+"bar", they're probably going to be able to guess that it's concatenation. If they see "foo"~"bar", it is really not immediately clear what it's doing.

Georg Brandl wrote:
Your proposal would move that to runtime and introduce a "whitespace operator". How would you spell that? How would you overload it? etc.
Using the ____() method, obviously. :-) But seriously, there is no way this is going to fly. Python is not Perl or awk (or SNOBOL). -- Greg

On 4/11/07, Eoghan Murray <eoghan@qatano.com> wrote:
I would support a proposal to remove the implicit concatenation entirely. I suspect it would be shot down for backwards compatibility (even in Py3K), but from a readability standpoint ... I have never seen a string concatenation that would look worse because of a "+". I *have* seen some bugs where a comma was forgotten, and two arguments got invisibly jammed together. That's a pain to debug in C; in Python with default values, the interpreter may not even gripe sensibly.
-jJ

On 4/11/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Oh. I just realized this happens a lot out here. Where I work, we use scons, and each SConscript has a long list of filenames:
    sourceFiles = [
        'foo.c',
        'bar.c',
        #...many lines omitted...
        'q1000x.c']
It's a common mistake to leave off a comma, and then scons complains that it can't find 'foo.cbar.c'. This is pretty bewildering behavior even if you *are* a Python programmer, and not everyone here is.
-j
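The failure mode described above is easy to reproduce. A minimal sketch, with the list shortened and the variable renamed for illustration:

```python
# One comma is missing after 'foo.c'. Python raises no error: the two
# adjacent literals are silently merged into a single list element.
source_files = [
    'foo.c'      # <- comma forgotten here
    'bar.c',
    'q1000x.c',
]

print(source_files)       # the list has two entries, not three
print(len(source_files))
```

The error only surfaces later, when something tries to open a file that was never meant to exist.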

Jason Orendorff wrote:
I think that convinces me to support the removal.
Georg

On 11 Apr 2007, at 16.15, Jim Jewett wrote:
I would support a proposal to remove the implicit concatenation entirely.
I'd agree with that. The parser can probably just do the same optimization automatically if it gets [string literal] "+" [string literal]. (Or does it already?)
Meanwhile, on a similar subject, I have a... strange idea. I'm not sure how easy/hard it would be to parse or how necessary it is, but it's just a thought.
Currently, you can do multiline strings a couple of ways:
    x = '''foo
    bar
    baz'''

    x = 'foo' \
        'bar' \
        'baz'
Neither of these seems ideal. Triple-quoting is decent, but it can get ugly if you're using it in an indented block (as you most often will be), since the following lines are considered to start right after the newline, not after the containing block's indentation level. But changing it to the latter behaviour has been discussed before, if I remember correctly, and that didn't seem popular. That's understandable; the current triple-quote multiline behaviour makes sense from a logical point of view, it just doesn't look as nice to have text suddenly fall down to 0 indentation and then jump back to the original indentation level when the quote is over.
So anyway, what I'm proposing is the following:
    x = 'foo
        'bar
        'baz'
In other words, if you start a ' or "-quoted string, and don't close it at the end of the line, you can continue it on the next line. It would be generally equivalent to appending \n, closing the quote, and preceding the physical newline with a backslash. (And inserting a plus sign, if we take Jim's proposal into account.) Not closing a quote and doing something else on the next line (i.e. not starting it with a quote character after any whitespace) would still be a syntax error.
This style takes its precedent from the multi-paragraph quoting style in English: if you end a paragraph without closing a quote, then you continue it by starting the next one with a quote, and you can continue like that until you do have an end-quote.
I think it would improve readability/writability for when you need to include multiline text blocks or code blocks. Having to have that \n"+ \ at the end of each line really breaks up the flow, whether of a block of human or computer language text. And having subsequent lines fall to 0 indentation (if you choose to use triple-quotes) breaks up the flow of the surrounding Python code. This seems like a good solution, especially since it has precedent in written English.
Any thoughts?

"Adam Atlas" <adam@atlas.st> wrote in message news:BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st...
| On 11 Apr 2007, at 16.15, Jim Jewett wrote:
| > I would support a proposal to remove the implicit concatenation
| > entirely.
Raymond H. is proposing this for Py3.
| I'd agree with that. The parser can probably just do the same
| optimization automatically if it gets [string literal] "+" [string
| literal]. (Or does it already?)
He says it does (not sure which version he meant).
| what I'm proposing is the following:
|
| x = 'foo
|     'bar
|     'baz'
-1
Looks ugly to me ;-)
tjr

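On the question of whether "+" between two string literals is already optimized: a quick check, assuming CPython. Constant folding is an implementation detail of that interpreter, not something the language guarantees:

```python
# Compile a statement that adds two string literals and look at the
# constants stored in the resulting code object.
code = compile('x = "foo" + "bar"', "<example>", "exec")

# On CPython the addition is folded at compile time, so the joined
# string appears among the constants and no runtime addition is needed.
folded = "foobar" in code.co_consts
print(code.co_consts, folded)
```

If that holds on the interpreter in use, replacing implicit concatenation with an explicit "+" costs nothing at runtime for literal operands.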
On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed, I don't really like this syntax. I would like it if there were a way to spell 'multiline string with indentation chopped off'. The easiest way (syntax-wise) would be to just have triple quotes do that, but that's gonna give backward compat problems.
Jan

Jan Kanis wrote:
These cases would be fine:
    a = """Some text.
        Some more text."""

    def f(x):
        """Translates x into Hungarian.

        Does it quite badly."""
        pass
This wouldn't:
    a = """Some text.
            Some intentionally indented text."""
How often do people rely on those tabs or spaces being preserved?
Neil

On 4/12/07, Neil Toronto <ntoronto@cs.byu.edu> wrote:
Jan Kanis wrote:
On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed, I don't really like this syntax. I do like if there'd be a way to spell 'multiline string with indentation chopped off'.
Most of the time, the extra indents are OK. And if they aren't, it is usually OK to start the string with a blank line. (So everything is aligned to left, at least.) Would textwrap.dedent do what you wanted (if it were added to __all__)? Should it have a mode to skip the first line? Should TextWrapper expose it somehow? (My thought would be to optionally call it from within _munge_whitespace.)
> a = """Some text.
>         Some intentionally indented text."""
> How often do people rely on those tabs or spaces being preserved?
For doctests, mainly, so a consistent change would be OK ... but triple-quoted strings are supposed to be almost exactly WYSIWYG.
-jJ
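For reference, this is roughly how `textwrap.dedent` behaves on the strings under discussion. It is a sketch of current standard-library behaviour; the function and variable names are invented for the example:

```python
import textwrap

def example():
    # Starting the literal with a backslash-newline keeps every text line
    # at the same indentation, so dedent() has a common prefix to remove.
    text = """\
        Some text.
            Some intentionally indented text.
        """
    return textwrap.dedent(text)

# Common leading whitespace is removed; the deliberate extra indent stays.
result = example()

# When the first line sits on the opening-quote line it has no indent,
# so there is no common prefix and dedent() leaves the string alone.
original = """Some text.
        Some more text."""
unchanged = textwrap.dedent(original)
```

The second case is the reason a "skip the first line" mode comes up: without it, the common layout with text on the opening line is not dedented at all.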

Jim Jewett wrote:
For doctests, mainly, so a consistent change would be OK ... but triple quoted strings are supposed to be almost exactly WYSIWYG.
But they're *not* WYSIWYG, according to what you naturally "see" when looking at the code. Not sure about anyone else, but what I see is some lines of text that happen to be indented because they're part of a code block. I don't see the indentation as being an intended part of the string. Does anyone have a use case where they *need* the indentation to be preserved? (As opposed to just not caring whether it's there or not.)
-- Greg

Josiah Carlson wrote:
I've already suggested at one time that a dedent() method be added to strings, which would make it more obvious, but what is one import...
Georg

Georg Brandl wrote:
I'm not sure this is the way to go. IMO string methods should be generic manipulations on strings, and personally I find indenting/dedenting multi-line strings doesn't fit in. For me, a stdlib function is just fine.
Ivan Vilata i Balaguer wrote:
> I'd rather make it explicit by using some string prefix a la 'r' or 'u', 'i', for instance:
This could be a reasonable solution, but it has some downsides:
* It's less readable than a well named function
* It's harder to understand for a newbie - a function/method has a docstring, this would have to be looked up in the docs
* It's easy to miss while reading code - one small letter making a big difference
* It paves the road for making more such string prefixes, and then we'd have to memorize all of them... or consult the docs often
-1 from me.

Josiah Carlson wrote:
Does anyone have a use case where they *need* the indentation to be preserved?
Not personally. I think that telling people to use textwrap.dedent() is sufficient.
But it seems crazy to make people do this all the time, when there's no reason not to do it automatically in the first place. -- Greg

Greg Ewing wrote:
Reminds me of ... http://www.artima.com/weblogs/viewpost.jsp?thread=101968
Note that the optional implementation of this has already been put in Python 2.5 just as it said it would be.
How about using indenting along with implicit string endings?
    def foo(...):
        ```
            Just another foo.

        message = ```
            This is a multi-
            line string + implicit right stripping.
        print message
Just kidding of course. The back-quotes will never be approved. ;-)
I don't know what would be the best solution because just about anything I can think of has some sort of side effects in some situations. Maybe if line based editors are ever completely replaced with folding graphic editors it will no longer be a problem because all our multi-line strings can have nice borders around them.
Cheers, Ron

On 4/13/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Josiah Carlson wrote:
Does anyone have a use case where they *need* the indentation to be preserved?
Not personally. I think that telling people to use textwrap.dedent() is sufficient.
The textwrap methods (including a proposed dedent) might make useful string methods. Short of that:
(1) Where does this preservation actually hurt?
    def f(self, arg1):
        """My DocString ...

        And I continue here -- which really is what I want.
        """
I use docstrings online -- and I typically do want them indented like the code.
(2) Should literals (or at least strings, or at least docstrings) be decoratable? Anywhere but a docstring, you could just call the function, but ... I suppose it serves the same meta-value as the proposed i(nternational) or t(emplate) strings.
    def f(...):
        ....
        @dedent
        """ ...
        ...
        """
-jJ

Jim Jewett wrote:
(1) Where does this preservation actually hurt?
It hurts because it places a burden on everyone every time they use a triple quoted string to do something about the indentation which is unwanted 99.999% of the time.
I use docstrings online -- and I typically do want them indented like the code.
I don't understand what you mean by that. Can you give an example where an auto-dedented docstring would give an undesirable result? -- Greg
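To make the docstring case concrete, here is what a docstring-style cleanup does to the example quoted earlier. It uses `inspect.cleandoc`, a standard-library function added after this thread (Python 2.6), so it illustrates one possible behaviour rather than anything proposed here:

```python
import inspect

# The same text as the docstring example, written as a plain string so
# that the indentation inside the literal is visible in `raw`.
raw = """My DocString ...

    And I continue here.
    """

# cleandoc() strips the first line, removes the common indent of the
# remaining lines, and drops leading and trailing blank lines.
cleaned = inspect.cleandoc(raw)
print(repr(raw))
print(repr(cleaned))
```

Whether the cleaned form counts as undesirable depends on whether anything reads the raw string and expects the four-space indent to be there.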

Greg Ewing wrote:
You didn't specify docstrings earlier, just triple-quoted strings in general. I don't think it would be a problem for docstrings only. It could probably be done at compile time too. It's not really that different from the -OO option to remove them. Dedenting triple-quoted strings in general would cause some problems (in Python 2.x) with existing GUI interfaces that use triple-quoted strings to define their text.
Cheers, Ron

Ron Adam wrote:
Triple quoted strings in general is what I had in mind. I was replying to something that seemed to imply that it would cause trouble with docstrings, without being very clear about what the trouble was.
I conjecture that in all such cases, the existing code is already dedenting the string itself. I still haven't seen a real case where a piece of code actually needs the extra indentation. -- Greg

For Py3k, how about changing the definition of triple quoted strings so that indentation is stripped up to the level of the line where the string began? In other words, apply an implicit dedent() to it in the parser. -- Greg
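One possible reading of this proposal, sketched as an ordinary function rather than a parser change. `strip_to_level` and its `level` argument are hypothetical names: `level` stands for the indentation of the line on which the string literal begins, and only spaces are handled:

```python
def strip_to_level(text, level):
    """Remove up to `level` leading spaces from every line after the first."""
    first, *rest = text.split("\n")
    stripped = []
    for line in rest:
        # Count leading spaces, but never remove more than `level` of them.
        leading = len(line) - len(line.lstrip(" "))
        stripped.append(line[min(leading, level):])
    return "\n".join([first] + stripped)

# A literal whose opening line is indented by four spaces in the source.
literal = "Some text.\n    Some more text.\n        Deliberately deeper."
result = strip_to_level(literal, 4)
print(result)
```

Under this reading, indentation beyond the level of the opening line survives, so deliberately indented text remains expressible.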

Greg Ewing (on 2007-04-13 at 11:27:44 +1200) wrote:
I'd rather make it explicit by using some string prefix a la 'r' or 'u', 'i', for instance:
As you see, strings marked with 'i' are dedented to the outer non-blank character, and their first empty line is ignored. I haven't thought this through much, so some questions come to mind:
* Is it really OK to remove the first empty line?
* How would this interact with an 'r' prefix? Should initial space be kept then? (This would effectively disable 'i'.)
* Should leading space in a line after a continuation backslash really be removed?
Of course the proposal can be made a lot better with some insight. What do you think of the basic idea?
:: Ivan Vilata i Balaguer @ Welcome to the European Banana Republic! @ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @

On 4/12/07, Adam Atlas <adam@atlas.st> wrote:
[snip] So anyway,
-1 on such new syntax. What I usually do is:
    message = ("yada yada\n"
               "more yada yada\n"
               "even more yada.")
This works a lot like what you suggest, but with Python's current syntax. If implicit string concatenation were removed, I'd just add a plus sign at the end of each line.
This is also a possibility:
    message = "\n".join([
        "yada yada",
        "more yada yada",
        "even more yada."])
The latter would work even better with the removal of implicit string concatenation, since forgetting a comma would cause a syntax error instead of skipping a newline.
- Tal
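A short side-by-side of the spellings mentioned above, to show that they build the same string and what a forgotten comma does today. The variable names are chosen for the example:

```python
# Current syntax: adjacent literals inside parentheses.
implicit = ("yada yada\n"
            "more yada yada\n"
            "even more yada.")

# The same text with an explicit plus at the end of each line.
explicit = ("yada yada\n" +
            "more yada yada\n" +
            "even more yada.")

# Joining a list of lines with newlines.
joined = "\n".join([
    "yada yada",
    "more yada yada",
    "even more yada."])

# Forgetting a comma today silently drops a newline instead of failing.
missing_comma = "\n".join([
    "yada yada"
    "more yada yada",
    "even more yada."])
```

With implicit concatenation removed, the last form would be rejected at compile time rather than producing a subtly different string.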

Eoghan Murray schrieb:
No, please! The concatenation of string literals is done in the parser. Your proposal would move that to runtime and introduce a "whitespace operator". How would you spell that? How would you overload it? etc. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Collin Winter schrieb:
Thinking in that directing, NO-BREAK SPACE would be a perfect choice for an operator! Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Hi guys, Thanks for your replies: On 4/11/07, Georg Brandl <g.brandl@gmx.net> wrote:
[snip]
No, please! The concatenation of string literals is done in the parser.
Your proposal would move that to runtime [snip...] An implementation detail? [...snip] and introduce a "whitespace operator".
How would you spell that? How would you overload it? etc.
This is exactly what I'm proposing. You could spell it __juxta__ short for juxtaposition or __concat__, and overload it as usual :-) On 11/04/07, Collin Winter <collinw@gmail.com> wrote:
A single-width whitespace operator would just be confusing since PEP
3117 will be using zero-width spaces for the None typedef : )
3117 looks cool, but it is in draft stages so needn't factor. Anyone with any positive reactions? Eoghan

On 4/11/07, Eoghan Murray <eoghan@qatano.com> wrote:
This is exactly what I'm proposing. You could spell it __juxta__ short for juxtaposition or __concat__, and overload it as usual :-)
And if __juxta__ is not defined, it should fall back first on __call__, then __mul__, then __add__. If it binds right-to-left, you could write things like from math import * print (2 sin x + cos x) We might as well make newlines an operator at the same time. There's precedent for this in Haskell, and good synergy--adding the STM monad to Python would solve the GIL problem. You could spell that operator __bind__ or just __>>=__, take your pick. And I think Guido already committed to ripping out the @decorator syntax in Py3k in favor of comment overloading, via __rem__(). Just kidding, of course...
Anyone with any positive reactions?
Eoghan, thanks for taking the time to write. I don't think anyone likes the idea, though. It causes many grammatical problems: should a[0] parse as a.__getitem__(0) or a.__juxta__([0])? What about (foo)(bar)? And while "sin x" would of course mean sin.__juxta__(x), "sin -x" would parse as "sin - x", or sin.__sub__(x). A few extra + signs are a small price to pay. -j

Eoghan Murray schrieb:
A rather involved "detail".
This is a joke, isn't it? You're a bit late... Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

On 11 Apr 2007, at 11.01, Collin Winter wrote:
I propose we use the ASCII character 0x07 (BEL) as the concatenation operator. It's invisible, so your code still looks nice and clean, but you know it's there because your text editor will beep at you every time you pass it. :) (Speaking of PEP 3117, I will fight it to the death unless the typedef for Exception is changed to Unicode character 2620 (SKULL AND CROSSBONES) or 2623 (BIOHAZARD SIGN). Brilliant choice for frozenset, though. No longer need I wonder why the Unicode Consortium saw fit to include a snowman character!)

On 11/04/07, Adam Atlas <adam@atlas.st> wrote:
LOL, I'll reply to the funniest put down! The rationale for this is that Python should have one definitive way of concatenating strings. I dislike '+' as a string concatenation operator as I think overloading the meaning of '+' for non-numbers is ugly, and I dislike '%s' string formatting as it perpetuates perhaps obscure C syntax, as well as shunting the variables to the end of the line - hard for a human to parse. Given that __juxta__ isn't going to fly, +1 for complete removal of implicit string concatenation in Py3k Eoghan

On 18 Apr 2007, at 18.43, Jan Claeys wrote:
Heh, yeah, I actually realized immediately after I sent that email that the exact same thing could be said about +. But I don't know... even if + might be confused with an arithmetic operator sometimes, it's what people are used to, and I think it makes sense intuitively. 'Plus', in a very abstract sense, suggests 'put two things together', whether with numbers or strings or anything else for which we have a concept of 'putting together'. ~ doesn't have that advantage. If a programmer coming from pretty much any language sees "foo"+"bar", they're probably going to be able to guess that it's concatenations. If they see "foo"~"bar", it is really not immediately clear what it's doing.

Georg Brandl wrote:
Your proposal would move that to runtime and introduce a "whitespace operator". How would you spell that? How would you overload it? etc.
Using the ____() method, obviously. :-) But seriously, there is no way this is going to fly. Python is not Perl or awk (or SNOBOL). -- Greg

On 4/11/07, Eoghan Murray <eoghan@qatano.com> wrote:
I would support a proposal to remove the implicit concatenation entirely. I suspect it would be shot down for backwards compatibility (even in Py3K), but from a readability standpoint ... I have never seen a string concatentation that would look worse because of a "+". I *have* seen some bugs where a comma was forgotten, and two arguments got invisibly jammed together. That's a pain to debug in C; in python with default values, the interpreter may not even gripe sensibly. -jJ

On 4/11/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Oh. I just realized this happens a lot out here. Where I work, we use scons, and each SConscript has a long list of filenames: sourceFiles = [ 'foo.c', 'bar.c', #...many lines omitted... 'q1000x.c'] It's a common mistake to leave off a comma, and then scons complains that it can't find 'foo.cbar.c'. This is pretty bewildering behavior even if you *are* a Python programmer, and not everyone here is. -j

Jason Orendorff schrieb:
I think that convinces me to support the removal. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

On 11 Apr 2007, at 16.15, Jim Jewett wrote:
I would support a proposal to remove the implicit concatenation entirely.
I'd agree with that. The parser can probably just do the same optimization automatically if it gets [string literal] "+" [string literal]. (Or does it already?) Meanwhile, on a similar subject, I have a... strange idea. I'm not sure how easy/hard it would be to parse or how necessary it is, but it's just a thought. Currently, you can do multiline strings a couple of ways: x = '''foo bar baz''' x = 'foo' \ 'bar' \ 'baz' Neither of these seem ideal. Triple-quoting is decent, but it can get ugly if you're using it in an indented block (as you most often will be), since the following lines are considered to start right after the newline, not after the containing block's indentation level. But changing it to the latter behaviour has been discussed before, if I remember correctly, and that didn't seem popular. That's understandable; the current triple-quote multiline behaviour makes sense from a logical point of view, it just doesn't look as nice to have text suddenly fall down to 0 indentation and then jump back to the original indentation level when the quote is over. So anyway, what I'm proposing is the following: x = 'foo 'bar 'baz' In other words, if you start a ' or "-quoted string, and don't close it at the end of the line, you can continue it on the next line. It would be generally equivalent to appending \n, closing the quote, and preceding the physical newline with a backslash. (And inserting a plus sign, if we take Jim's proposal into account.) Not closing a quote and doing something else on the next line (i.e. not starting it with a quote character after any whitespace) would still be a syntax error. This style takes precedent from multi-paragraph quoting style in English: if you end a paragraph without closing a quote, then you continue it by starting the next one with a quote, and you can continue like that until you do have an end-quote. I think it would improve readability/writability for when you need to include multiline text blocks or code blocks. 
Having to have that \n"+ \ at the end of each line really breaks up the flow, whether of a block of human or computer language text. And having subsequent lines fall to 0 indentation (if you choose to use triple-quotes) breaks up the flow of the surrounding Python code. This seems like a good solution, especially since it has precedent in written English. Any thoughts?

"Adam Atlas" <adam@atlas.st> wrote in message news:BCDF57D3-8555-4230-8ABD-8419A41A8E1C@atlas.st... | | On 11 Apr 2007, at 16.15, Jim Jewett wrote: | > I would support a proposal to remove the implicit concatenation | > entirely. Raymond H. is proposing this for Py3. | I'd agree with that. The parser can probably just do the same | optimization automatically if it gets [string literal] "+" [string | literal]. (Or does it already?) He says it does (not sure which version he meant). | what I'm proposing is the following: | | x = 'foo | 'bar | 'baz' -1 Looks ugly to me ;-) tjr

On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed, I don't really like this syntax. I do like if there'd be a way to spell 'multiline string with indentation chopped off'. The easiest way (syntax-wise) would be to just have tripple quote do that, but that's gonna give backward compat problems. Jan

Jan Kanis wrote:
These cases would be fine: a = """Some text. Some more text.""" def f(x): """"Translates x into Hungarian. Does it quite badly.""" pass This wouldn't: a = """Some text. Some intentionally indented text.""" How often do people rely on those tabs or spaces being preserved? Neil

On 4/12/07, Neil Toronto <ntoronto@cs.byu.edu> wrote:
Jan Kanis wrote:
On Wed, 11 Apr 2007 23:54:11 +0200, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed, I don't really like this syntax. I do like if there'd be a way to spell 'multiline string with indentation chopped off'.
Most of the time, the extra indents are OK. And if they aren't, it is usually OK to start the string with a blank line. (So everything is aligned to left, at least.) Would textwrap.dedent do what you wanted (if it were added to __all__)? Should it have a mode to skip the first line? Should there be a TextWrapper expose it somehow? (My thought would be to optionally call it from within _munge_whitespace.)
a = """Some text. Some intentionally indented text."""
How often do people rely on those tabs or spaces being preserved?
For doctests, mainly, so a consistent change would be OK ... but triple quoted strings are supposed to be almost exactly WYSIWYG. -jJ

Jim Jewett wrote:
For doctests, mainly, so a consistent change would be OK ... but triple quoted strings are supposed to be almost exactly WYSIWYG.
But they're *not* WYSIWYG, according to what you naturally "see" when looking at the code. Not sure about anyone else, but what I see is some lines of text that happen to be indented because the're part of a code block. I don't see the indentation as being an intended part of the string. Does anyone have a use case where they *need* the indentation to be preserved? (As opposed to just not caring whether it's there or not.) -- Greg

Josiah Carlson schrieb:
I've already suggested at one time that a dedent() method be added to strings, which would make it more obvious, but what is one import... Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Georg Brandl wrote:
I'm not sure this is the way to go. IMO string methods should be generic manipulations on strings, and personally I find indenting/dedenting multi-line strings doesn't fit in. For me, a stdlib function is just fine. Ivan Vilata i Balaguer wrote:
I'd rather make it explicit by using some string prefix a la 'r' or 'u', 'i', for instance:
This could be a reasonable solution, but it has some downsides:
* It's less readable than a well named function
* It's harder to understand for a newbie - a function/method has a docstring, this would have to be looked up in the docs
* It's easy to miss while reading code - one small letter making a big difference
* It paves the road for making more such string prefixes, and then we'd have to memorize all of them... or consult the docs often
-1 from me.

Josiah Carlson wrote:
Does anyone have a use case where they *need* the indentation to be preserved?
Not personally. I think that telling people to use textwrap.dedent() is sufficient.
But it seems crazy to make people do this all the time, when there's no reason not to do it automatically in the first place. -- Greg

Greg Ewing wrote:
Reminds me of ... http://www.artima.com/weblogs/viewpost.jsp?thread=101968 Note that the optional implementation of this has already been put in Python 2.5 just as it said it would be. How about using indenting along with implicit string endings? def foo(...): ``` Just another foo. message = ``` This is a multi- line string + implicit right stripping. print message Just kidding of course. The back-quotes will never be approved. ;-) I don't know what would be the best solution because just about anything I can think of has some sort of side effects in some situations. Maybe if line based editors are ever completely replaced with folding graphic editors it will no longer be a problem because all our multi-line strings can have nice borders around them. Cheers, Ron

On 4/13/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Josiah Carlson wrote:
Does anyone have a use case where they *need* the indentation to be preserved?
Not personally. I think that telling people to use textwrap.dedent() is sufficient.
The textwrap methods (including a proposed dedent) might make useful string methods. Short of that (1) Where does this preservation actually hurt? def f(self, arg1): """My DocString ... And I continue here -- which really is what I want. """ I use docstrings online -- and I typically do want them indented like the code. (2) Should literals (or at least strings, or at least docstrings) be decoratable? Anywhere but a docstring, you could just call the function, but ... I suppose it serves the same meta-value as the proposed i(nternational) or t(emplate) strings. def f(...): .... @dedent """ ... ... """ -jJ

Jim Jewett wrote:
(1) Where does this preservation actually hurt?
It hurts because it places a burden on everyone every time they use a triple quoted string to do something about the indentation which is unwanted 99.999% of the time.
I use docstrings online -- and I typically do want them indented like the code.
I don't understand what you mean by that. Can you give an example where an auto-dedented docstring would give an undesirable result? -- Greg

Greg Ewing wrote:
You didn't specify doc strings earlier, just triple quoted strings in general. I don't think it would be a problem for only doc strings. It could probably be done at compile time too. It's not really that different than the -OO option to remove them. Dedenting triple quoted strings in general would cause some problems (in Python 2.x) with existing GUI interfaces that use triple quoted strings to define their text. Cheers, Ron

Ron Adam wrote:
Triple quoted strings in general is what I had in mind. I was replying to something that seemed to imply that it would cause trouble with docstrings, without being very clear about what the trouble was.
I conjecture that in all such cases, the existing code is already dedenting the string itself. I still haven't seen a real case where a piece of code actually needs the extra indentation. -- Greg

For Py3k, how about changing the definition of triple quoted strings so that indentation is stripped up to the level of the line where the string began? In other words, apply an implicit dedent() to it in the parser. -- Greg
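As a rough illustration of the rule proposed above, the following sketch applies it at runtime rather than in the parser. strip_to_level and its level parameter are hypothetical names; level stands for the indentation of the source line on which the literal began, which a real parser would know but which has to be passed in by hand here.

```python
def strip_to_level(text, level):
    # Hypothetical emulation of the proposed rule: remove at most
    # `level` leading spaces from every line after the first one.
    first, sep, rest = text.partition("\n")
    stripped = []
    for line in rest.split("\n"):
        # Count leading spaces, but never strip more than `level`.
        lead = len(line) - len(line.lstrip(" "))
        stripped.append(line[min(lead, level):])
    return first + sep + "\n".join(stripped)


def f():
    # The statement below starts at column 4, so level is 4.
    message = """Usage:
        run --fast
    Done."""
    return message


result = strip_to_level(f(), 4)
```

Here the deeper line keeps four of its eight spaces, since only the indentation of the enclosing block is stripped.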

Greg Ewing (on 2007-04-13 at 11:27:44 +1200) wrote:
I'd rather make it explicit by using some string prefix a la 'r' or 'u', 'i', for instance:
As you see, strings marked with 'i' are dedented to the outer non-blank character, and their first empty line is ignored. I haven't thought this through much, so some questions come to mind:
* Is it really OK to remove the first empty line?
* How would this interact with an 'r' prefix? Should initial space be kept then? (This would effectively disable 'i'.)
* Should leading space in a line after a continuation backslash really be removed?
Of course the proposal can be made a lot better with some insight. What do you think of the basic idea?
:: Ivan Vilata i Balaguer @ Welcome to the European Banana Republic! @ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @
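The behaviour described for the 'i' prefix can be approximated today with an ordinary function. This is only a sketch: i_string is a hypothetical name, and it leaves the open questions about the 'r' prefix and backslash continuations unanswered.

```python
import textwrap


def i_string(text):
    # Hypothetical stand-in for the proposed 'i' prefix: ignore the
    # first line when it is empty, then dedent to the leftmost
    # non-blank character.
    first, sep, rest = text.partition("\n")
    if sep and not first.strip():
        text = rest
    return textwrap.dedent(text)


def query():
    return i_string("""
        SELECT name
          FROM users""")


sql = query()
```

The second line keeps its two extra spaces because textwrap.dedent removes only the margin common to all lines.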

On 4/12/07, Adam Atlas <adam@atlas.st> wrote:
[snip] So anyway,
-1 on such new syntax. What I usually do is:

    message = ("yada yada\n"
               "more yada yada\n"
               "even more yada.")

This works a lot like what you suggest, but with Python's current syntax. If implicit string concatenation were removed, I'd just add a plus sign at the end of each line. This is also a possibility:

    message = "\n".join([
        "yada yada",
        "more yada yada",
        "even more yada."])

The latter would work even better with the removal of implicit string concatenation, since forgetting a comma would cause a syntax error instead of skipping a newline. - Tal
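For comparison, a short sketch of the three spellings mentioned in this message, plus the missing-comma hazard under current syntax. The variable names are chosen only for illustration.

```python
# Adjacent literals are concatenated by the parser.
by_juxtaposition = ("yada yada\n"
                    "more yada yada\n"
                    "even more yada.")

# The same string with an explicit plus sign at the end of each line.
by_plus = ("yada yada\n" +
           "more yada yada\n" +
           "even more yada.")

# The same string built by joining a list of lines.
by_join = "\n".join([
    "yada yada",
    "more yada yada",
    "even more yada."])

# With implicit concatenation, a forgotten comma silently merges two
# list items instead of raising a syntax error.
missing_comma = [
    "yada yada"
    "more yada yada",
    "even more yada."]
```

All three spellings produce the same string, while the list with the forgotten comma ends up with two items instead of three.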
participants (15)
- Adam Atlas
- Collin Winter
- Eoghan Murray
- Georg Brandl
- Greg Ewing
- Ivan Vilata i Balaguer
- Jan Claeys
- Jan Kanis
- Jason Orendorff
- Jim Jewett
- Josiah Carlson
- Neil Toronto
- Ron Adam
- Tal Einat
- Terry Reedy