[Python-ideas] Implicit string literal concatenation considered harmful?

Ron Adam ron3200 at gmail.com
Sun May 12 00:19:14 CEST 2013


Greg, I meant to send my reply earlier to the list.


On 05/11/2013 12:39 AM, Greg Ewing wrote:
>> Also, doesn't this imply that ... is now an operator in some contexts,
>> but a literal in others?

Could its use as a literal be deprecated?  I haven't seen it used that way
except in examples.


> It would have different meanings in different contexts, yes.
>
> But I wouldn't think of it as an operator, more as a token
> indicating string continuation, in the same way that the
> backslash indicates line continuation.

Yep, it would be a token that the tokenizer would handle.  So it would be 
handled before anything else just as the line continuation '\' is.   After 
the file is tokenized, it is removed and won't interfere with anything else.
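
For reference, here is a small sketch of how current Python 3 treats '...'. It only illustrates the starting point and is not part of the proposal: the tokenizer already emits '...' as a single token, and the compiler then reads it as the Ellipsis literal.

```python
import io
import tokenize

# Tokenize a line that uses '...' the way Python 3 understands it today.
source = "x = ...\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Exactly one token carries the text '...'; the tokenize module reports it
# with the generic OP type and the exact type ELLIPSIS.
ellipsis_tokens = [tok for tok in tokens if tok.string == "..."]

# At the expression level, the same three dots evaluate to Ellipsis.
is_literal = eval("...") is Ellipsis
```

So a continuation marker spelled '...' would have to be told apart from the literal by context, for example by its position directly after a string token at the end of a line.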

It could be limited to strings, or expanded to include numbers and possibly 
other literals.

     a = "a long text line "...
         "that is continued "...
         "on several lines."

     pi =  3.1415926535...
             8979323846...
             2643383279

You can't do this with a line continuation '\'.
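
A quick check of what works today, as a sketch rather than anything definitive: strings can already be spread over lines with implicit concatenation inside parentheses, but a number split with '\' is rejected because the two halves arrive as separate number tokens. The names `source` and `split_number_compiles` are only for illustration.

```python
# Implicit concatenation inside parentheses: the current way to spread
# a string over several lines without backslashes.
a = ("a long text line "
     "that is continued "
     "on several lines.")

# A backslash continuation joins the physical lines, but it does not join
# the digits: the tokenizer still produces two adjacent number tokens.
source = "pi = 3.1415926535\\\n     8979323846\n"
try:
    compile(source, "<example>", "exec")
    split_number_compiles = True
except SyntaxError:
    split_number_compiles = False
```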


Another option would be to have dedented multi-line string tokens |""" and 
|'''.   Not too different than r""" or b""".

     s = |"""Multi line string
         |
         |paragraph 1
         |
         |paragraph 2
         |"""

     a = |"""\
         |a long text line \
         |that is continued \
         |on several lines.\
         |"""

The rule for this is, for strings that start with |""" or |''', each 
following line needs to be preceded by whitespace + '|', until the 
closing quote is reached.  The tokenizer would just find and remove them as 
it comes across them.  Any '|' on a line after the first '|' would be 
unaffected, so they don't need to be escaped.
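
The rule can be sketched in plain Python. This is only an illustration of the stripping step, not tokenizer code: `strip_bar_margin` is a hypothetical helper, and it assumes it is handed the raw text found between the opening |""" and the closing """.

```python
import re

# Leading spaces or tabs followed by the first '|' on the line.
_MARGIN = re.compile(r"[ \t]*\|")


def strip_bar_margin(body: str) -> str:
    """Remove the whitespace + '|' margin from every line after the first."""
    first, *rest = body.split("\n")
    stripped = [first]
    for line in rest:
        match = _MARGIN.match(line)
        if match is None:
            raise SyntaxError("continuation line must start with whitespace + '|'")
        # Only the margin is removed; any later '|' stays in the string.
        stripped.append(line[match.end():])
    return "\n".join(stripped)


body = (
    "Multi line string\n"
    "         |\n"
    "         |paragraph 1\n"
    "         |\n"
    "         |paragraph 2\n"
    "         |"
)
s = strip_bar_margin(body)
```

Here `s` comes out as the text with two paragraphs separated by blank lines and a trailing newline, with no indentation left over from the source layout.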

It's a very explicit syntax. It's very obvious what is part of the string 
and what isn't.  Something like this would end the endless debate on 
dedents.  That alone might be worth it.   ;-)
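
For comparison, the closest thing available today is `textwrap.dedent` applied at run time. A small sketch of the first |""" example written that way:

```python
import textwrap

# The backslash after the opening quotes suppresses the first newline;
# dedent() then removes the indentation common to all non-blank lines.
s = textwrap.dedent("""\
    Multi line string

    paragraph 1

    paragraph 2
    """)
```

The result is the same text, but the dedent happens on every execution instead of once in the tokenizer, and the boundary of the string is carried by indentation rather than by an explicit marker.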

I know the | is also a binary 'or' operator, but its use for that is in a 
different context, so I don't think it would be a problem.

Both of these options would be implemented in the tokenizer and are really 
just tools for formatting source code rather than actual additions or 
changes to the language.

Cheers,
    Ron













