[Python-ideas] Implicit string literal concatenation considered harmful (options)
Ron Adam
ron3200 at gmail.com
Mon May 20 18:46:28 CEST 2013
On 05/19/2013 05:33 PM, Nick Coghlan wrote:
> If it's based on the contents of these threads, be aware that at least one
> core developer (me) and probably more have already mostly tuned out on the
> grounds that the feature is obviously in wide enough use that changing it
> will break the world without adequate gain. We don't even have to speculate
> on what others might be doing, we know it would break *our* code.
Ok, so is it your opinion that, in order to remove implicit string joining,
an explicit replacement must be put in at the same time?
> For example, porting Fedora to Python 3 is already going to be a pain.
> Breaking implicit string concatenation would be yet another road block
> making that transition more difficult.
This sounds more like a general request to not make any changes, rather
than something about the specific item itself.
To be clear, this is going to need a long removal schedule. Probably nothing
will actually be removed before 3.7 or later. Maybe two years from now?
How about this:
First, let's please differentiate string continuation from string
concatenation. A string continuation is a pre-run-time alteration. A
string concatenation is a run-time operation.
Documenting them that way will make them easier to discuss and teach to
new users.
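A minimal sketch of that distinction as it stands today, assuming Python 3.8
or later (where parsed literals show up as ast.Constant). The names
`continued`, `concatenated`, and `suffix` are only illustrative:

```python
import ast

# Adjacent literals are joined before the program runs: the parser
# hands back a single constant node for the whole expression.
continued = ast.parse("x = 'foo' 'bar'").body[0].value

# The + operator stays in the parsed tree as an operation to perform.
concatenated = ast.parse("x = 'foo' + suffix").body[0].value

print(type(continued).__name__, repr(continued.value))
print(type(concatenated).__name__)
```

The first node is one already-joined string; the second is a binary operation
that still has to be evaluated.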
Redefine a line continuation character to be strictly a \+\n sequence.
That removes the "character after line continuation" errors because a '\'
without a newline after it isn't technically a line continuation character.
Then use a '\' that is not at the end of a line as the explicit string
continuation character.
This should be easy to do also.
We could add this in sooner rather than later. I don't think it would be a
difficult patch, and I also don't think it would break anything. Implicit
string continuations could be deprecated at the same time, with the
recommendation to start using the more explicit variation.
*But not remove implicit string continuations until Python 4.0.*
String continuations are a similar concept to line continuations, so the
reuse of '\' for it is an easy concept to learn and remember. It's also
easy to explain. This does not change a '\' used inside a string. String
escape codes have their own rules.
Examples:
foo('a' 'b'): # This won't cause an error until Python 4.0
x = 'foo\n' \ 'bar\n' \ 'baz\n'
x = ( 'foo\n' # easy to see trailing commas here.
\ 'bar\n'
\ 'baz\n'
)
x = 'foo\n' \
\ 'bar\n' \
\ 'baz\n'
If we allow \+newline to work as both a string continuation and line
continuation, this could be...
x = 'foo\n' \
'bar\n' \
'baz\n'
This is probably the least disruptive way to do this, and the '\' as a
string continuation is consistent with the \+\n as a line continuation.
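For comparison, the \+newline form is already accepted by current Python,
because the backslash joins the physical lines and the adjacent literals are
then joined implicitly. A small check, with `y` as a hypothetical name for the
same value spelled with an explicit + operator:

```python
# Valid today: line continuation plus implicit joining of adjacent literals.
x = 'foo\n' \
    'bar\n' \
    'baz\n'

# The same value spelled with an explicit + operator on each line.
y = 'foo\n' + \
    'bar\n' + \
    'baz\n'

print(x == y)
```

Both spellings produce the same three-line string, so code written in the
first style would keep working under the dual-purpose \+newline rule.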
A final note ...
I think we can easily allow comments after line continuations if there is
no space between the '\' and the '#'.
x = 'foo\n' \# This comment is removed.
'bar\n' \# The new-line at the end is not removed.
'baz\n'
When the tokenizer finds a '\' followed by a '#', it could remove the
comment, back up one character, and continue. In effect, the
\+comment+\n would be converted to \+\n. No space can be between the '\'
and '#' for this to work.
Seems like this should already work, but the current check for an invalid
character after a line continuation raises an error before this can happen.
Cheers,
Ron