[Tutor] Surprised that print("a" "b") gives "ab"
Steven D'Aprano
steve at pearwood.info
Sun Mar 6 10:28:07 EST 2016
On Sun, Mar 06, 2016 at 05:37:44PM +1100, Ben Finney wrote:
> boB Stepp <robertvstepp at gmail.com> writes:
>
> > […] why was this feature implemented?
>
> You've been asking that a lot lately :-) This forum is unlikely to be
> able to give authoritative answers to “why was the language designed the
> way it is?”.
Well, some of us have been reading, and contributing to, the
python-ideas and python-dev mailing lists for many years, and so may
have an idea of *why* certain features exist. Sometimes the reason
is explained in a PEP ("Python Enhancement Proposal") or in the bug
tracker, or occasionally in one of Guido's blog posts. So feel free to
ask, and if we can answer, we will.
> We are more able to explain what justification there is for a behaviour
> remaining as it is. That's a different question, though.
>
> > Is there a use case where it is more desirable to not have a string
> > concatenation operator explicitly used? The only thing that comes to
> > my mind are strings spread over multiple lines, say
> >
> > print("This will be a ..."
> > "...very long string...")
> >
> > But is this preferable to
> >
> > print("This will be a ..." +
> > "...very long string...")
> >
> > ? I personally prefer the latter, so I am probably missing something.
>
> It is salient to this issue, that the two cases above are semantically
> different.
Hmmm. Well, maybe. The problem is, if it is a difference that makes no
difference, is it still a difference?
> The first one is defined (by the syntax for literals) to create a
> *single* string object. Semantically, the fragments are specifying one
> object in a single step.
This part is certainly true. Implicit concatenation must take place
during the parsing/compiling stage, before any code is run, which
implies that at run time there is only a single string object produced.
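
One way to check this is to look at the syntax tree the parser builds, before any code runs. The snippet below is only a sketch using the standard `ast` module; the variable names are made up for illustration. With implicit concatenation the right-hand side is already a single string constant, whereas the explicit `+` form still shows up as a binary operation in the unoptimised tree.

```python
import ast

# Implicit concatenation: adjacent literals, no operator.
implicit_tree = ast.parse('s = "Hello " "World!"')
value_node = implicit_tree.body[0].value      # right-hand side of the assignment
implicit_value = ast.literal_eval(value_node)  # evaluate the literal node safely

# Explicit concatenation: the parser records a "+" operation.
explicit_tree = ast.parse('s = "Hello " + "World!"')
explicit_node = explicit_tree.body[0].value

print(repr(implicit_value))                  # the single joined string
print(isinstance(value_node, ast.BinOp))     # no operator node for the implicit form
print(isinstance(explicit_node, ast.BinOp))  # operator node for the explicit form
```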
> The second is semantically (i.e. by the semantics of how such
> expressions are defined to work) creating two distinct objects, then
> creating a third using an operation, then discarding the first two.
This is the "maybe" part. Ben is correct to point out that according to
the language rules, a line of code like:

    s = "Hello " + "World!"

is parsed as:

    create the string "Hello "
    create the string "World!"
    call the + operator on those two string literals

But the thing is, because both operands are string literals and not
variables, the compiler also knows that the + operator has to perform
string concatenation. There can't be any side-effects (apart from time
and memory use). So recent versions of CPython at least (and possibly
other Pythons) will perform "constant folding", which means doing as
much of the work at compile-time as possible, leading to the
semantically identical result:

    create the string "Hello World!"

Is that different from what the language semantics state? Yes, but
there's no way to see that difference (except indirectly, in memory usage
and speed).
As Ben will point out, this is not a language guarantee, but it is a
current feature of at least some implementations, that simple arithmetic
expressions and string concatenations involving only literals are done
at compile-time, rather than run-time.
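
From inside the running program the result is the same either way, but the compiled code object can be inspected to see whether folding happened. The following is a rough sketch, assuming CPython; since folding is an implementation detail and not a language guarantee, the `folded` flag may be False on other implementations or versions.

```python
# Compile the explicit "+" form without running it yet.
code = compile('s = "Hello " + "World!"', "<example>", "exec")

# If the compiler folded the expression, the joined string is stored
# among the code object's constants (implementation-dependent).
folded = "Hello World!" in code.co_consts
print("constants:", code.co_consts)
print("folded at compile time:", folded)

# Whether or not folding happened, the run-time result is identical.
namespace = {}
exec(code, namespace)
print(repr(namespace["s"]))
```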
> For me, that makes a big difference. When strings need to be split over
> several lines, it is convenient to be able to express directly in the
> code “these fragments are intended to be all part of the same object”.
> And it helps to see that in other people's code, too.
I concur with Ben on this. I like to use implicit concatenation for
long strings split over multiple lines for readability:

    class X:
        def method(self):
            if some_condition:
                raise ValueError(
                    "This is a long error message which is too"
                    " long to fit in one line, so I split it into"
                    " shorter fragments, one per line, and allow"
                    " implicit concatenation to join them."
                    )

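
A runnable variant of the same pattern, as a sketch only: `check_positive` is a hypothetical function invented for this illustration. It shows that the fragments arrive as one string with no line breaks, and that the leading space on each continuation fragment is what separates the words.

```python
def check_positive(value):
    # Hypothetical helper, used only to trigger the error below.
    if value <= 0:
        raise ValueError(
            "This is a long error message which is too"
            " long to fit in one line, so I split it into"
            " shorter fragments, one per line, and allow"
            " implicit concatenation to join them."
        )
    return value


try:
    check_positive(-1)
except ValueError as err:
    message = str(err)

print(message)            # one string, printed on a single line
print("\n" in message)    # the source line breaks are not part of the string
```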
--
Steve