[Tutor] Concatenating Strings

Steven D'Aprano steve at pearwood.info
Tue May 29 07:51:21 CEST 2012


On Mon, May 28, 2012 at 07:07:20PM -0700, Steve Willoughby wrote:
> On 28-May-12 19:00, Jeremy Duenas wrote:
> >and they both printed the same output……so why would I want to use ‘+’ to
> >add strings if there seems to be no reason to?
> 
> Juxtaposing strings only works with constants, which may be convenient
> in some cases, but it won't work at all when concatenating other string 
> values.
> 
> a="hello"
> b="world"
> 
> a+b    # will yield "helloworld" but
> a b    # is a syntax error
> 
> Using + is arguably preferable when you have a choice to make, since it 
> works in all cases, including constants.

I'll argue differently: even though + works with string literals as well 
as variables, you shouldn't use it.

Python the language promises that implicit concatenation of literals 
will be performed at compile time. That is, if you write source code:

print("hello " "world")  # implicit concatenation

any implementation of Python (such as PyPy, IronPython, Jython, etc.) 
must ensure that the literals "hello " and "world" are concatenated at 
compile-time into a single string "hello world", not at run-time. But 
the same is not the case for + concatenation:

print("hello " + "world")  # explicit concatenation

which may be done either at compile-time, or at run-time, depending on 
the implementation and version of Python, or the presence of 
optimizations, or the phase of the moon for all we know.
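If you want to see the difference for yourself, here is a minimal sketch using the standard ast module. The helper name top_level_node is my own invention for illustration, and the sketch assumes Python 3.8 or later, where the parser reports string literals as ast.Constant nodes. Adjacent literals should come out of the parser already joined into one constant, while the + version is still an addition that something has to evaluate later:

```python
import ast

def top_level_node(expr_source):
    """Parse a single expression and return its top-level AST node.

    Hypothetical helper, only for this demonstration.
    """
    return ast.parse(expr_source, mode="eval").body

implicit = top_level_node('"hello " "world"')
explicit = top_level_node('"hello " + "world"')

# The implicit form is already one string constant after parsing.
print(type(implicit).__name__, repr(implicit.value))

# The explicit form is still a binary operation on two constants.
print(type(explicit).__name__)
```

Whether that binary operation is later folded into a single string is up to the compiler, which is exactly the part that is not promised.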

Now, normally this won't matter. The nanosecond or so that it takes to 
concatenate two short strings like that is trivial. Even if it happens 
at run-time, who will care? But with implicit concatenation the *intent* 
is clearer: it states that these substrings belong together, in a way 
that + doesn't. (At least in my mind.)

Furthermore, there are times when run-time concatenation of literals 
does add some small, but measurable, cost:

for i in range(10000000):
    print("hello " + "world")
    do_something_useful(i)

In this case, versions of Python that delay the concatenation to 
run-time will have to add the two substrings not once, but ten million 
times. Versions of Python that do it at compile time only need to add 
them together once.
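A rough way to check which camp your own interpreter falls into is to compile both forms and look at the constants stored in the resulting code object. This is only a sketch: constants_of is a hypothetical helper, and co_consts is an implementation detail of the code object, so treat the result as a hint about your interpreter rather than a guarantee about the language.

```python
def constants_of(source):
    """Compile source and return the constants stored in its code object.

    Hypothetical helper; co_consts is an implementation detail.
    """
    code = compile(source, "<demo>", "exec")
    return code.co_consts

implicit_consts = constants_of('x = "hello " "world"')
explicit_consts = constants_of('x = "hello " + "world"')

print("implicit:", implicit_consts)
print("explicit:", explicit_consts)

# If the joined string is already among the constants, the + was
# folded at compile time and the loop will not repeat the work.
folded = "hello world" in explicit_consts
print("this interpreter folded the + at compile time:", folded)
```

If folded comes out False, every pass through a loop like the one above performs the concatenation again.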

Of course, nobody sensible should be concatenating such small strings 
like that when they could just write "hello world" as a single string. 
Why would you bother? But this is a useful feature when you have longer 
strings:

print("This is some longer string, where it is not appropriate to "
      "use a triple-quoted string because it adds linebreaks, but "
      "the string is too long to fit on a single line. In this case "
      "implicit string concatenation is the ideal solution to the "
      "problem. Don't forget to leave a space at the end of each "
      "line (or the beginning if you prefer).")

In cases like this, the difference between compile-time and run-time 
concatenation may be significant, especially inside a fast loop.
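On the point about leaving a space at the end of each line: implicit concatenation joins the literals exactly as written and inserts nothing between them. A small sketch (the variable names good and bad are just for illustration):

```python
# Trailing space kept at the end of the first literal.
good = ("implicit concatenation joins "
        "adjacent literals")

# Trailing space forgotten: the two words run together.
bad = ("implicit concatenation joins"
       "adjacent literals")

print(good)  # implicit concatenation joins adjacent literals
print(bad)   # implicit concatenation joinsadjacent literals
```

The parentheses are what let the expression span several lines; the concatenation itself happens simply because the literals are adjacent.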



-- 
Steven
