which is better, string concatenation or substitution?

Duncan Booth duncan.booth at invalid.invalid
Mon May 8 04:04:38 EDT 2006


Leif K-Brooks wrote:

> fuzzylollipop wrote:
>> neither, .join() is the fastest
> 
> Please quote what you're replying to.
> 
> No, it's the slowest:
> 
> leif@ubuntu:~$ python -m timeit "'<p>%s</p>\n\n' % 'foobar'"
> 1000000 loops, best of 3: 0.607 usec per loop
> leif@ubuntu:~$ python -m timeit "'<p>' + 'foobar' + '</p>\n\n'"
> 1000000 loops, best of 3: 0.38 usec per loop
> leif@ubuntu:~$ python -m timeit "''.join(['<p>', 'foobar', '</p>\n\n'])"
> 1000000 loops, best of 3: 0.817 usec per loop
> 

If you are only concatenating a few strings, then straight 
concatenation will be faster, but when you are joining many strings 
repeated concatenation can be much slower than join. In the OP's 
original example:

def p(self, paragraph):
     self.source += '<p>' + paragraph + '</p>\n\n'

it is the concatenation to self.source which could become the 
bottleneck; it doesn't really matter how the text of the paragraph is 
assembled.
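
One way to sidestep that, sketched under assumptions: keep the fragments 
in a list and join them once at the end. The class name Page, the parts 
attribute and the getvalue() method are made up for illustration; only 
p() comes from the OP's example.

```python
class Page(object):
    """Hypothetical stand-in for the OP's class."""

    def __init__(self):
        # Assumption: collect fragments in a list instead of
        # growing one big string with += on every call.
        self.parts = []

    def p(self, paragraph):
        # How the paragraph itself is built matters little here.
        self.parts.append('<p>' + paragraph + '</p>\n\n')

    def getvalue(self):
        # Hypothetical helper: one join over all fragments.
        return ''.join(self.parts)


page = Page()
page.p('first')
page.p('second')
source = page.getvalue()
```

Whether this is actually faster for a given document size is something 
to measure rather than assume.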

For most purposes use what looks clearest at the time: it isn't worth the 
hassle of obfuscating your code until you've identified a real CPU hog. On 
the other hand, the str.join idiom is sufficiently common in Python that 
sometimes it wins on clarity and simplicity as well. For example, if you 
build a list of lines to join then you don't have to repeat '\n' on the 
end of each component line.
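
A small illustration of that last point (the line contents are invented 
for the example). Note that join only puts the separator between items, 
so a trailing newline, if you want one, has to be added separately:

```python
lines = ['first line', 'second line', 'third line']

# The separator is written once, not on the end of every component.
text = '\n'.join(lines) + '\n'
```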

BTW, be careful using timeit. I nearly got caught out running your tests:

C:\Python25>python -m timeit "''.join(['<p>', 'foobar', '</p>\n\n'])"
1000000 loops, best of 3: 0.872 usec per loop

C:\Python25>python -m timeit "'<p>' + 'foobar' + '</p>\n\n'"
10000000 loops, best of 3: 0.049 usec per loop

C:\Python25>python -m timeit "'<p>%s</p>\n\n' % 'foobar'"
10000000 loops, best of 3: 0.0495 usec per loop

C:\Python25>cd \python24

C:\Python24>python -m timeit "''.join(['<p>', 'foobar', '</p>\n\n'])"
1000000 loops, best of 3: 1.05 usec per loop

C:\Python24>python -m timeit "'<p>' + 'foobar' + '</p>\n\n'"
1000000 loops, best of 3: 0.359 usec per loop

Spot the really fast concatenations in Python 2.5, which now detects 
the constant strings and concatenates them once only. It also does that 
for the string formatting, which leaves only poor old join to actually do 
any work in these tests.
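
A rough sketch of one way to avoid that trap: bind the operand to a name 
in the timeit setup so that none of the three expressions is built purely 
from constants. The variable name s and the loop count are arbitrary 
choices, and the timings will of course vary by machine and version. From 
the command line the same idea would be 
python -m timeit -s "s = 'foobar'" "'<p>' + s + '</p>\n\n'".

```python
import timeit

# Assumption: 's' stands in for a value only known at run time.
setup = "s = 'foobar'"
statements = [
    "'<p>%s</p>\\n\\n' % s",
    "'<p>' + s + '</p>\\n\\n'",
    "''.join(['<p>', s, '</p>\\n\\n'])",
]

number = 10000
results = {}
for stmt in statements:
    # Best of 3 runs, as the command-line tool reports.
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=3))
    results[stmt] = best
    print('%-40s %.3f usec per loop' % (stmt, best / number * 1e6))
```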


