Speed of string += string

achrist at easystreet.com achrist at easystreet.com
Sat Apr 12 22:59:13 EDT 2003


We went around on this one about 4 years ago. Is this a url:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=37054490.E79ADCC9%40easystreet.com&rnum=1&prev=/groups%3Fnum%3D100%26hl%3Den%26lr%3D%26ie%3DISO-8859-1%26as_drrb%3Db%26q%3Dgroup%253Acomp.lang.python%2Binsubject%253Astring%2Bauthor%253Aachrist%2540easystreet.com%26btnG%3DGoogle%2BSearch%26as_mind%3D12%26as_minm%3D5%26as_miny%3D1993%26as_maxd%3D12%26as_maxm%3D4%26as_maxy%3D1999

About 6 messages into the thread, Tim Peters came up with a way that
was about 5 times faster than string.join().

Now, which is preferred:

	if astring == "":

or 

	if len(astring) == 0:

????


Al

Alex Martelli wrote:
> 
> Mads Orbesen Troest wrote:
> 
> > Given that strings are immutable, is it exceedingly slow (and memory
> > spiking) to do a lot of "string += string" operations in one's code? IOW,
> 
> Yes -- it's THE one major performance trap in naively coded Python,
> since, when "a lot" is a sufficiently large N, a lot of += on short
> strings ends up being O(N squared) where O(N) approaches would be
> available.  And turning something potentially O(N) into something
> O(N squared) is the kind of performance hit that profiling won't
> necessarily catch, because it shows up (as unboundedly bad) only
> for N sufficiently large...
> 
> So, it's basically the one performance trap I find it most necessary
> to teach even to total beginners!
> 
> > does each and every += evaluation lead to a new concatenated copy of the
> > previous string (now freed for garbage collection) and the new string, so
> 
> Exactly (it's not necessarily the case that the previous string is freed
> for garbage collection, but whether it is, or not, a new concatenated copy
> is indeed unavoidable).
> 
> > that the longer the string to which stuff is appended is, the longer times
> > each += operation takes?
> 
> Again, exactly -- s += l takes roughly O(len(s)+len(l)) time (as it
> necessarily has to copy the len(s) previous chars followed by the len(l)
> new ones).
> 
> Building up lists of strings with mylist.append(piece), then joining
> them together at the end with ''.join(mylist), is the canonical solution;
> others, which I've seen already pointed out to you, include using cStringIO.
> 
> Alex




More information about the Python-list mailing list