[Python-Dev] Usage of += on strings in loops in stdlib

Antoine Pitrou solipsis at pitrou.net
Thu Feb 14 08:38:58 CET 2013


On Thu, 14 Feb 2013 01:21:40 +0100
Victor Stinner <victor.stinner at gmail.com> wrote:
> 
> UnicodeWriter (using the "writer += str" API) is the fastest method in
> most cases, except for data = ['a'*10**4] * 10**2 (in this case, it's
> 8x slower!). I guess that the overhead comes from the overallocation,
> which then requires shrinking the buffer (shrinking may copy the whole
> string). The overallocation factor may be adapted depending on the
> size.

How about testing on Windows?
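
For anyone who wants to repeat the comparison on another platform, the quoted
workload can be approximated at the Python level. This is only a rough sketch:
it times plain str += against str.join, not the C-level UnicodeWriter discussed
above, and the function names are made up for illustration. Timings will vary
with the platform's memory allocator, so no expected numbers are given.

```python
import timeit

# Workload quoted in the message: 100 references to one 10,000-character string.
data = ['a' * 10**4] * 10**2

def concat_plus_equals(chunks):
    # Repeated += on a str. CPython can sometimes resize the string in place,
    # but that is an implementation detail, not a language guarantee.
    result = ''
    for chunk in chunks:
        result += chunk
    return result

def concat_join(chunks):
    # str.join computes the total length first, then copies each chunk once.
    return ''.join(chunks)

if __name__ == '__main__':
    for func in (concat_plus_equals, concat_join):
        seconds = timeit.timeit(lambda: func(data), number=200)
        print(f'{func.__name__}: {seconds:.4f} s for 200 runs')
```

Both functions build the same 1,000,000-character string, so any difference in
the timings comes from how the intermediate buffer is grown.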

> If computing the final length is cheap (e.g. if it's always the same),
> it's always faster to use UnicodeWriter with a preallocated buffer.

That's not a particularly surprising discovery, is it? ;-)
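
The reason it is unsurprising can be shown with a pure-Python analogy. The
sketch below is not the UnicodeWriter API; it is a hypothetical helper that
uses a bytearray as a stand-in for a preallocated buffer, and it assumes the
data is encodable with the given encoding (ASCII by default). Once the final
length is known, the buffer is allocated once and each chunk is copied exactly
once, with no overallocation and no shrinking at the end.

```python
def join_preallocated(chunks, encoding='ascii'):
    # Hypothetical helper, for illustration only.
    encoded = [chunk.encode(encoding) for chunk in chunks]

    # Compute the final length up front, then allocate the buffer once.
    total = sum(len(piece) for piece in encoded)
    buffer = bytearray(total)

    # Copy each chunk into its slot; equal-length slice assignment
    # never resizes the bytearray.
    offset = 0
    for piece in encoded:
        buffer[offset:offset + len(piece)] = piece
        offset += len(piece)

    return buffer.decode(encoding)
```

At the Python level the encode and decode steps add overhead of their own, so
this is meant to illustrate the allocation pattern, not to be a faster
replacement for ''.join(chunks).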

Regards

Antoine.

