[Edu-sig] How to Improve code performance

Brent Burley Brent.Burley@disney.com
Thu, 01 Nov 2001 15:59:10 -0800


Thanks for the input - I hadn't really thought about StringIO.  Now that
I've taken a look ...

StringIO is no different than what I've shown - it appends each string
to a list and then uses string.join to build the final result.  There's
also additional overhead since it must support random access file
operations.

cStringIO should be faster, but it might not be.  It allocates a single
string buffer and reallocates it to be larger whenever it runs out of
room (doubling the previous size).  You end up reallocating and copying
the string log2(n) times.  If you were building a very large string, the
difference could be significant.

	Brent

Martijn Faassen wrote:
> 
> Brent Burley wrote:
> [snip]
> > The big problem is that tab=tab+"..." in a loop is expensive since
> > Python must allocate a new string and copy the old one each time.  The
> > running time will increase with the square of the number of records,
> > i.e. O(n*n).
> >
> > It may not be intuitive, but putting all the partial strings in a list
> > and then joining them at the end is more efficient as the running time
> > will increase only linearly with the number of records, i.e. O(n).
> 
> An alternative is to use StringIO (or cStringIO if you're not dealing
> with unicode strings):
> 
> I believe cStringIO can be even faster than the .join() operation,
> but I haven't timed it.