Fastest technique for string concatenation

Will Hall wrsh07 at gmail.com
Tue Oct 5 15:48:47 EDT 2010


On Oct 5, 2:39 pm, Will Hall <wrs... at gmail.com> wrote:
> On Oct 3, 8:19 am, Roy Smith <r... at panix.com> wrote:
>
> > My local news feed seems to have lost the early part of this thread, so
> > I'm afraid I don't know who I'm quoting here:
>
> > > My understanding is that appending to a list and then joining
> > > this list when done is the fastest technique for string
> > > concatenation. Is this true?
>
> > > The 3 string concatenation techniques I can think of are:
>
> > > - append to list, join
> > > - string 'addition' (s = s + char)
> > > - cStringIO
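
The three techniques listed above can be compared with a rough timing
sketch like the one below. It assumes Python 3, where io.StringIO stands
in for cStringIO; the function names and the input size are made up for
illustration, and the numbers will vary by interpreter and machine, so
treat it as a starting point rather than a verdict.

```python
import io
import timeit

def concat_join(pieces):
    # Append each piece to a list, then join once at the end.
    parts = []
    for p in pieces:
        parts.append(p)
    return ''.join(parts)

def concat_add(pieces):
    # Repeated string 'addition' (s = s + piece).
    s = ''
    for p in pieces:
        s = s + p
    return s

def concat_stringio(pieces):
    # Write into an in-memory text buffer (io.StringIO on Python 3,
    # assumed here as the replacement for cStringIO).
    buf = io.StringIO()
    for p in pieces:
        buf.write(p)
    return buf.getvalue()

if __name__ == '__main__':
    pieces = ['x'] * 10000  # hypothetical workload
    for fn in (concat_join, concat_add, concat_stringio):
        seconds = timeit.timeit(lambda: fn(pieces), number=10)
        print('%-16s %.4f s' % (fn.__name__, seconds))
```

All three functions return the same string, so any difference is purely
in time and memory behaviour.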
>
> > There is a fourth technique, and that is to avoid concatenation in the
> > first place.   One possibility is to use the common append/join pattern
> > mentioned above:
>
> > vector = []
> > while (stuff happens):
> >    vector.append(whatever)
> > my_string = ''.join(vector)
>
> > But, it sometimes (often?) turns out that you don't really need
> > my_string.  It may just be a convenient way to pass the data on to the
> > next processing step.  If you can arrange your code so the next step can
> > take the vector directly, you can avoid creating my_string at all.
>
> > For example, if all you're going to do is write the string out to a file
> > or network socket, you could use vectored I/O, with something like
> > python-writev (http://pypi.python.org/pypi/python-writev/1.1).  If
> > you're going to iterate over the string character by character, you
> > could write an iterator which does that without the intermediate copy.  
> > Something along the lines of:
>
> >     def each(self):
> >         for s in self.vector:
> >             for c in s:
> >                 yield c
>
> > Depending on the amount of data you're dealing with, this could be a
> > significant improvement over doing the join().
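
A standalone version of the iterator idea quoted above might look like
the sketch below. It uses itertools.chain.from_iterable from the standard
library instead of the hand-written nested loops; the function name
each_char and the sample data are hypothetical.

```python
from itertools import chain

def each_char(vector):
    # Lazily yield every character of every string in vector, in order,
    # without first building the joined string.
    return chain.from_iterable(vector)

vector = ['spam', ' ', 'eggs']

# Consume the characters one at a time; no intermediate copy is made.
vowels = sum(1 for c in each_char(vector) if c in 'aeiou')
assert vowels == 2

# Joining the characters gives the same result as joining the pieces.
assert ''.join(each_char(vector)) == ''.join(vector)
```

This only pays off if the consumer really can work character by
character; if it needs the whole string anyway, the join is still
required.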
>
> Okay. I've never responded to one of these before, so please correct
> me if I'm making any large blunders.  I recently read Guido's
> Python Patterns -- An Optimization Anecdote, and I was wondering
> whether a method similar to the one he suggests would work here.
>
> My suggestion:
> def arrayConcat(source):
>     # source: a sequence of single characters
>     return array.array('c', source).tostring()
>
> Am I missing something, or will this work?
>
> Thanks,
> Will

My bad, I forgot the 'import array' statement at the top.
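
For anyone trying this on Python 3, here is a rough equivalent, with some
assumptions spelled out: the 'c' typecode no longer exists there and
tostring() was replaced by tobytes(), so this sketch uses 'B' (unsigned
char) and takes a sequence of integers in range(256), which is also the
kind of input Guido's anecdote starts from. The name array_concat and the
sample data are made up. Note that it builds a bytes object, not a str,
and it only applies when the pieces are single bytes rather than
arbitrary strings.

```python
import array

def array_concat(source):
    # source: iterable of integers in range(256), e.g. ASCII codes.
    # 'B' is the unsigned-char typecode; tobytes() copies the raw
    # buffer out as a bytes object in one step.
    return array.array('B', source).tobytes()

codes = [ord(c) for c in 'hello']
assert array_concat(codes) == b'hello'

# The built-in bytearray type gives the same result without the import.
assert bytes(bytearray(codes)) == b'hello'
```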


