Joining strings versus augmented assignment

Bengt Richter bokr at oz.net
Sat Feb 1 12:34:55 EST 2003


On Fri, 31 Jan 2003 19:38:32 -0800, Erik Max Francis <max at alcyone.com> wrote:

>Andy Todd wrote:
>
>> In his Linux Journal piece on Quixote ... Greg Ward mentions
>> that when joining a series of strings together it is more efficient to
>> use "".join(listOfStrings) rather than += (augmented assignment).
>> 
>> I couldn't find any obvious references to this in the documentation or
>> in this group.
>> 
>> Would anyone care to explain why this is so to a poor ignoramous?
>
>As other people have pointed out, it's because augmented assignment
>needs to repeatedly create and destroy objects, whereas S.join can do
IMO we need to keep in mind that the language is not defined by accidental
details of implementation, and try to avoid leading people into believing
that bad performance for certain operations is necessarily cast in concrete.

IOW, language-wise, what matters is the implementation-independent semantics.
Practically, classifying operations according to expected performance
with current versions is useful, but it is essentially optimization advice,
not language description.

>the concatenation all at the C level and needs to only create one
>object.
AFAIK, there is no rule against compiling Python to generate code that
does lazy joining of strings (nor against special representations and
treatments of small strings that enhance performance for usefully
common cases without changing semantics ;-). Whether it's worth expending
the effort or not is yet another separate issue.
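
To make the lazy-joining idea concrete, here is a minimal sketch. It is purely
illustrative: LazyText, build_eager and build_lazy are hypothetical names, and
nothing here describes what any actual Python implementation does. It only shows
that += semantics do not force a new string to be built at every step.

```python
class LazyText:
    """Hypothetical accumulator: += records a piece, joining happens on demand."""

    def __init__(self, *pieces):
        # Fragments collected so far, not yet concatenated.
        self._pieces = [str(p) for p in pieces]

    def __iadd__(self, other):
        # Augmented assignment just appends; no intermediate string is built.
        self._pieces.append(str(other))
        return self

    def __str__(self):
        # Join once when the text is needed, and cache the joined result.
        if len(self._pieces) > 1:
            self._pieces = ["".join(self._pieces)]
        return self._pieces[0] if self._pieces else ""


def build_eager(words):
    # Plain str +=, which may create a new string object on each pass.
    s = ""
    for w in words:
        s += w
    return s


def build_lazy(words):
    # Same loop and same operator, but the joining is deferred.
    s = LazyText()
    for w in words:
        s += w
    return str(s)
```

Both functions return the same text for the same input, so the observable
semantics are unchanged; only the cost model differs.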
>
>The same precept generally applies to other constructs, such as using
>string formatting (S % T) rather than building the string bit by bit.
>
>	"%d of %s" % (kiom, kio)
>
>is certain to be faster than
Again, you are describing _probable_ performance results for _some_ data
and implementations, not inevitable general consequences of the language design:

===< t_strops.py >========
def percent(kiom, kio):
    "%d of %s" % (kiom, kio)
def add(kiom, kio):
    str(kiom) + " of " + kio
==========================
[ 9:38] C:\pywk\clp>timefuns t_strops -c percent -i 3 -s A -c add -i 3 -s A
           timing oh:  0.000012  ratio
             percent:  0.000043   1.00
                 add:  0.000034   0.80

IOW, for the case of kiom == 3 and kio == 'A', the performance prediction
was not that certain ;-)
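
For anyone wanting to repeat the comparison without the timefuns script used
above, a rough sketch with the standard timeit module follows. The compare
helper and its defaults are assumptions made for illustration, and the functions
return their results here (unlike t_strops.py) so they can be checked. The
numbers will vary by interpreter version and machine, so no particular ratio
should be expected.

```python
import timeit


def percent(kiom, kio):
    return "%d of %s" % (kiom, kio)


def add(kiom, kio):
    return str(kiom) + " of " + kio


def compare(kiom=3, kio="A", number=100_000):
    """Return a dict mapping function name to measured seconds per call."""
    results = {}
    for func in (percent, add):
        # timeit.timeit runs the callable `number` times and returns total seconds.
        total = timeit.timeit(lambda: func(kiom, kio), number=number)
        results[func.__name__] = total / number
    return results


if __name__ == "__main__":
    for name, secs in compare().items():
        print(f"{name:>8}: {secs:.9f} s/call")
```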

I don't doubt that you understand the difference, but IMO we should avoid
describing particular cases of bad performance as if they were part of the
language. Especially when replying to newbie questions.
>
>	str(kiom) + " of " + kio
>
>even though they both do the same thing.
>
Um, not _quite_ the same, on several counts ;-)
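
The counts are not spelled out above. Two differences that can be checked
directly are shown in this small sketch (which may or may not be the ones
meant): %d truncates a float where str() does not, and %s accepts any object
where + insists on a string.

```python
def percent(kiom, kio):
    return "%d of %s" % (kiom, kio)


def add(kiom, kio):
    return str(kiom) + " of " + kio


# Identical result for an int and a str.
assert percent(3, "A") == add(3, "A") == "3 of A"

# %d truncates a float; str() keeps the fractional part.
print(percent(3.7, "A"))   # 3 of A
print(add(3.7, "A"))       # 3.7 of A

# %s converts any object; + needs a str on the right-hand side.
print(percent(3, 42))      # 3 of 42
try:
    add(3, 42)
except TypeError as exc:
    print("add failed:", exc)
```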

Regards,
Bengt Richter



