[Python-ideas] Create a StringBuilder class and use it everywhere

Masklinn masklinn at masklinn.net
Mon Aug 29 11:44:13 CEST 2011


On 2011-08-29, at 11:27 , M.-A. Lemburg wrote:
> Dirkjan Ochtman wrote:
>> On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg <mal at egenix.com> wrote:
>>> I think you should use cStringIO in your class implementation.
>>> The list + join idiom is nice, but it has the disadvantage of
>>> creating and keeping alive many small string objects (with all
>>> the memory overhead and fragmentation that goes along with it).
>> 
>> AFAIK using cStringIO just for string building is much slower than
>> using list.append() + join(). IIRC we tested some micro-benchmarks on
>> this for Mercurial output (where it was a significant part of the
>> profile for some commands). That was on Python 2, of course; it may be
>> better with io.StringIO and/or Python 3.
> 
> Turns out you're right (list.append must have gotten a lot faster
> since I last tested this years ago, or I simply misremembered
> the results).
> 
>> python2.6 teststringbuilding.py array cstringio listappend
> Running test array ...
>   669.68 ms
> Running test cstringio ...
>   563.95 ms
> Running test listappend ...
>   389.22 ms
> 
>> python2.7 teststringbuilding.py array cstringio listappend
> Running test array ...
>   775.32 ms
> Running test cstringio ...
>   679.88 ms
> Running test listappend ...
>   375.19 ms

Converting your code straight to bytes (so array still works) yields this on Python 3.2.1:

    > python3.2 timetest.py io array listappend
    Running test io ...
       334.03 ms
    Running test array ...
       776.66 ms
    Running test listappend ...
       314.90 ms

The same test with str instead of bytes (excluding array):

    > python3.2 timetest.py io listappend
    Running test io ...
       451.45 ms
    Running test listappend ...
       356.39 ms
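
For anyone who wants to try this locally, here is a minimal sketch of the kind of comparison involved. It is not the actual timetest.py (which isn't reproduced in this thread); the function names, the chunk contents and the loop count are all assumptions made for illustration, so absolute numbers will not match the ones above.

```python
import io
import timeit
from array import array

N = 100000        # appends per run (arbitrary, not taken from the original script)
CHUNK = b"abc"    # hypothetical payload; the original test data is not shown


def build_io(n=N, chunk=CHUNK):
    # Accumulate into an in-memory binary buffer.
    buf = io.BytesIO()
    write = buf.write
    for _ in range(n):
        write(chunk)
    return buf.getvalue()


def build_array(n=N, chunk=CHUNK):
    # Accumulate into an array of unsigned bytes.
    arr = array("B")
    frombytes = arr.frombytes
    for _ in range(n):
        frombytes(chunk)
    return arr.tobytes()


def build_listappend(n=N, chunk=CHUNK):
    # Collect the pieces in a list and join once at the end.
    parts = []
    append = parts.append
    for _ in range(n):
        append(chunk)
    return b"".join(parts)


def build_str_io(n=N, chunk="abc"):
    # str variant of build_io, using io.StringIO.
    buf = io.StringIO()
    write = buf.write
    for _ in range(n):
        write(chunk)
    return buf.getvalue()


def build_str_listappend(n=N, chunk="abc"):
    # str variant of build_listappend.
    parts = []
    append = parts.append
    for _ in range(n):
        append(chunk)
    return "".join(parts)


def run(tests, repeat=3):
    # Report the best of `repeat` single runs, in milliseconds.
    for name, func in tests:
        best = min(timeit.repeat(func, number=1, repeat=repeat))
        print("Running test %s ..." % name)
        print("   %.2f ms" % (best * 1000))


if __name__ == "__main__":
    run([("io", build_io), ("array", build_array),
         ("listappend", build_listappend)])
    run([("io", build_str_io), ("listappend", build_str_listappend)])
```

All builders return the same value for the same inputs, so only the accumulation strategy differs between the timed runs.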



