[Python-ideas] Create a StringBuilder class and use it everywhere

M.-A. Lemburg mal at egenix.com
Mon Aug 29 12:25:45 CEST 2011


Masklinn wrote:
> On 2011-08-29, at 11:27 , M.-A. Lemburg wrote:
>> Dirkjan Ochtman wrote:
>>> On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg <mal at egenix.com> wrote:
>>>> I think you should use cStringIO in your class implementation.
>>>> The list + join idiom is nice, but it has the disadvantage of
>>>> creating and keeping alive many small string objects (with all
>>>> the memory overhead and fragmentation that goes along with it).
>>>
>>> AFAIK using cStringIO just for string building is much slower than
>>> using list.append() + join(). IIRC we tested some micro-benchmarks on
>>> this for Mercurial output (where it was a significant part of the
>>> profile for some commands). That was on Python 2, of course, it may be
>>> better in io.StringIO and/or Python 3.
>>
>> Turns out you're right (list.append must have gotten a lot faster
>> since I last tested this years ago, or I simply misremembered
>> the results).
>>
>> > python2.6 teststringbuilding.py array cstringio listappend
>> Running test array ...
>>   669.68 ms
>> Running test cstringio ...
>>   563.95 ms
>> Running test listappend ...
>>   389.22 ms
>>
>> > python2.7 teststringbuilding.py array cstringio listappend
>> Running test array ...
>>   775.32 ms
>> Running test cstringio ...
>>   679.88 ms
>> Running test listappend ...
>>   375.19 ms
> 
> Converting your code straight to bytes (so array still works) yields this on Python 3.2.1:
> 
>     > python3.2 timetest.py io array listappend
>     Running test io ...
>        334.03 ms
>     Running test array ...
>        776.66 ms
>     Running test listappend ...
>        314.90 ms
> 
> For string (excluding array):
> 
>      > python3.2 timetest.py io listappend
>      Running test io ...
>         451.45 ms
>      Running test listappend ...
>         356.39 ms

Unicode works with the array module as well. Just use 'u' as
array code and replace fromstring/tostring with
fromunicode/tounicode.
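
A minimal sketch of that substitution, assuming Python 3. The helper
name build_with_array is made up for illustration and this is not the
code from the test script. The 'u' type code was deprecated in later
Python versions, so the sketch picks 'w' where the array module
offers it and falls back to 'u' otherwise:

```python
import array

# 'u' is the type code named above. Newer Pythons deprecate it in
# favour of 'w', so use 'w' when this interpreter provides it.
TYPECODE = 'w' if 'w' in array.typecodes else 'u'

def build_with_array(chunks):
    """Concatenate text chunks using an array as the buffer (hypothetical helper)."""
    buf = array.array(TYPECODE)
    for chunk in chunks:
        buf.fromunicode(chunk)   # text counterpart of fromstring()
    return buf.tounicode()       # text counterpart of tostring()

result = build_with_array(["spam", " ", "eggs"])
```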

In any case, the array module approach appears to be the
slowest of all three tests.
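
For anyone who wants to repeat the comparison on text strings, here is
a rough sketch of such a timing script, assuming Python 3. It is not
the teststringbuilding.py used above: the workload (CHUNKS), the
repeat count and all function names are assumptions chosen for
illustration, so absolute numbers will not match the ones quoted.

```python
import array
import io
import timeit

# Hypothetical workload: many small strings, as in typical output building.
CHUNKS = ["abcdefghij"] * 10000

# 'u' is deprecated in newer Pythons; use 'w' when available.
TYPECODE = 'w' if 'w' in array.typecodes else 'u'

def test_listappend(chunks=CHUNKS):
    parts = []
    append = parts.append          # avoid repeated attribute lookup
    for chunk in chunks:
        append(chunk)
    return "".join(parts)

def test_io(chunks=CHUNKS):
    buf = io.StringIO()
    write = buf.write
    for chunk in chunks:
        write(chunk)
    return buf.getvalue()

def test_array(chunks=CHUNKS):
    buf = array.array(TYPECODE)
    for chunk in chunks:
        buf.fromunicode(chunk)
    return buf.tounicode()

TESTS = {"listappend": test_listappend, "io": test_io, "array": test_array}

def run_tests(names, repeat=20):
    """Time each named test; return a dict of name -> total milliseconds."""
    timings = {}
    for name in names:
        seconds = timeit.timeit(TESTS[name], number=repeat)
        timings[name] = seconds * 1000.0
    return timings

if __name__ == "__main__":
    for name, ms in run_tests(["io", "array", "listappend"]).items():
        print("Running test %s ...\n   %.2f ms" % (name, ms))
```

All three functions return the same string, so only the building
strategy differs between the timed runs.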

-- 
Marc-Andre Lemburg
eGenix.com



