StringIO proposal: add __iadd__
aleax at mail.comcast.net
Sun Jan 29 19:21:30 EST 2006
Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote:
> ''.join with a list (rather than a generator) arg may be plain worse
> than python StringIO. Imagine building up a megabyte string one
> character at a time, which means making a million-element list and a
> million temporary one-character strings before joining them.
Absolutely wrong: ''.join takes less time for a million items than
StringIO takes for 100,000. It's _so_ easy to measure...!
Nimue:~/pynut alex$ python2.4 -mtimeit 's=["x" for i in xrange(999999)];
10 loops, best of 3: 422 msec per loop
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from StringIO import StringIO'
's=StringIO()' 'for i in xrange(99999): s.write("x")' 'x=s.getvalue()'
10 loops, best of 3: 688 msec per loop
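For anyone who wants to rerun the comparison from a script instead of the
shell, here is a rough sketch. It assumes a modern Python 3 interpreter,
so it uses io.StringIO and range in place of the 2.4-era StringIO module
and xrange; the function names build_with_join and build_with_stringio are
made up for illustration. Absolute numbers will differ from the ones
quoted in this post.

```python
import io
import timeit


def build_with_join(n):
    # Collect n one-character strings in a list, then join once at the end.
    parts = ["x" for _ in range(n)]
    return "".join(parts)


def build_with_stringio(n):
    # Write n one-character strings into an in-memory text buffer.
    buf = io.StringIO()
    for _ in range(n):
        buf.write("x")
    return buf.getvalue()


if __name__ == "__main__":
    n = 100000  # assumed size; raise it to get closer to the runs above
    for fn in (build_with_join, build_with_stringio):
        # Same shape as "10 loops, best of 3" in the timeit output above.
        best = min(timeit.repeat(lambda: fn(n), number=10, repeat=3))
        print("%s: %.1f msec per loop" % (fn.__name__, best / 10 * 1000))
```

Both functions return the same string, so the only thing being compared
is how long each one takes to build it.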
After all, how do you think StringIO is implemented internally? A list
of strings and a ''.join at the end is the best way that comes to mind,
and of course there's going to be overhead (although I'm surprised to
see that the overhead is quite as bad as this). BTW, cStringIO isn't
very good here either:
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from cStringIO import StringIO'
's=StringIO()' 'for i in xrange(999999): s.write("x")'
10 loops, best of 3: 1.28 sec per loop
That's three times as slow as the ''.join you hate so much -- if it's to
take its place, it clearly needs a lot of work.
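The list-of-strings-plus-join idea mentioned above can be sketched in a
few lines. This is only an illustration of the technique, not the actual
StringIO source; the class name ListBuffer is hypothetical, and it covers
only write and getvalue.

```python
class ListBuffer(object):
    """Hypothetical append-only buffer: a list of chunks joined on demand."""

    def __init__(self):
        self._chunks = []

    def write(self, s):
        # Appending to a list is amortized O(1); no string is copied here.
        self._chunks.append(s)

    def getvalue(self):
        # A single join copies each character once, so the work is O(N).
        value = "".join(self._chunks)
        # Keep the joined result as one chunk so repeated calls stay cheap.
        self._chunks = [value]
        return value
```

Each write is then just a list append plus a method call, which is the
kind of per-call overhead the timings above are picking up.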
As for sum, you'll recall I was its original proponent, and my first
implementation did special-case strings (delegating right to ''.join).
But that left O(N**2) behavior in many other cases (lists, tuples), and
sum was eventually whittled down to "summing *numbers*", at least as far
as the intention goes. Perhaps there's space for a "sumsequences" that's
something like itertools.chain but special-cases crucial cases such as
strings (plain and Unicode) and lists? Good luck getting it approved on
python-dev -- I'll gladly implement it, if you can get it past that
hurdle (chatting about it here is entertaining, but unless you can get
BDFL blessing it's in the end futile, and that requires python-dev...).
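To make the "sumsequences" idea concrete, here is one possible shape for
it. This is purely a sketch: no such function exists in the standard
library, and the signature, the choice to dispatch on the first item, and
the empty-input result are all assumptions. It is written for Python 3,
where str and bytes stand in for the Unicode and plain strings mentioned
above.

```python
import itertools


def sumsequences(seqs):
    """Hypothetical helper: concatenate sequences in linear time."""
    seqs = list(seqs)
    if not seqs:
        # Assumption: with nothing to inspect, fall back to an empty list.
        return []
    first = seqs[0]
    if isinstance(first, str):
        # Special case: delegate straight to ''.join.
        return "".join(seqs)
    if isinstance(first, bytes):
        return b"".join(seqs)
    if isinstance(first, list):
        # Special case: extend one result list in place instead of
        # creating a new list for every intermediate sum.
        result = []
        for seq in seqs:
            result.extend(seq)
        return result
    if isinstance(first, tuple):
        return tuple(itertools.chain.from_iterable(seqs))
    # Anything else: behave like itertools.chain and return a lazy iterator.
    return itertools.chain.from_iterable(seqs)
```

For example, sumsequences(["ab", "cd"]) gives "abcd" and
sumsequences([[1], [2, 3]]) gives [1, 2, 3], each with a single pass over
the inputs instead of the repeated copying that naive summing would do.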