[Python-3000] characters data type

Wed May 3 19:28:08 CEST 2006

On 5/3/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> "Guido van Rossum" <guido at python.org> wrote:
> > I wonder if that's really true. After all you still pay the overhead
> > for the list. In fact, here's a challenge for you: implement += on
> > bytes to be as fast as the list append + later join; or prove that it
> > can't be done.
>
> I don't believe it can be done.  See my sample at the end of this
> message.

OK, point taken, for this particular set of parameters (building a 16
MB string from 1K identical blocks).

But how much slower will the list.append version be if the blocks are
10 bytes instead of 1024? That could make a huge difference. (In fact,
I timed something similar to what you posted, and the doubling
approach is actually faster when the buffer is 256 bytes or less.)

My conclusion: we need to agree on a real benchmark before giving up.
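
As a starting point for such a benchmark, here is a minimal sketch that
times both approaches across several block sizes. It is an illustration
only, not the code from the quoted message: the function names, the 1 MB
total, and the block sizes are assumptions, and the modern bytearray type
stands in for the mutable bytes type under discussion.

```python
import timeit


def build_with_join(block: bytes, count: int) -> bytes:
    """Collect blocks in a list, then join them once at the end."""
    parts = []
    for _ in range(count):
        parts.append(block)
    return b"".join(parts)


def build_with_iadd(block: bytes, count: int) -> bytes:
    """Grow a single mutable buffer in place with +=."""
    buf = bytearray()
    for _ in range(count):
        buf += block
    return bytes(buf)


def compare(total_size=1 << 20, block_sizes=(10, 256, 1024), repeat=3):
    """Return {block_size: (join_seconds, iadd_seconds)}, best of `repeat`."""
    results = {}
    for size in block_sizes:
        block = b"x" * size
        count = total_size // size
        t_join = min(timeit.repeat(
            lambda: build_with_join(block, count), number=1, repeat=repeat))
        t_iadd = min(timeit.repeat(
            lambda: build_with_iadd(block, count), number=1, repeat=repeat))
        results[size] = (t_join, t_iadd)
    return results


if __name__ == "__main__":
    for size, (t_join, t_iadd) in compare().items():
        print(f"block={size:5d}  join={t_join:.4f}s  +={t_iadd:.4f}s")
```

The numbers depend heavily on the interpreter version, the allocator, and
the block size, so any crossover point should be measured rather than
assumed.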

> Note that removing the string[:] copy in the list.append
> version only reduces the running time by about .07 seconds.

That's because a string slice that returns the whole string is
optimized to an INCREF operation. So you were really copying the same
buffer over and over, which adds to locality and makes a huge
difference in memory performance.
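
The whole-string slice behavior can be checked directly. The sketch below
assumes CPython, where this is an implementation detail rather than a
language guarantee; the helper name is made up for illustration.

```python
def whole_slice_is_same_object(obj):
    """Return True if obj[:] hands back obj itself instead of a copy."""
    return obj[:] is obj


if __name__ == "__main__":
    text = "x" * 1024
    data = b"x" * 1024
    mutable = bytearray(data)

    # Immutable types can safely return themselves for a full slice.
    print("str:      ", whole_slice_is_same_object(text))
    print("bytes:    ", whole_slice_is_same_object(data))
    # A mutable buffer has to copy, or later edits would alias.
    print("bytearray:", whole_slice_is_same_object(mutable))
```

So a benchmark that wants to include a real copy per block has to force
one explicitly, for example by slicing a mutable buffer.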

> Indeed, I don't need to respond to every part in the message.  However,
> not responding to a valid concern/criticism seems to me like a
> head-in-the-sand approach to disagreements and discussions, which
> certainly doesn't help anyone.

Perhaps what seems a valid concern to you appears to me to be endlessly
harping on the same premature issue. Or perhaps I really *didn't* have
enough time to read everything you wrote and I missed it.

Here's a general guideline for anyone posting here: if you want me to
read your posts, make them short and to the point.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)