restriction on sum: intentional bug?
steven at REMOVE.THIS.cybersource.com.au
Mon Oct 19 05:15:41 CEST 2009
On Sun, 18 Oct 2009 19:52:41 -0700, Ethan Furman wrote:
> This is obviously slow on strings, but mention of that is already in the
> docs, and profiling will also turn up such bottlenecks. Get the code
> working first, then optimize, yes?
Well, maybe. Premature optimization and all, but sometimes you just
*know* something is going to be slow, so you avoid it.
And it's amazing how O(N**2) algorithms can hide for years. Within the
last month or two, there was a bug reported for httplib involving
repeated string concatenation.
I can only imagine that the hidden O(N**2) behaviour was there in the
code for years before somebody noticed it, reported it, spent hours
debugging it, and finally discovered the cause and produced a patch.
The amazing thing is, if you look in the httplib.py module, you see this comment:
# XXX This accumulates chunks by repeated string concatenation,
# which is not efficient as the number or size of chunks gets big.
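The quadratic cost comes from copying the entire accumulated string on every concatenation. A minimal sketch (not the httplib code, just an illustration) of the accumulator pattern next to the join() idiom:

```python
# Repeated += on a string copies everything built so far on each
# step, so N chunks cost O(N**2) character copies in the worst case.
# (CPython can sometimes resize a string in place as an optimization,
# but that is an implementation detail you cannot rely on.)
def accumulate_concat(chunks):
    result = ""
    for chunk in chunks:
        result += chunk      # copies len(result) characters each time
    return result

def accumulate_join(chunks):
    return "".join(chunks)   # one pass, one allocation: O(N) total

chunks = ["x" * 10] * 1000
assert accumulate_concat(chunks) == accumulate_join(chunks)
```

Both produce the same string; only the join() version scales linearly with the total amount of data.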
> We've all seen questions on this
> list with folk using the accumulator method for joining strings, and
> then wondering why it's so slow -- the answer given is the same as we
> would give for sum()ing a list of strings -- use join instead. Then we
> have Python following the same advice we give out -- don't break
> duck-typing; any ensuing errors are the responsibility of the caller.
I'd be happy for sum() to raise a warning rather than an exception, and
to do so for both strings and lists. Python, after all, is happy to let
people shoot themselves in the foot, but it's only fair to give them
warning that the gun is loaded :)
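For context, the status quo that a warning would replace: sum() raises outright for a str start value, while lists (equally quadratic under repeated addition) pass through silently. A quick sketch of the current behaviour:

```python
# sum() special-cases str for the start value and raises a TypeError
# rather than silently running an O(N**2) concatenation loop; lists
# get no such guard even though repeated list addition is also
# quadratic.
try:
    sum(["a", "b", "c"], "")
    raised = False
except TypeError:
    raised = True            # sum() refuses strings outright
assert raised

assert "".join(["a", "b", "c"]) == "abc"      # the recommended spelling
assert sum([[1], [2], [3]], []) == [1, 2, 3]  # allowed, but O(N**2)
```

The asymmetry is the point of the complaint: the same performance trap exists for lists, yet only strings are blocked.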