Antoine Pitrou wrote:
Simon Cross <hodgestar+pythondev <at> gmail.com> writes:
Well, since the source for _read_chunked includes the comment
# XXX This accumulates chunks by repeated string concatenation,
# which is not efficient as the number or size of chunks gets big.
you might gain some speed improvement with minimal effort by gathering the read data chunks into a list and then returning "".join(chunks) at the end.
+1 for trying this. Given the differences in realloc() performance between platforms, this might be why the problem goes unnoticed under Linux but degenerates under Windows.
And how! The following change dropped the download time using httplib to 2.3 seconds: http://svn.python.org/view/python/trunk/Lib/httplib.py?r1=74523&r2=74655
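For anyone following along without the diff open, the idea can be sketched roughly as below. This is not the committed httplib code: read_chunked is a hypothetical standalone helper, written for Python 3 bytes, and it assumes fp is a buffered binary file-like object whose read(n) returns the full n bytes (a raw socket would need a loop around the read).

```python
import io


def read_chunked(fp):
    """Decode an HTTP/1.1 chunked body from a binary file-like object.

    Hypothetical helper for illustration only, not httplib's _read_chunked.
    """
    chunks = []  # collect the pieces here instead of doing value += piece
    while True:
        size_line = fp.readline()
        # The chunk size is hexadecimal, optionally followed by ";extension".
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        chunks.append(fp.read(size))
        fp.read(2)  # discard the CRLF that terminates each chunk
    # Skip any trailer headers up to the blank line (or end of input).
    while True:
        line = fp.readline()
        if line in (b"\r\n", b"\n", b""):
            break
    # One join at the end copies each byte once, rather than re-copying
    # the accumulated data every time a chunk is appended.
    return b"".join(chunks)


body = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(read_chunked(io.BytesIO(body)))
```

The only part that matters for the speed-up is the list plus the single join at the end; the parsing around it is simplified and does no error handling for truncated input.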
As a side note, it is interesting that even a stdlib module makes this mistake and acknowledges it without trying to fix it.
No longer in this case ;-)

The fix is applied on the trunk, but the problem still exists on the 2.6 branch, 3.1 branch and 3.2 branch. Which of these should I merge to? I assume all of them? Do I need to update any changelog files or similar to indicate this bug has been fixed?

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk