[New-bugs-announce] [issue2601] [regression] reading from a urllib2 file descriptor happens byte-at-a-time
Matthias Klose
report at bugs.python.org
Tue Apr 8 23:15:30 CEST 2008
New submission from Matthias Klose <doko at debian.org>:
r61009 on the 2.5 branch:

    - Bug #1389051, 1092502: fix excessively large memory allocations when
      calling .read() on a socket object wrapped with makefile().

causes a regression compared to 2.4.5 and 2.5.2:
When reading from a urllib2 file descriptor, Python reads the data one
byte at a time, regardless of how much you ask for. Python versions up
to 2.5.2 read the data in 8K chunks.
This has enough of a performance impact that it increases the download
time for a large file over a gigabit LAN from 10 seconds to 34 minutes. (!)
Trivial/obvious example code:

    import urllib2

    f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
    while 1:
        chunk = f.read()
        if not chunk:
            break

... and then strace it to see the recv()'s chugging along, one byte at
a time.
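To make the expected (pre-regression) behaviour concrete, here is a small sketch of the 8K-chunk read loop that 2.5.2 effectively performed. This is illustrative only: `read_in_chunks` is a hypothetical helper, and io.BytesIO stands in for the network stream, since passing an explicit size to read() does not by itself avoid the byte-at-a-time recv() described above.

```python
import io

def read_in_chunks(f, chunk_size=8192):
    # Hypothetical helper: yield fixed-size chunks from a file-like
    # object, mirroring the 8K reads that Python 2.5.2 performed.
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        yield chunk

# io.BytesIO stands in for the urllib2 response object.
data = b"x" * 20000
src = io.BytesIO(data)
chunks = list(read_in_chunks(src))
# 20000 bytes arrive as two full 8192-byte chunks plus a 3616-byte tail.
```

With the regressed socket wrapper, the same loop still works, but each read() call is serviced by one-byte recv() calls underneath, which is where the slowdown comes from.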
----------
assignee: akuchling
components: Library (Lib)
messages: 65219
nosy: akuchling, doko
priority: high
severity: normal
status: open
title: [regression] reading from a urllib2 file descriptor happens byte-at-a-time
type: performance
versions: Python 2.5
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2601>
__________________________________