urllib slow on FreeBSD 4.7? sockets too
Andrew MacIntyre
andymac at bullseye.apana.org.au
Sat Nov 23 23:15:12 EST 2002
On Sat, 23 Nov 2002, Mike Brown wrote:
> "Jarkko Torppa" <torppa at staff.megabaud.fi> wrote:
> > Seems that stdio is somehow confused, try this
> >
> > import urllib, time, os
> >
> > starttime = time.time()
> > u = urllib.urlopen('http://localhost/4m')
> > fn = u.fp.fileno()
> > bytea = [ ]
> > while 1:
> >     bytes = os.read(fn, 16 * 1024)
> >     if bytes == '':
> >         break
> >     bytea.append(bytes)
> > bytes = ''.join(bytea)
> > u.close()
>
> [...]
>
> Well, look at that...
>
> bytes: 4241.5K; time: 0.322s (13171 KB/s)
>
> That's much better. At least, it now seems to be hitting the socket speed
> cap.
I'm glad that I said (in my previous post) that realloc() _may_ be the
cause of what you're seeing, because it's not.
I haven't gotten right to the bottom of the matter; however, the following
patch against the 2.2.2 sources makes an enormous difference on my system:
---8<---8<---8<---8<---
*** Lib/httplib.py.orig Mon Oct 7 11:18:17 2002
--- Lib/httplib.py Sun Nov 24 14:44:16 2002
***************
*** 210,216 ****
      # See RFC 2616 sec 19.6 and RFC 1945 sec 6 for details.
      def __init__(self, sock, debuglevel=0, strict=0):
!         self.fp = sock.makefile('rb', 0)
          self.debuglevel = debuglevel
          self.strict = strict
--- 210,216 ----
      # See RFC 2616 sec 19.6 and RFC 1945 sec 6 for details.
      def __init__(self, sock, debuglevel=0, strict=0):
!         self.fp = sock.makefile('rb', -1)
          self.debuglevel = debuglevel
          self.strict = strict
---8<---8<---8<---8<---
With the 2.2.2 release source, I get about 113kB/s retrieving a 4MB file
from a localhost URL. With the patch applied, I get 4-5.5MB/s.
This is on a FreeBSD 4.4 SMP system (dual Celeron 300A, 128MB RAM) with
ATA66 drives.
The change turns the socket's file object from unbuffered to buffered
with a default buffer size (which I believe is 1024 bytes).
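To see the effect of the second argument to makefile() in isolation, here
is a rough sketch that pushes data through a local socket pair and reads
it back in small chunks, once unbuffered and once buffered. It is written
for a current Python 3 interpreter, not 2.2.2; Python 3's makefile() is
built on the io module rather than C stdio, so the numbers will not match
the ones in this thread. The names timed_read and PAYLOAD, the 1 MiB
payload and the 64-byte chunk size are made up for the example.

```python
import socket
import threading
import time

PAYLOAD = b"x" * (1024 * 1024)  # 1 MiB of test data (arbitrary size)


def timed_read(bufsize, chunk=64):
    """Send PAYLOAD over a local socket pair and read it back in small
    chunks through makefile('rb', bufsize).

    Returns (bytes_read, seconds_elapsed).
    """
    a, b = socket.socketpair()

    def send():
        a.sendall(PAYLOAD)
        a.close()  # closing the sending side signals EOF to the reader

    sender = threading.Thread(target=send)
    sender.start()

    # bufsize 0 -> unbuffered file object; -1 -> default-sized buffer
    fp = b.makefile('rb', bufsize)
    total = 0
    start = time.time()
    while True:
        data = fp.read(chunk)
        if not data:  # empty bytes object means EOF
            break
        total += len(data)
    elapsed = time.time() - start

    fp.close()
    b.close()
    sender.join()
    return total, elapsed


if __name__ == "__main__":
    for label, bufsize in (("unbuffered", 0), ("buffered", -1)):
        n, secs = timed_read(bufsize)
        print("%s: %d bytes in %.3fs" % (label, n, secs))
```

With bufsize 0 every small read() goes to the socket separately, whereas
with -1 the file object fills its buffer in larger reads and serves the
small requests from memory. How large the gap is will depend on the
platform and the Python version.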
I don't know what the implications of this change are in other
circumstances, so I can't recommend this as a permanent patch. There
appears to be no easy way to set this buffering option from the urllib
or even httplib APIs.
At the moment I don't have the FreeBSD library source readily accessible
to investigate the stdio (specifically fread()) implementation in the
unbuffered case.
--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au | Snail: PO Box 370
andymac at pcug.org.au | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia