urllib slow on FreeBSD 4.7? sockets too

Mike Brown mike at skew.org
Thu Nov 21 20:10:48 EST 2002


"Bengt Richter" <bokr at oz.net> wrote in message
news:arjk6q$ih5$0 at 216.39.172.122...
> how about (untested)
>
>     f = open('file2.ext', 'w')
>     u = urllib.urlopen("http://192.168.0.4/index.asp?file=name")
>     f.writelines(u.xreadlines())
>     f.close()

The object returned by urlopen() does not have an xreadlines method, so that
won't work. I don't have much new info to contribute, but here are some more
tests.

As was pointed out, all the delay is in the reading, and it really only
happens when dealing with these file-wrapped sockets as returned by
urlopen(). It happens even when the server is on localhost.

import urllib, time
starttime = time.time()
u = urllib.urlopen('http://localhost/4MB_file')
bytes = u.read()
u.close()
endtime = time.time()
elapsed = endtime - starttime
length = len(bytes)
print "bytes: %d; time: %0.3fs (%d KB/s)" % (
    length, elapsed, length / 1024. / elapsed)

Result on Mandrake 8.1, Python 2.1.1:
bytes: 4343332; time: 0.141s (29992 KB/s)
...and subsequent runs are about DOUBLE that speed.

Result on FreeBSD 4.7, Python 2.2.1 (on comparable hardware):
bytes: 4343332; time: 9.404s (451 KB/s)
...and subsequent runs are the same speed.
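
One way to check whether the file wrapper itself is the slow part is to push
the same payload through a local socket and read it once via makefile() and
once via a plain recv() loop. The following is only a sketch under some
assumptions: it is written for a current Python 3 rather than the 2.1/2.2
interpreters measured above, it uses socket.socketpair() (a local pair, not a
TCP connection to an HTTP server), and the name timed_transfer and the 64 KB
recv size are made up for illustration.

```python
import socket
import threading
import time

def timed_transfer(payload, use_makefile):
    """Send payload over a local socket pair; return (data, seconds) for the read side."""
    a, b = socket.socketpair()

    def writer():
        # Write everything, then close so the reader sees end-of-file.
        a.sendall(payload)
        a.close()

    t = threading.Thread(target=writer)
    t.start()
    start = time.time()
    if use_makefile:
        # File-wrapped socket, similar in spirit to what urlopen() hands back.
        f = b.makefile('rb')
        data = f.read()
        f.close()
    else:
        # Plain recv() loop with a fixed buffer size (64 KB is an arbitrary choice).
        chunks = []
        while True:
            chunk = b.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        data = b''.join(chunks)
    elapsed = time.time() - start
    t.join()
    b.close()
    return data, elapsed

if __name__ == '__main__':
    payload = b'a' * (4 * 1024 * 1024)
    for label, use_makefile in (('makefile', True), ('recv loop', False)):
        data, elapsed = timed_transfer(payload, use_makefile)
        rate = len(data) / 1024.0 / max(elapsed, 1e-9)
        print("%-10s %d bytes in %0.3fs (%d KB/s)" % (label, len(data), elapsed, rate))
```

If the two timings differ widely on one platform but not the other, that
would point at the file wrapper rather than at the socket layer.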

If it really is an allocation problem, I thought perhaps you could look at
the number of bytes ahead of time via u.headers['content-length'] and pass
that to u.read() so that it could allocate what it needed, but doing so
results in no improvement.
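
For reference, the Content-Length idea can be sketched roughly as below. This
is an illustration only, written for a current Python 3 (where urlopen() lives
in urllib.request), and the helper name read_with_length_hint is hypothetical.
It accepts any object that has a headers mapping and a read() method.

```python
def read_with_length_hint(response):
    """Read a response body, passing Content-Length to read() when the header is present."""
    length = response.headers.get('Content-Length')
    if length is None:
        # No size advertised: fall back to reading until end-of-file.
        return response.read()
    # Ask for exactly the advertised number of bytes; fewer may come back
    # if the connection is closed early.
    return response.read(int(length))

# Possible usage against the test URL from this post (needs a local server):
#   import urllib.request
#   u = urllib.request.urlopen('http://localhost/4MB_file')
#   data = read_with_length_hint(u)
#   u.close()
```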

Finally, I'm not sure a demo like the following is all that revealing, but
sockets on FreeBSD (where you'd think they'd be stellar) do not perform all
that well. Nevertheless, this doesn't explain the exceptionally slow
throughput (under 500 KB/s) for a read() on the urlopen()-produced object.
Any suggestions for tuning performance would be much appreciated.

% cat server.py
from socket import *
import time

sock = socket(AF_INET, SOCK_STREAM)
sock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
sock.bind(('',8090))
sock.listen(512)

def listen():
    # Accept one connection, read until the peer closes it, and report
    # the throughput seen on the receiving end.
    (conn, addr) = sock.accept()
    try:
        starttime = time.time()
        fd = conn.makefile('r')
        bytes = fd.read()
        elapsed = time.time() - starttime
        l = len(bytes)
        print "read %d bytes in %0.3fs (%d KB/s)" % (
            l, elapsed, l / 1024. / elapsed)
    finally:
        conn.close()

while 1:
    listen()


% cat client.py
from socket import *
import time

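def openSocket(port):
    # Connect to the test server on this machine. Connection errors are
    # allowed to propagate instead of returning None and failing later.
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect(('localhost', port))
    return sock

def send(sock, bytes):
    # Call send() until everything is written; return the number of
    # send() calls ("chunks") that were needed.
    chunks = 0
    l = len(bytes)
    n = l
    offset = 0
    while offset < l and n > 0:
        chunks += 1
        n = sock.send(bytes[offset:])
        offset += n
    return chunks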

data = 'a'*10240
starttime = time.time()
for i in range(10,60):
    sock = openSocket(8090)
    sent = data * i
    l = len(sent)
    print 'Sending %d bytes...' % l,
    chunks = send(sock, sent)
    if chunks > 1:
        print '(needed %d chunks)' % chunks
    else:
        print
    sock.close()
endtime = time.time()
print "elapsed time: %0.3f" % (endtime - starttime)


Run server.py in one window, then client.py in another, on the same machine.

On the FreeBSD 4.7-STABLE machine, you need 2 to 17 chunks each time, and
this is what the receiving end looks like:
read 501760 bytes in 0.053s (9160 KB/s)
read 512000 bytes in 0.041s (12338 KB/s)
read 522240 bytes in 0.059s (8707 KB/s)
read 532480 bytes in 0.064s (8092 KB/s)
read 542720 bytes in 0.058s (9163 KB/s)
read 552960 bytes in 0.068s (7927 KB/s)
read 563200 bytes in 0.070s (7879 KB/s)
read 573440 bytes in 0.054s (10460 KB/s)
read 583680 bytes in 0.065s (8770 KB/s)
read 593920 bytes in 0.074s (7788 KB/s)
read 604160 bytes in 0.078s (7520 KB/s)

On Linux (Mandrake 8.1; on a comparable machine) everything is consistently
sent in one chunk, and you get numbers like these on the receiving end:
read 501760 bytes in 0.012s (42179 KB/s)
read 512000 bytes in 0.006s (83111 KB/s)
read 522240 bytes in 0.009s (58620 KB/s)
read 532480 bytes in 0.009s (59626 KB/s)
read 542720 bytes in 0.009s (58273 KB/s)
read 552960 bytes in 0.009s (59419 KB/s)
read 563200 bytes in 0.009s (59254 KB/s)
read 573440 bytes in 0.010s (54012 KB/s)
read 583680 bytes in 0.011s (52269 KB/s)
read 593920 bytes in 0.011s (53108 KB/s)
read 604160 bytes in 0.007s (81740 KB/s)
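
The chunk counts show that a single send() call accepts less data on the
FreeBSD machine than on the Linux one. One thing that may be worth comparing
on both systems is the default socket buffer size. That is an assumption, not
something the numbers above establish. A minimal sketch for printing the
defaults, written for a current Python 3 and using a made-up helper name
default_buffer_sizes:

```python
import socket

def default_buffer_sizes():
    """Return (send_buffer, receive_buffer) sizes in bytes for a fresh TCP socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sndbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()
    return sndbuf, rcvbuf

if __name__ == '__main__':
    sndbuf, rcvbuf = default_buffer_sizes()
    print("SO_SNDBUF: %d bytes; SO_RCVBUF: %d bytes" % (sndbuf, rcvbuf))
```

Running this on each machine gives a quick side-by-side of the defaults; how
the kernel reports these values differs between systems, so they are best
treated as a rough comparison.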






