urllib function

Robin Karpeta rmkarp at karfaz.nildram.co.uk
Sun May 4 08:30:12 EDT 2003


Hi,
I wrote some code that screen-scrapes content from websites.  I have been 
running it on Red Hat Linux 7.3 and now 9.0, with the same results on both.

Here is the code that reads the data from the URL.  It reads from the 
URL in variable u and writes to the file whose name is held in variable fn:

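import urllib

def read_one_pic(u, fn):
	# Fetch the whole resource at URL u into memory.
	data = urllib.urlopen(u).read()
	# Open in binary mode, since the data is a picture.
	f = open(fn, 'wb')
	f.write(data)
	f.close()  # "f.close" without parentheses never calls the method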
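
For larger files, a variant that copies the response to disk in chunks 
instead of holding it all in memory might look like the sketch below.  
The names copy_stream and save_url are hypothetical, and the import 
fallback is only an assumption so that the same code runs on both 
Python 2 and Python 3.  It is not offered as a fix for the slowness.

```python
import shutil

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen          # Python 2, as in the code above


def copy_stream(src, fn, chunk_size=16 * 1024):
    """Copy a file-like object src to the file named fn, in binary mode."""
    f = open(fn, 'wb')
    try:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(src, f, chunk_size)
    finally:
        f.close()


def save_url(u, fn):
    """Stream the resource at URL u into the file named fn (hypothetical helper)."""
    response = urlopen(u)
    try:
        copy_stream(response, fn)
    finally:
        response.close()
```

Keeping copy_stream separate from the network call means it can be 
exercised with any file-like object, not only a live URL.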

Under Python 1.5 this has always worked perfectly.  Under later versions 
(2.1, 2.2) the code still works, but it is VERY slow.  When I ran 
tcpdump, there were no visible error messages under Python 1.5, but 
under the later versions there were many entries like the one below:

11:09:47.617396 db2.home.sys > radius.nildram.co.uk: icmp: db2.home.sys udp port 32806 unreachable [tos 0xc0]
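
One way to narrow down where the time goes is to time the hostname 
lookup separately from the download.  The sketch below is only a 
measuring aid under stated assumptions: timed and time_fetch are 
hypothetical helper names, the import fallback is there so the code runs 
on both Python 2 and Python 3, and nothing here assumes that the ICMP 
messages above are the actual cause.

```python
import socket
import time

try:
    from urllib.request import urlopen   # Python 3
    from urllib.parse import urlparse
except ImportError:
    from urllib import urlopen           # Python 2
    from urlparse import urlparse


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


def time_fetch(u):
    """Return (dns_seconds, fetch_seconds, byte_count) for URL u."""
    host = urlparse(u).hostname
    # Time the name lookup on its own.
    _, dns_seconds = timed(socket.gethostbyname, host)
    # Time the full fetch; this repeats the lookup internally.
    data, fetch_seconds = timed(lambda: urlopen(u).read())
    return dns_seconds, fetch_seconds, len(data)
```

Running time_fetch on the same URL under each Python version would show 
whether the extra time is spent in the lookup or in the transfer itself.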

I have been through the manuals but have not found anything relating to 
this.  I am probably missing something really simple, but I would 
appreciate any help with this.

Many thanks
Robin




