[Pythonmac-SIG] on idle....idle()...and urllib

Jack Jansen jack@oratrix.nl
Mon, 06 Nov 2000 14:41:21 +0100


> This is the code I'm trying and it starts ok then seems to end up in a
> permanent loop. Using urllib.read() hardly ever gets all the data....
> 
> 
> 
> import urllib
> url1 = 'http://slashdot.org/slashdot.rdf'
> url2 = 'http://hack-the-planet.felter.org/rss.xml'
> url3 ='http://www.tomalak.org/recentTodaysLinks.xml'
> 
> url_list = [url1, url2, url3]
> 
> n =1
> 
> for x in url_list:
>     f = open(str(n), 'w')
>     u = urllib.urlopen(x)
>     data = ''
>     
>     t = 0
>     while (u.readline()) != 0:
>         print u.readline()
>         data = data + u.readline()
>         t = t + 1
>         if t >10000:
>             break

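I assume you've massaged this code before publishing it, because the way
it's presented here it would drop two out of every three lines :-) Each pass
through the loop calls readline() three times (once in the while test, once
in the print and once when appending to data), so only every third line ends
up in data.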
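
The effect can be reproduced without a network connection. This is only a sketch: the sample text and the names fake, skipped and kept are made up for illustration, and io.StringIO merely stands in for the object urllib.urlopen() returns. It also tests for an empty result rather than "!= 0", because readline() returns an empty string at end of file and never 0, so the original test stays true forever.

```python
import io

# Hypothetical stand-in for the urlopen() result: six numbered lines.
fake = io.StringIO(u"1\n2\n3\n4\n5\n6\n")

kept = []
while fake.readline():            # consumes lines 1 and 4, which are then lost
    skipped = fake.readline()     # consumes lines 2 and 5 (the original only printed these)
    kept.append(fake.readline())  # only lines 3 and 6 are stored

data = u"".join(kept)
```

Only a third of the input survives in data.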

What I would do is something like:

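import urllib

def getalldata(url):
	u = urllib.urlopen(url)
	rv = ''
	# A single read() may return only part of the data, so keep
	# reading until it returns an empty string (end of data).
	newdata = u.read()
	while newdata:
		rv = rv + newdata
		newdata = u.read()
	u.close()
	return rv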
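
A variation on the same loop, offered as a sketch rather than a definitive version: it takes any file-like object instead of a URL, so it can be tried offline, and it collects the pieces in a list instead of growing a string. The name read_all, the chunk_size value and the io.BytesIO test data are assumptions for illustration and are not part of urllib.

```python
import io

def read_all(stream, chunk_size=8192):
    """Collect everything from a file-like object, one chunk at a time."""
    chunks = []
    chunk = stream.read(chunk_size)
    while chunk:                      # an empty result means end of data
        chunks.append(chunk)
        chunk = stream.read(chunk_size)
    return b"".join(chunks)

# Hypothetical in-memory stand-in for a urlopen() result.
payload = b"x" * 20000
result = read_all(io.BytesIO(payload))
```

Passing the object returned by urlopen() in place of the io.BytesIO instance should behave the same way, since the function relies only on read().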
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm