[Pythonmac-SIG] on idle....idle()...and urllib
Jack Jansen
jack@oratrix.nl
Mon, 06 Nov 2000 14:41:21 +0100
> This is the code I'm trying and it starts ok then seems to end up in a
> permanent loop. Using urllib.read() hardly ever gets all the data....
>
>
>
> import urllib
> url1 = 'http://slashdot.org/slashdot.rdf'
> url2 = 'http://hack-the-planet.felter.org/rss.xml'
> url3 ='http://www.tomalak.org/recentTodaysLinks.xml'
>
> url_list = [url1, url2, url3]
>
> n =1
>
> for x in url_list:
>     f = open(str(n), 'w')
>     u = urllib.urlopen(x)
>     data = ''
>
>     t = 0
>     while (u.readline()) != 0:
>         print u.readline()
>         data = data + u.readline()
>         t = t + 1
>         if t >10000:
>             break
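I assume you've massaged this code before publishing it, because as
presented here it would drop two out of every three lines :-)
Each pass through the loop calls u.readline() three times (once in the
while test, once in the print, once in the append), and only the third
result is kept. The test also compares against 0, but readline() returns
an empty string at end of file, never 0, so the loop only stops when it
hits the t > 10000 guard.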
What I would do is something like:
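import urllib

def getalldata(url):
    # Keep calling read() until it returns an empty string (end of data),
    # since a single read() may return only part of the document.
    u = urllib.urlopen(url)
    rv = ''
    newdata = u.read()
    while newdata:
        rv = rv + newdata
        newdata = u.read()
    return rv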
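
For anyone who wants to try the read-until-empty idea without touching
the network, here is a minimal sketch. It assumes Python 3 and works on
any file-like object, so the response returned by urllib.request.urlopen()
could be passed in where the demo uses a stand-in. The names read_all,
read_lines and ShortReadStream are made up for this illustration and are
not part of urllib.

```python
import io


class ShortReadStream:
    """Hypothetical stand-in for a slow connection: never returns more
    than max_per_read bytes from a single read() call."""

    def __init__(self, data, max_per_read=3):
        self._buf = io.BytesIO(data)
        self._max = max_per_read

    def read(self, size=-1):
        # Cap every request so callers see partial reads.
        if size is None or size < 0 or size > self._max:
            size = self._max
        return self._buf.read(size)


def read_all(stream, chunk_size=8192):
    """Call read() repeatedly until it returns an empty result."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of data
            break
        chunks.append(chunk)
    # Join once at the end instead of concatenating inside the loop.
    return b"".join(chunks)


def read_lines(stream):
    """Line-oriented variant: readline() is called exactly once per pass,
    so no line is skipped."""
    lines = []
    while True:
        line = stream.readline()
        if not line:  # empty result means end of data, not 0
            break
        lines.append(line)
    return lines


if __name__ == "__main__":
    print(read_all(ShortReadStream(b"hello world")))
    print(read_lines(io.BytesIO(b"a\nb\nc")))
```

The only real difference from getalldata() is that the pieces are
collected in a list and joined once, which avoids rebuilding the string on
every pass. The line-oriented variant shows the shape the original loop
was probably aiming for.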
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm