[Tutor] Using urllib to retrieve info

Kent Johnson kent37 at tds.net
Mon Aug 8 19:48:42 CEST 2005


David Holland wrote:
> Kent,
> 
> Sorry, I should have included my code.
> This is what I wrote:
> import urllib
> import urllib2
> f = urllib.urlopen("http://support.mywork.co.uk/index.php?node=2371&pagetree=&fromid=20397&objectid=21897").read()
> newfile = open("newfile.html",'w')
> newfile.write(f)
> newfile.close()
> print 'finished'
> 
> It runs fine but the file saved to disk is the
> information at : 'http://support.mywork.co.uk'
> not
> 'http://support.mywork.co.uk/index.php?node=2371&pagetree=&fromid=20397&objectid=21897'

Something odd is going on with this site, and it has nothing to do with Python. Playing around with the two URLs in Firefox, I get different results just by reloading the page. When I load the two URLs with curl, I get the same content for both.

Possibly there is something going on with cookies; you might take a look at urllib2 and its support for cookies to see if you can get it working the way you want.
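
As a starting point, here is a minimal sketch of a cookie-aware opener. It is written with the Python 3 module names (http.cookiejar and urllib.request); in the Python 2 used in this thread the equivalents are cookielib and urllib2. The helper names make_cookie_opener and fetch_with_cookies are made up for illustration, and the idea that the site needs a cookie from an earlier page before it will serve the long URL is only a guess I have not verified.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Build an opener that stores cookies from responses and sends
    them back on later requests. Returns (opener, jar)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch_with_cookies(opener, first_url, target_url):
    """Hypothetical helper: visit first_url so the server can set any
    cookies, then request target_url with those cookies attached.
    Assumes (unverified) that the site depends on such a cookie."""
    opener.open(first_url).read()
    return opener.open(target_url).read()
```

You could call fetch_with_cookies(opener, "http://support.mywork.co.uk", the_long_url) and write the result to newfile.html as in your script. Looking at the contents of the jar after the first request would also show whether the site sets any cookies at all.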

Kent


