Refreshing of urllib.urlopen()

Nobody nobody at nowhere.com
Thu Feb 4 11:46:47 EST 2010


On Wed, 03 Feb 2010 21:33:08 -0600, Michael Gruenstaeudl wrote:

> I am fairly new to Python and need advice on the urllib.urlopen()  
> function. The website I am trying to open automatically refreshes  
> after 5 seconds and remains stable thereafter. With  
> urllib.urlopen().read() I can only read the initial but not the  
> refreshed page. How can I access the refreshed page via  
> urlopen().read()? I have already tried calling time.sleep() before
> invoking .read() (see below), but this does not work.

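time.sleep() cannot help here: urlopen() issues a single HTTP request and
returns that one response, and nothing is re-fetched while your script
sleeps. In all probability, the server is instructing the browser to load a
different URL via either a Refresh: header or a <meta http-equiv="refresh">
tag in the page. You will have to retrieve that information, then issue a
second request for the new URL.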
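
A minimal sketch of that two-step approach follows. It is written against
Python 3's urllib.request rather than the Python 2 urllib.urlopen() used in
the question, and every function and class name in it (parse_refresh,
MetaRefreshFinder, refresh_target, read_refreshed) is hypothetical, made up
for illustration. It assumes the refresh value has the usual
"seconds; url=target" shape and decodes the body as latin-1 only so that
scanning for the meta tag never fails on odd bytes.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def parse_refresh(value):
    """Split a refresh value such as '5; url=/next' into (delay, target).

    target is None when the value only asks for a reload of the same URL.
    """
    delay, _, rest = value.partition(";")
    match = re.match(r"\s*url\s*=\s*(.*)", rest, re.IGNORECASE)
    target = match.group(1).strip().strip("'\"") if match else None
    return delay.strip(), target or None


class MetaRefreshFinder(HTMLParser):
    """Remember the content of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "meta" and self.content is None
                and (attrs.get("http-equiv") or "").lower() == "refresh"):
            self.content = attrs.get("content")


def refresh_target(headers, html, base_url):
    """Return the absolute URL the page refreshes to, or None."""
    # Prefer the HTTP header; fall back to the meta tag in the markup.
    value = headers.get("Refresh")
    if value is None:
        finder = MetaRefreshFinder()
        finder.feed(html)
        value = finder.content
    if not value:
        return None
    _, target = parse_refresh(value)
    # The target may be relative, so resolve it against the page's URL.
    return urljoin(base_url, target) if target else None


def read_refreshed(url):
    """Fetch url, then fetch the refresh target if the response names one."""
    with urlopen(url) as response:
        body = response.read()
        target = refresh_target(
            response.headers, body.decode("latin-1"), response.geturl())
    if target is None:
        return body
    with urlopen(target) as response:
        return response.read()
```

Calling read_refreshed() with the page's address would then return the body
of the second page when a refresh target is found, and the original body
otherwise. The sketch follows one refresh only and does not wait out the
delay, on the assumption that the server accepts an immediate request for
the new URL.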

It might even be redirecting via JavaScript, in which case urllib alone
will not get you there, since it does not execute scripts (it's possible to
handle this case, but it's difficult).



