urllib2 timeout not working - stalls for an hour or so
Peter Otten
__peter__ at web.de
Fri Sep 2 09:04:29 EDT 2016
Sumeet Sandhu wrote:
> Hi,
>
> I use urllib2 to grab google.com webpages on my Mac over my Comcast home
> network.
>
> I see about 1 error for every 50 pages grabbed. Most exceptions are
> ssl.SSLError, very few are socket.error and urllib2.URLError.
>
> The problem is - after a first exception, urllib2 occasionally stalls for
> up to an hour (!), at either the urllib2.urlopen or the response.read stage.
>
> Apparently the urllib2 and socket timeouts are not effective here - how do
> I fix this?
>
> ----------------
> import urllib2
> import socket
> from sys import exc_info as sysExc_info
> timeout = 2
> socket.setdefaulttimeout(timeout)
>
> try :
>     req = urllib2.Request(url,None,headers)
>     response = urllib2.urlopen(req,timeout=timeout)
>     html = response.read()
> except :
>     e = sysExc_info()[0]
>     open(logfile,'a').write('Exception: %s \n' % e)
> < code that follows this : after the first exception, I try again for a
> few tries >
I'd use separate try...except blocks for response = urlopen() and
response.read(). If the problem originates with read(), you could try
replacing the single read() with select.select([response.fileno()], [], [],
timeout) calls in a loop.
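
Something along these lines, as a rough sketch (Python 2 syntax to match
urllib2; the url and headers values, the 8192 chunk size, and the print
logging are placeholders, not taken from the original post):

import select
import socket
import urllib2

timeout = 2
url = 'http://www.google.com/'            # placeholder URL
headers = {'User-Agent': 'Mozilla/5.0'}   # placeholder headers

req = urllib2.Request(url, None, headers)

# Phase 1: opening the connection gets its own try...except.
try:
    response = urllib2.urlopen(req, timeout=timeout)
except (urllib2.URLError, socket.error) as e:
    # ssl.SSLError is a subclass of socket.error, so it is caught here too
    print 'urlopen failed: %r' % e
else:
    # Phase 2: read the body in chunks, waiting at most `timeout` seconds
    # for each chunk with select() instead of doing one big read().
    chunks = []
    try:
        while True:
            ready, _, _ = select.select([response.fileno()], [], [], timeout)
            if not ready:
                print 'read timed out'
                break
            chunk = response.read(8192)
            if not chunk:        # empty string means EOF
                break
            chunks.append(chunk)
    except socket.error as e:
        print 'read failed: %r' % e
    html = ''.join(chunks)
    print 'got %d bytes' % len(html)

Note that with HTTPS, data that has already been decrypted and buffered
inside the SSL layer is not visible to select(), so a select()-based loop is
only a partial safeguard there.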