urllib2 and threading

Fri May 1 18:57:09 EDT 2009

>>>>> robean <st1999 at gmail.com> (R) wrote:

>R> def get_info_from_url(url):
>R>   """ A dummy version of the function simply visits urls and prints
>R> the url of the page. """
>R>   try:
>R>     page = urllib2.urlopen(url)
>R>   except urllib2.URLError, e:
>R>     print "**** error ****", e.reason
>R>   except urllib2.HTTPError, e:
>R>     print "**** error ****", e.code

There's a problem here. HTTPError is a subclass of URLError, so its
except clause should come first. Otherwise an HTTPError (like a 404 Not
Found) is caught by the "except URLError" clause, but it has no reason
attribute, so you get an AttributeError inside the except clause and
the thread crashes.
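
Something like the following shows the reordered clauses. It is only a
sketch under a few assumptions: it is written for Python 3, where
urllib2's error classes live in urllib.error and urlopen in
urllib.request (on Python 2 the same ordering applies to
urllib2.HTTPError and urllib2.URLError); it returns a string instead of
printing; and the opener parameter is a hypothetical addition of mine so
the function can be tried without network access. Python 3's HTTPError
does carry a reason attribute, but the subclass still has to come first
for the HTTP branch to be reachable at all.

```python
import urllib.error
import urllib.request


def get_info_from_url(url, opener=urllib.request.urlopen):
    """Visit url and return its final URL, or an error description.

    `opener` is a hypothetical hook (not in the original post) that
    lets a stand-in for urlopen be passed in.
    """
    try:
        page = opener(url)
    except urllib.error.HTTPError as e:
        # Subclass first: the server answered, but with an error status.
        return "**** error **** HTTP %d" % e.code
    except urllib.error.URLError as e:
        # Base class second: no usable response (DNS failure, refused
        # connection, and so on).
        return "**** error **** %s" % e.reason
    try:
        return page.geturl()
    finally:
        page.close()
```

With the clauses in this order, a 404 ends up in the HTTPError branch
and everything else falls through to URLError, so neither handler
touches an attribute the exception might not have.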
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org