[Baypiggies] urllib2.urlopen() and exception layering

Keith Dart keith at dartworks.biz
Sun Apr 6 08:32:09 CEST 2008

On Sat, 5 Apr 2008, David Cramer wrote:

> I've had that issue with a number of the Python built-in modules. Most of
> them have very poor exception handling, and HTTP is probably one of the
> worst I've dealt with.

I believe, as a general rule, you should not intercept exceptions and
re-raise them because that makes debugging harder. It can mask the real
problem.

But I suppose it depends on the nature of the exact module or problem, and 
how well the re-raising function reports the underlying error (if at all), 
and whether or not it makes available the original traceback object.

One way to do that:

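try:
   do_request()  # placeholder for whatever call may raise IOError
except IOError:
   ex, val, tb = sys.exc_info()  # requires "import sys"
   raise ProtocolError, val, tb  # three-argument raise keeps the traceback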

This way the consumer of the module can get a consistent exception object,
but still debug it with the real traceback. Are there any downsides to
this approach?
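
For anyone reading this on Python 3, where the three-argument raise no
longer exists, the same idea might be sketched roughly as below. This is
only an illustration: ProtocolError, low_level_read, fetch and
failing_frames are made-up names, not part of urllib2 or any real module.

```python
import traceback


class ProtocolError(Exception):
    """Hypothetical module-level exception (an assumption for this sketch)."""


def low_level_read():
    # Stand-in for the real I/O call that fails.
    raise IOError("connection reset")


def fetch():
    try:
        low_level_read()
    except IOError as err:
        # Python 3 spelling of "raise ProtocolError, val, tb":
        # attach the original traceback to the new exception.
        raise ProtocolError(str(err)).with_traceback(err.__traceback__)


def failing_frames():
    """Return the function names recorded in the traceback fetch() raises."""
    try:
        fetch()
    except ProtocolError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []
```

The caller only ever sees ProtocolError, yet the innermost traceback entry
still points at low_level_read, where the failure really happened. Python 3
also offers "raise ProtocolError(...) from err", which keeps the original
exception reachable as __cause__ instead.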

> On Sat, Apr 5, 2008 at 2:48 PM, Niall O'Higgins <niallo at unworkable.org>
> wrote:
>> Hi,
>> I have written a number of long-running Python programs which make heavy
>> use of urllib2.urlopen() to fetch HTTP URLs in a convenient manner.  While
>> the documentation states that this function "Raises URLError on errors", I
>> have found this to be incorrect.  So far I have encountered the following
>> exceptions being thrown by urllib2.urlopen() on HTTP URLs:
>>    * urllib2.HTTPError
>>    * urllib2.URLError
>>    * httplib.BadStatusLine
>>    * httplib.InvalidURL
>>    * ValueError
>>    * IOError
>> Looking at the urllib2 module source, it is unclear to me whether the
>> intention is that all httplib errors should be caught and raised as
>> URLError or whether the programmer is expected to handle the underlying
>> exceptions himself.  For example, at least socket.error is caught by
>> urllib2.urlopen() and raised as a URLError.  The comment in the code block
>> suggests some confusion:
>>        try:
>>            h.request(req.get_method(), req.get_selector(), req.data,
>>                      headers)
>>            r = h.getresponse()
>>        except socket.error, err: # XXX what error?
>>            raise URLError(err)
>> I think this is a problem which needs to be addressed, at the very least
>> through clearer documentation, and possibly by improving urllib2 to handle
>> more of these exceptions and raise them as URLError.
>> I'm new to the Python development community, but would be happy to submit
>> a patch if there is some consensus on which approach to take.
>> Thanks!
>> --
>> Niall O'Higgins
>> Software Enthusiast
>> http://niallohiggins.com
> --
> David Cramer
> Director of Technology
> iBegin
> http://www.ibegin.com/

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Keith Dart <keith at dartworks.biz>
    public key: ID: 19017044
