[Baypiggies] urllib2.urlopen() and exception layering

Sun Apr 6 08:32:09 CEST 2008

On Sat, 5 Apr 2008, David Cramer wrote:

> I've had that issue with a number of the Python built-in modules. Most of
> them have very poor exception handling, and HTTP is probably one of the
> worst I've dealt with.

I believe, as a general rule, you should not intercept exceptions and
re-raise them as a different type, because that makes debugging harder:
it can mask the real problem.

But I suppose it depends on the nature of the exact module or problem, and 
how well the re-raising function reports the underlying error (if at all), 
and whether or not it makes available the original traceback object.

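One way to keep the original traceback when re-raising:

import sys

try:
   SomeOperation()
except IOError:
   # Only the exception class changes; value and traceback are preserved.
   ex, val, tb = sys.exc_info()
   raise ProtocolError, val, tb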

This way the consumer of the module can get a consistent exception object, 
but still debug it with the real traceback. Are there any downsides to 
this?
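
For anyone reading this on Python 3, where the three-argument raise no
longer exists, here is a minimal sketch of the same idea. ProtocolError,
some_operation() and fetch() are made-up names for illustration, and this
is only one way to spell it, not a recommendation from any library:

```python
import sys
import traceback


class ProtocolError(Exception):
    """Hypothetical module-level exception; not part of any real library."""


def some_operation():
    # Stand-in for a low-level call that fails.
    raise IOError("connection reset")


def fetch():
    try:
        some_operation()
    except IOError:
        ex, val, tb = sys.exc_info()
        # Rough Python 3 spelling of "raise ProtocolError, val, tb":
        # new exception class, original message and traceback. "from val"
        # additionally records the original exception as __cause__.
        raise ProtocolError(str(val)).with_traceback(tb) from val


def failing_frames():
    """Return the function names in the traceback of the re-raised error."""
    try:
        fetch()
    except ProtocolError as err:
        return [f.name for f in traceback.extract_tb(err.__traceback__)]
    return []
```

The list returned by failing_frames() still includes some_operation, so the
consumer sees one exception type but can trace the failure back to where it
really happened.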

>
> On Sat, Apr 5, 2008 at 2:48 PM, Niall O'Higgins <niallo at unworkable.org>
> wrote:
>
>> Hi,
>>
>> I have written a number of long-running Python programs which make heavy
>> use of urllib2.urlopen() to fetch HTTP URLs in a convenient manner.  While
>> the documentation states that this function "Raises URLError on errors", I
>> have found this to be incorrect.  So far I have encountered the following
>> exceptions being thrown by urllib2.urlopen() on HTTP URLs:
>>
>>    * urllib2.HTTPError
>>    * urllib2.URLError
>>    * httplib.BadStatusLine
>>    * httplib.InvalidURL
>>    * ValueError
>>    * IOError
>>
>> Looking at the urllib2 module source, it is unclear to me whether the
>> intention is that all httplib errors should be caught and raised as
>> URLError or whether the programmer is expected to handle the underlying
>> exceptions himself.
>>
>> For example, at least socket.error is caught by urllib2.urlopen() and
>> raised as a URLError.  The comment in the code block suggests some
>> confusion:
>>
>>        try:
>>            h.request(req.get_method(), req.get_selector(), req.data,
>>                      headers)
>>            r = h.getresponse()
>>        except socket.error, err: # XXX what error?
>>            raise URLError(err)
>>
>> I think this is a problem which needs to be addressed, at the very least
>> through clearer documentation, and possibly by improving urllib2 to handle
>> more of these exceptions and raise them as URLError.
>>
>> I'm new to the Python development community, but would be happy to submit
>> a patch if there is some consensus on which approach to take.
>>
>> Thanks!
>>
>> --
>> Niall O'Higgins
>> Software Enthusiast
>> http://niallohiggins.com
>> _______________________________________________
>> Baypiggies mailing list
>> Baypiggies at python.org
>> To change your subscription options or unsubscribe:
>> http://mail.python.org/mailman/listinfo/baypiggies
>>
>
>
>
> --
> David Cramer
> Director of Technology
> iBegin
> http://www.ibegin.com/
>
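
Until the library or its documentation settles the question, the exception
list in Niall's message can be handled in one place by the caller. A rough
sketch, assuming the Python 3 module names (urllib.request, urllib.error and
http.client in place of urllib2 and httplib); FetchError, FETCH_ERRORS and
safe_urlopen() are hypothetical names, not anything the standard library
provides:

```python
import http.client
import urllib.error
import urllib.request


class FetchError(Exception):
    """Hypothetical single exception type for callers to catch."""


# URLError also covers HTTPError; HTTPException covers BadStatusLine
# and InvalidURL.
FETCH_ERRORS = (urllib.error.URLError, http.client.HTTPException,
                ValueError, IOError)


def safe_urlopen(url, opener=urllib.request.urlopen):
    """Open url, turning the known failure types into FetchError."""
    try:
        return opener(url)
    except FETCH_ERRORS as err:
        # "from err" keeps the original exception and its traceback
        # attached as __cause__ for debugging.
        raise FetchError("%s: %r" % (url, err)) from err
```

A long-running program can then wrap each fetch in "except FetchError" and
still inspect err.__cause__ when it needs the underlying error. Catching
ValueError this broadly is a judgment call and could hide unrelated bugs.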

-- 
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Keith Dart <keith at dartworks.biz>
    public key: ID: 19017044
    <http://www.dartworks.biz/>
    =====================================================================