Exception as the primary error handling mechanism?

Fri Jan 1 03:26:16 EST 2010

On Thu, 31 Dec 2009 20:47:49 -0800, Peng Yu wrote:

> I observe that python library primarily use exception for error handling
> rather than use error code.
> 
> In the article API Design Matters by Michi Henning
> 
> Communications of the ACM
> Vol. 52 No. 5, Pages 46-56
> 10.1145/1506409.1506424
> http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext
> 
> It says "Another popular design flaw—namely, throwing exceptions for
> expected outcomes—also causes inefficiencies because catching and
> handling exceptions is almost always slower than testing a return
> value."

This is very, very wrong.

Firstly, notice that the author doesn't compare the same thing. He 
compares "catching AND HANDLING" the exception (emphasis added) with 
*only* testing a return value. Of course it is faster to test a value and 
do nothing, than it is to catch an exception and then handle the 
exception. That's an unfair comparison, and that alone shows that the 
author is biased against exceptions.

But it's also wrong. If you call a function one million times, and catch 
an exception ONCE (because exceptions are rare) that is likely to be 
much, much faster than testing a return code one million times.

Before you can talk about which strategy is faster, you need to 
understand your problem. When exceptions are rare (in CPython, about one 
in ten or rarer) then try...except is faster than testing each time. The 
exact cut-off depends on how expensive the test is, and how much work 
gets done before the exception is raised. Using exceptions is only slow 
if they are common.

But the most important reason for preferring exceptions is that the 
alternatives are error-prone! Testing error codes is the anti-pattern, 
not catching exceptions.

See, for example:

http://c2.com/cgi/wiki?UseExceptionsInsteadOfErrorValues
http://c2.com/cgi/wiki?ExceptionsAreOurFriends
http://c2.com/cgi/wiki?AvoidExceptionsWheneverPossible

Despite the title of that last page, it has many excellent arguments for 
why exceptions are better than the alternatives.

(Be warned: the c2 wiki is filled with Java and C++ programmers who 
mistake the work-arounds for quirks of their language as general design 
principles. For example, because exceptions in Java are even more 
expensive and slow than in Python, you will find lots of Java coders 
saying "don't use exceptions" instead of "don't use exceptions IN JAVA".)

There are many problems with using error codes:

* They complicate your code. Instead of returning the result you care 
about, you have to return both a status code and the result you care 
about. Even worse is to have a single global variable to hold the status 
of the last function call!

* Nobody can agree whether the status code means the function call 
failed, or the function call succeeded.

* If the function call failed, what do you return as the result?

* You can't be sure that the caller will remember to check the status 
code. In fact, you can be sure that the caller WILL forget sometimes! 
(This is human nature.) This leads to the frequent problem that by the 
time a caller checks the status code, the original error has been lost 
and the program is working with garbage.

* Even if you remember to check the status code, it complicates the code, 
makes it less readable, confuses the intent of the code, and often leads 
to the Arrow Anti-pattern: http://c2.com/cgi/wiki?ArrowAntiPattern

That last argument is critical. Exceptions exist to make correct code 
easier to write, understand and maintain.
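As a minimal sketch of the "caller forgets to check" problem from the 
list above: the names parse_port_code and parse_port are hypothetical, 
invented for illustration only, and are not taken from any library.

```python
# Hypothetical helpers, invented for illustration only.

def parse_port_code(text):
    """Error-code style: return (status, value); status 0 means success."""
    if text.isdigit():
        return 0, int(text)
    return -1, 0  # on failure the caller still receives *some* value


def parse_port(text):
    """Exception style: return the value, or raise ValueError."""
    if not text.isdigit():
        raise ValueError("not a port number: %r" % (text,))
    return int(text)


# A caller who forgets to look at the status carries on with garbage:
status, port = parse_port_code("http")
garbage = port + 1  # no complaint; the failure has been silently lost

# With exceptions, the same mistake cannot go unnoticed:
try:
    parse_port("http")
    noticed = False
except ValueError:
    noticed = True
```

In the first version nothing forces the caller to look at status, so the 
bogus value flows onward. In the second, ignoring the failure takes a 
deliberate try...except.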

Python uses special result codes in at least two places:

str.find(s) returns -1 if s is not in the string
re.match() returns None if the regular expression fails to match

Both of these are error-prone. Consider a naive way of getting the 
fractional part of a float string:

>>> s = "234.567"
>>> print s[s.find('.')+1:]
567

But see:

>>> s = "234"
>>> print s[s.find('.')+1:]
234

You need something like:

p = s.find('.')
if p == -1:
    print ''
else:
    print s[p+1:]
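
For comparison, here is a sketch of the same task built on str.index, 
which raises ValueError instead of returning -1. The function names are 
hypothetical, chosen only for this example.

```python
def fractional_part(s):
    """Return the digits after the decimal point of a float string.

    str.index raises ValueError when '.' is absent, so a missing
    decimal point can never be mistaken for a valid position.
    """
    return s[s.index('.') + 1:]


def fractional_part_or_empty(s):
    """Same, but treat a missing decimal point as an empty fraction."""
    try:
        return fractional_part(s)
    except ValueError:
        return ''
```

With these, fractional_part("234.567") gives '567', while 
fractional_part("234") raises ValueError rather than quietly returning 
'234'.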

Similarly, we cannot safely do this in Python:

>>> re.match(r'\d+', '123abcd').group()
'123'
>>> re.match(r'\d+', 'abcd').group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

You need to do this:

mo = re.match(r'\d+', '123abcd')
if mo is not None:  # or just `if mo` will work
    mo.group()
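
One way to get exception behaviour here is a small wrapper. This is only 
a sketch: match_or_raise is a hypothetical helper, not part of the re 
module, and the choice of ValueError is an assumption.

```python
import re


def match_or_raise(pattern, text):
    """Like re.match, but raise ValueError instead of returning None."""
    mo = re.match(pattern, text)
    if mo is None:
        raise ValueError("%r does not match %r" % (text, pattern))
    return mo
```

Now match_or_raise(r'\d+', '123abcd').group() gives '123', and a failed 
match reports what went wrong instead of an AttributeError about 
NoneType.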

Exceptions are about making it easier to have correct code. They're also 
about making it easier to have readable code. Which is easier to read, 
easier to understand and easier to debug?

x = function(1, 2, 3)
if x != -1:
    y = function(x, 1, 2)
    if y != -1:
        z = function(y, x, 1)
        if z != -1:
            print "result is", z
        else:
            print "an error occurred"
    else:
        print "an error occurred"
else:
    print "an error occurred"

versus:

try:
    x = function(1, 2, 3)
    y = function(x, 1, 2)
    print "result is", function(y, x, 1)
except ValueError:
    print "an error occurred"
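
To try the second form for real, here is a self-contained sketch. The 
body of function is a made-up stand-in (it raises ValueError when its 
result would be negative), and compute is a hypothetical wrapper that 
returns the message instead of printing it.

```python
def function(a, b, c):
    """Hypothetical stand-in: a*b - c, refusing negative results."""
    result = a * b - c
    if result < 0:
        raise ValueError("negative result")
    return result


def compute(start):
    """Chain three calls; one handler covers a failure at any step."""
    try:
        x = function(start, 2, 3)
        y = function(x, 1, 2)
        return "result is %d" % function(y, x, 1)
    except ValueError:
        return "an error occurred"
```

Here compute(1) fails at the first call and compute(2) at the second, 
and both end up in the same except clause. compute(3) runs all three 
calls and returns 'result is 2'.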

In Python, setting up the try...except block is very fast, about as fast 
as a plain "pass" statement, but actually catching the exception is quite 
slow. So let's compare str.find (which returns an error result) and 
str.index (which raises an exception):

>>> from timeit import Timer
>>> setup = "source = 'abcd'*100 + 'e'"
>>> min(Timer("p = source.index('e')", setup).repeat())
1.1308379173278809
>>> min(Timer("p = source.find('e')", setup).repeat())
1.2237567901611328

There's hardly any difference at all, and in fact index is slightly 
faster. But what about if there's an exceptional case?

>>> min(Timer("""
... try:
...     p = source.index('z')
... except ValueError:
...     pass
... """, setup).repeat())
3.5699808597564697
>>> min(Timer("""
... p = source.find('z')
... if p == -1:
...     pass
... """, setup).repeat())
1.7874350070953369

So in Python, catching the exception is slower, in this case about twice 
as slow. But remember that the "if p == -1" test is not free. It might be 
cheap, but it does take time. If you call find() enough times, and every 
single time you then test the result returned, those tests may add up to 
more than the cost of catching a rare exception.

The general rule in Python is:

* if the exceptional event is rare (say, on average, less than about one 
time in ten) then use a try...except and catch the exception;

* but if it is very common (more than one time in ten) then it is faster 
to do a test.
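
Where the cut-off falls on a given machine and Python version is 
something to measure rather than assume. The following is a rough sketch 
of such a measurement; the helper names (with_try, with_test, 
make_targets, compare) and the default sizes are all assumptions made 
for this example.

```python
from timeit import Timer

source = 'abcd' * 100 + 'e'


def with_try(source, targets):
    """Count the targets found, using str.index and try...except."""
    hits = 0
    for t in targets:
        try:
            source.index(t)
            hits += 1
        except ValueError:
            pass
    return hits


def with_test(source, targets):
    """Count the targets found, using str.find and a test each time."""
    hits = 0
    for t in targets:
        if source.find(t) != -1:
            hits += 1
    return hits


def make_targets(n, miss_every):
    """Build n targets; every miss_every-th one ('z') is not in source."""
    return ['z' if i % miss_every == 0 else 'e' for i in range(n)]


def compare(miss_every, n=1000, number=20):
    """Return (try_time, test_time) for the given failure rate."""
    targets = make_targets(n, miss_every)
    t_try = min(Timer(lambda: with_try(source, targets)).repeat(3, number))
    t_test = min(Timer(lambda: with_test(source, targets)).repeat(3, number))
    return t_try, t_test
```

Calling compare(2), compare(10) and compare(1000) gives timings for 
failures one time in two, ten and a thousand. Both functions return the 
same count, so only the error-handling strategy differs.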

> My observation is contradicted to the above statement by Henning. If my
> observation is wrong, please just ignore my question below.
> 
> Otherwise, could some python expert explain to me why exception is
> widely used for error handling in python? Is it because the efficiency
> is not the primary goal of python?

Yes.

Python's aim is to be fast *enough*, without necessarily being as fast as 
possible.

Python aims to be readable, and to make it easy to write correct, 
bug-free code.

-- 
Steven