[Python-3000] removing exception .args

Andrew Dalke dalke at dalkescientific.com
Sat Jul 21 16:23:45 CEST 2007


Posting here an expansion of a short discussion I had
after Guido's keynote at EuroPython.  In this email
I propose eliminating the ".args" attribute from the
Exception type.  It's not useful, and supporting it
correctly is complicated enough that it's often not
supported correctly.



In Python 2 the base Exception class works like this

 >>> x = Exception("spam", "was", "here")
 >>> x[0]
'spam'
 >>> x.args
('spam', 'was', 'here')
 >>>

In Py3K the [0] index lookup disappears.  This is a
good thing.  Positional lookup like this is rarely useful.

The .args attribute remains.  I don't see the need for
it and propose that it be removed in Py3K.

Why?  The "args" attribute is not useful.  People making
non-trivial Exception subclasses often forget to call
__init__ on the parent exception, and using attribute
lookups is much better than using an index lookup.  That's
the experience of the os.stat call, whose result grew named
attributes (st_size, st_mtime, ...) alongside the old tuple
indices.

Having support for a single object (almost always a
string) passed into the exception is pragmatically useful,
so I think the base exception class should look like

class Exception(object):
   msg = None
   def __init__(self, msg):
     self.msg = msg
   def __str__(self):
     # No message given (or __init__ never called): show just the name
     if self.msg is None:
       return "%s()" % (self.__class__.__name__,)
     else:
       return "%s(%r)" % (self.__class__.__name__, self.msg)
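
As a rough sketch of how a subclass might sit on top of a base class
like this: ProposedBase and BadMonth are hypothetical names used only
for illustration, the base derives from Exception so that it can be
raised, and I assume a default of msg=None so the base can be created
without a message.  The None test is written so that a missing message
prints as just the class name.

```python
# Hypothetical stand-in for the proposed base class; not a real Python API.
class ProposedBase(Exception):
    msg = None  # class-level default covers subclasses that skip __init__

    def __init__(self, msg=None):
        self.msg = msg

    def __str__(self):
        if self.msg is None:
            return "%s()" % (self.__class__.__name__,)
        return "%s(%r)" % (self.__class__.__name__, self.msg)


# A subclass that stores a named attribute and never calls the base __init__.
class BadMonth(ProposedBase):
    def __init__(self, month):
        self.month = month


plain = ProposedBase("spam")
forgetful = BadMonth(13)
print(str(plain))      # ProposedBase('spam')
print(str(forgetful))  # BadMonth()
```

Even though BadMonth forgets the parent __init__, str() still works
because the class attribute supplies the default.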

**

The rest of this email is because I'm detail oriented
and present evidence to back up my assertion.

Because a number of subclasses should call the base __init__
but don't, generic error reporting software can't rely on the
"args protocol" for anything.  Pretty much the only thing a
generic error report mechanism (like traceback and logging)
can do is call str() on the exception.
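
A minimal sketch of what I mean by a generic reporter that leans only
on str().  The names describe and IllegalMonth are made up for this
example; IllegalMonth deliberately skips the base __init__.

```python
def describe(exc):
    """Build a one-line report using only the class name and str()."""
    return "%s: %s" % (type(exc).__name__, str(exc))


class IllegalMonth(ValueError):
    # Hypothetical exception that never calls the base __init__.
    def __init__(self, month):
        self.month = month

    def __str__(self):
        return "bad month number %r; must be 1-12" % self.month


try:
    raise IllegalMonth(13)
except Exception as exc:
    report = describe(exc)

print(report)
```

The reporter never touches .args or .msg, so it keeps working no
matter how the subclass was written.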


Here are some examples to show that some exceptions in the
standard library don't do a good job of calling the base
class __init__.

   (in HTMLParser.py)

class HTMLParseError(Exception):
     """Exception raised for all parse errors."""

     def __init__(self, msg, position=(None, None)):
         assert msg
         self.msg = msg
         self.lineno = position[0]
         self.offset = position[1]

    (in calendar.py)

# Exceptions raised for bad input
class IllegalMonthError(ValueError):
     def __init__(self, month):
         self.month = month
     def __str__(self):
         return "bad month number %r; must be 1-12" % self.month


    (in doctest.py)

class DocTestFailure(Exception):
     ...
     def __init__(self, test, example, got):
         self.test = test
         self.example = example
         self.got = got

     def __str__(self):
         return str(self.test)


Eyeballing the numbers, I think about 1/3rd of the
standard library Exception subclasses with an __init__
forget to call the base class __init__ and never set
.args or .msg.


For better readability and maintainability, complex
exceptions with multiple parameters should make those
parameters accessible via attributes, and not expect
clients to reach into the args list by position.  All
three classes I just listed defined a new __init__
so that the parameters were available by name.
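
To make the readability difference concrete, here is a small sketch
comparing the two styles.  ParseError is a hypothetical class modeled
loosely on the ones above; unlike them it also passes its parameters to
the base __init__ so that the positional form has something to index.

```python
class ParseError(Exception):
    # Hypothetical exception, for illustration only.
    def __init__(self, msg, position=(None, None)):
        Exception.__init__(self, msg, position)
        self.msg = msg
        self.lineno = position[0]
        self.offset = position[1]


err = ParseError("unexpected tag", (12, 4))

# Positional access: the reader has to know the layout of args.
line_by_position = err.args[1][0]

# Attribute access: the name says what the value is.
line_by_name = err.lineno
```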


Here's an exception which does the right thing under
Python 2.  By that I mean that it fully implements
the exception API and it makes the parameters available
as named attributes.  It also protects against
subclasses which forget to call GetoptError.__init__
by defining class attributes.

    (from getopt.py )

class GetoptError(Exception):
     opt = ''
     msg = ''
     def __init__(self, msg, opt=''):
         self.msg = msg
         self.opt = opt
         Exception.__init__(self, msg, opt)

     def __str__(self):
         return self.msg

This is correct, but cumbersome.  Why should we
encourage all non-trivial subclasses to look like this?
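
A sketch of the class-attribute safeguard in action.  OptionError
follows the same pattern as GetoptError but is a hypothetical copy, and
ForgetfulError is a made-up subclass that skips the parent __init__ on
purpose.

```python
class OptionError(Exception):
    # Hypothetical class following the GetoptError pattern.
    opt = ''
    msg = ''

    def __init__(self, msg, opt=''):
        self.msg = msg
        self.opt = opt
        Exception.__init__(self, msg, opt)

    def __str__(self):
        return self.msg


class ForgetfulError(OptionError):
    # Deliberately does not call OptionError.__init__.
    def __init__(self, detail):
        self.detail = detail


err = ForgetfulError("oops")
text = str(err)  # falls back to the class-level msg instead of failing
```

The defaults keep str() from raising AttributeError, at the cost of
every such class having to repeat each parameter three times.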



Historically there has been a problem with the existing
".args".  The base class implementation of __str__ required
that that attribute be present.  This changed some time
between 2.3 and 2.5.

This change invalidated comments like this in httplib.py

class HTTPException(Exception):
     # Subclasses that define an __init__ must call Exception.__init__
     # or define self.args.  Otherwise, str() will fail.
     pass

which later on hacks around not calling __init__ by doing this

class UnknownProtocol(HTTPException):
     def __init__(self, version):
         self.args = version,
         self.version = version


One last existing example to point out.  urllib2.py uses

class URLError(IOError):
     # URLError is a sub-type of IOError, but it doesn't share any of
     # the implementation.  need to override __init__ and __str__.
     # It sets self.args for compatibility with other EnvironmentError
     # subclasses, but args doesn't have the typical format with errno in
     # slot 0 and strerror in slot 1.  This may be better than nothing.
     def __init__(self, reason):
         self.args = reason,
         self.reason = reason

     def __str__(self):
         return '<urlopen error %s>' % self.reason

Again, a hack. This time a hack because EnvironmentError
wants an errno and an errorstring.

 >>> EnvironmentError(2,"This is an error message","sp")
EnvironmentError(2, 'This is an error message')
 >>> err = EnvironmentError(2,"This is an error message","sp")
 >>> err.errno
2
 >>> err.strerror
'This is an error message'
 >>> err.filename
'sp'
 >>>

(Note the small bug; the filename is not shown in the repr above.)
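
For comparison, a handler for this specific type can read the named
attributes directly.  This is only a sketch: the path is a made-up name
inside a fresh temporary directory and is assumed not to exist, and I
assume a Python where EnvironmentError is available, either under that
name or as an alias for OSError.

```python
import errno
import os
import tempfile

# A path inside a new temporary directory, so it is assumed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "no-such-file")

try:
    open(missing)
except EnvironmentError as err:
    # Named attributes, no knowledge of the args layout needed.
    caught = (err.errno, err.filename)
```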


In closing, given an arbitrary exception, the only thing you can
hope might work is str(exception).  There's a decent chance that
.args and even .msg aren't present.  Generic exception handling
code cannot expect those attributes to exist, and handlers for
specific types should use named attributes rather than the less
readable/less maintainable positional lookups.

Python3K is allowed to be non-backwards compatible.  I
propose getting rid of this useless feature.

				Andrew
				dalke at dalkescientific.com



