[Python-Dev] Avoid formatting an error message on attribute error

Brett Cannon brett at python.org
Thu Nov 7 19:44:39 CET 2013


On Thu, Nov 7, 2013 at 7:41 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

>
> On 7 Nov 2013 21:34, "Victor Stinner" <victor.stinner at gmail.com> wrote:
> >
> > 2013/11/7 Steven D'Aprano <steve at pearwood.info>:
> > > My initial instinct here was to say that sounded like premature
> > > optimization, but to my surprise the overhead of generating the error
> > > message is actually significant -- at least from pure Python 3.3 code.
> >
> > I ran a quick and dirty benchmark by replacing the error message with
> > None.
> >
> > Original:
> >
> > $ ./python -m timeit 'hasattr(1, "y")'
> > 1000000 loops, best of 3: 0.354 usec per loop
> > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
> > 1000000 loops, best of 3: 0.471 usec per loop
> >
> > Patched:
> >
> > $ ./python -m timeit 'hasattr(1, "y")'
> > 10000000 loops, best of 3: 0.106 usec per loop
> > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
> > 10000000 loops, best of 3: 0.191 usec per loop
> >
> > hasattr() is 3.3x faster and try/except is 2.4x faster on this
> > micro-benchmark.
> >
> > > Given that, I wonder whether it would be worth coming up with a more
> > > general solution to the question of lazily generating error messages
> > > rather than changing AttributeError specifically.
> >
> > My first question is about keeping strong references to objects (type
> > object for AttributeError). Is it an issue? If it is an issue, it's
> > maybe better to not modify the code :-)
> >
> >
> > Yes, the lazy formatting idea can be applied to various other
> > exceptions. For example, the TypeError message is usually built using
> > PyErr_Format() to mention the name of the invalid type. Example:
> >
> >         PyErr_Format(PyExc_TypeError,
> >                      "exec() arg 2 must be a dict, not %.100s",
> >                      globals->ob_type->tp_name);
> >
> > But it's not easy to store arbitrary C types for PyUnicode_FromFormat()
> > parameters. Argument types can be char*, Py_ssize_t, PyObject*, int,
> > etc.
> >
> > I proposed to modify (first/only) AttributeError, because it is more
> > common to ignore the AttributeError than other errors like TypeError.
> > (TypeError or UnicodeDecodeError are usually unexpected.)
> >
> > >> It would be nice to only format the message on demand. The
> > >> AttributeError would keep a reference to the type.
> > >
> > > Since only the type name is used, why keep a reference to the type
> > > instead of just type.__name__?
> >
> > In C, type.__name__ does not exist as such; the name is a char*
> > (type->tp_name). If the type object is destroyed, type->tp_name becomes
> > an invalid pointer. So AttributeError should keep a reference to the
> > type object.
> >
> > >> AttributeError.args would be (type, attr) instead of (message,).
> > >> ImportError was also modified to add a new "name" attribute.
>
> The existing signature continued to be supported, though.
>
> > >
> > > I don't like changing the signature of AttributeError. I've got code
> > > that raises AttributeError explicitly.
> >
> > The constructor may support different signatures for backward
> > compatibility: AttributeError(message: str) and AttributeError(type:
> > type, attr: str).
> >
> > I'm asking if anyone relies on AttributeError.args attribute.
>
> The bigger problem is you can't change the constructor signature in a
> backwards incompatible way. You would need a new class method as an
> alternative constructor instead, or else use optional parameters.
>

The optional parameter approach is the one ImportError took for introducing
its `name` and `path` attributes, so there is precedent. Currently it
doesn't use a default exception message for backwards-compatibility, but
adding one wouldn't be difficult technically.
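
To make the ImportError precedent concrete, here is a small sketch for Python
3.3 or later. The module name and path are made up for illustration; only the
`name` and `path` keyword arguments come from ImportError itself.

```python
# ImportError accepts optional keyword-only `name` and `path` arguments
# (added in Python 3.3) alongside the traditional positional message.
exc = ImportError("cannot import spam", name="spam", path="/tmp/spam.py")

# The extra data is exposed as attributes ...
assert exc.name == "spam"
assert exc.path == "/tmp/spam.py"

# ... while the old calling convention and `args` are unchanged.
assert exc.args == ("cannot import spam",)
old_style = ImportError("cannot import spam")
assert old_style.name is None and old_style.path is None

# No default message is generated from `name` when none is passed.
no_message = ImportError(name="spam")
assert str(no_message) == ""
```

The last two lines show the backwards-compatibility point: without an explicit
message, the exception currently stringifies to an empty string.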

In the case of AttributeError, what you could do is follow the suggestion I
made in http://bugs.python.org/issue18156 and add an `attr` keyword-only
argument and corresponding attribute to AttributeError. If you then add
whatever other keyword arguments are needed to generate a good error
message (`object` so you have the target, or at least get the string name
to prevent gc issues for large objects?) you could construct the message
lazily in the __str__ method very easily while also doing away with
inconsistencies in the message which should make doctest users happy. Lazy
message creation through __str__ does leave the message out of `args`,
though.
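
A rough pure-Python sketch of what I mean, written as a subclass so it can be
tried today. The class name `LazyAttributeError` and the `type_name` keyword
are hypothetical, chosen only for this illustration; `attr` is the keyword
suggested in the issue. Storing the type's name as a string rather than the
object itself is one way to avoid keeping the target alive.

```python
class LazyAttributeError(AttributeError):
    """Hypothetical sketch: an AttributeError that formats its message lazily."""

    def __init__(self, *args, attr=None, type_name=None):
        # Positional arguments keep their existing meaning (the message).
        super().__init__(*args)
        # Keyword-only extras; `type_name` is an assumed name for this sketch.
        self.attr = attr
        self.type_name = type_name

    def __str__(self):
        # An explicit message always wins, as with the existing signature.
        if self.args:
            return super().__str__()
        if self.attr is None:
            return ""
        # Only now is the string formatting cost paid.
        if self.type_name is None:
            return "object has no attribute %r" % (self.attr,)
        return "%r object has no attribute %r" % (self.type_name, self.attr)


lazy = LazyAttributeError(attr="y", type_name=type(1).__name__)
legacy = LazyAttributeError("custom message")
```

Here `str(lazy)` is built only when requested and `lazy.args` stays empty,
which is the trade-off mentioned above, while `legacy` behaves exactly like a
plain AttributeError raised with a message.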

In a perfect world (Python 4 maybe?) BaseException would take a single
argument which would be an optional message, `args` wouldn't exist, and
people would call `str(exc)` to get the message for the exception. That would
allow subclasses to expand the API with keyword-only arguments to carry
extra info and have reasonable default messages that were built on-demand
when __str__ was called. It would also keep `args` from just being a
dumping ground of stuff that has no structure except by calling convention
(which is not how to do an API; explicit > implicit and all). IOW the
original dream of PEP 352 (
http://python.org/dev/peps/pep-0352/#retracted-ideas).