[Patches] Patch: AttributeError and NameError: second attempt.

Nick Mathewson nickm@MIT.EDU
Fri, 26 May 2000 05:47:33 -0400


"M.-A. Lemburg" <mal@lemburg.com> wrote:
 [...]
>The problem with storing objects in error objects is that they
>can easily create (well) hidden circular references and cause
>unwanted side-effects: 
>
>1. A NameError that is being catched will
>have a reference to the frame which currently executes (and this
>references the variable holding the error object) -- the frame
>will stay alive forever if the error object is not properly
>cleaned up.

Do you mean 'caught', or 'cached'?  Catching is no problem, I think;
I've experimented, and not been able to generate reference leaks with
NameErrors.  Caching is a problem, however.  Perhaps rather than
incorporating the frame as part of the error object, we should just
pull it from the traceback when we try to print the error.

The only way that the error object won't get 'cleaned up', if I
understand it, is if somebody stores a copy of the exception object
for future reference.  I haven't seen a lot of code that does this
with NameError, but I agree it would be a problem if anybody does
this now.
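
To make the caching case concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, where the exception object itself carries its traceback in `__traceback__`; this is not the layout of the interpreters discussed in this thread. `Payload` and `failing` are made-up names for illustration. The stored exception keeps the failing frame, and so that frame's locals, alive until the traceback is dropped.

```python
import gc
import weakref


class Payload:
    """Stand-in for some large object held in a local variable."""


def failing():
    payload = Payload()
    probe = weakref.ref(payload)  # lets us observe when payload is freed
    try:
        undefined_name  # not defined anywhere, so this raises NameError
    except NameError as exc:
        # Hand the exception out of the handler, i.e. "cache" it.
        return exc, probe


cached, probe = failing()
# The cached exception references its traceback, the traceback references
# the frame of failing(), and that frame still holds `payload`.
alive_while_cached = probe() is not None

# Dropping the traceback releases the frame and everything in it.
cached.__traceback__ = None
gc.collect()
alive_after_clearing = probe() is not None

print(alive_while_cached, alive_after_clearing)
```

An exception that is caught and then discarded at the end of the handler never reaches this state, which matches the distinction between catching and caching drawn above.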

>2. An AttributeError carrying around the object that caused
>the error could cause finally-clauses to fail due to some
>resource being kept alive or open.

Hm.  This is a valid concern.  I'm not sure what to do about it.
Does anybody have any good suggestions?
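
The concern can be reproduced with a small sketch, again assuming a current Python 3 interpreter. `CarryingError`, its `culprit` attribute, and `Resource` are hypothetical names, not part of the patch. The finally-clause drops its own reference, but the object stays alive for as long as the error object that carries it does.

```python
import gc
import weakref


class Resource:
    """Stand-in for something that should be released promptly."""


class CarryingError(AttributeError):
    """Hypothetical error that keeps a reference to the offending object."""

    def __init__(self, message, culprit):
        super().__init__(message)
        self.culprit = culprit


def use_resource():
    res = Resource()
    probe = weakref.ref(res)
    try:
        try:
            raise CarryingError("no such attribute", res)
        finally:
            del res  # the cleanup code gives up its reference here
    except CarryingError as exc:
        return exc, probe


exc, probe = use_resource()
# Cleanup has run, yet the error object still keeps the resource alive.
kept_alive = probe() is not None

exc.culprit = None
gc.collect()
released = probe() is None

print(kept_alive, released)
```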

>I'd suggest not adding too much logic to the error objects
>themselves, but rather to the code writing the tracebacks.

I tried this approach (not putting logic in the error objects) with
the first version of my patch, but it seems that my options are really
limited: if I do the work when the exception is raised, then I'll
incur a big performance hit for every AttributeError.  If I do the
work when the exception is displayed (as I do now), then I think I
really must keep a reference to the object around until it's
displayed.  

There _may_ exist a some-now-some-later approach, but I'm not sure
how to divide the work.
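
As an illustration of the "do the work when displayed" option, here is a minimal sketch in current Python 3. `LazyAttributeError` and the message it builds are invented for this example and are not what the patch does. Raising only stores two references; the comparatively expensive `dir()` walk happens in `__str__`, which is also why the object must stay referenced until then.

```python
class LazyAttributeError(AttributeError):
    """Hypothetical error that defers message building to __str__."""

    def __init__(self, target, missing):
        super().__init__(missing)
        self.target = target    # reference kept until the error is displayed
        self.missing = missing

    def __str__(self):
        # Only paid for when somebody actually formats the error.
        public = sorted(n for n in dir(self.target) if not n.startswith("_"))
        return "%s has no attribute %r; public attributes: %s" % (
            type(self.target).__name__,
            self.missing,
            ", ".join(public) or "(none)",
        )


class E:
    x = 1


err = LazyAttributeError(E(), "notThere")
message = str(err)
print(message)
```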

>Since the above tricks are mainly intended to provide better
>user feedback (which is implemented by writing a traceback),
>this solves the problem without causing additional side-effects
>or severe slow-downs. There are *very* many instances where
>e.g. AttributeErrors are raised only to be catched and then
>causing a different processing branch to be taken -- I would
>strongly object if this action would slow down significantly
>because I use this a lot !

This is related to Guido's objection to the first version of my patch;
the second version addresses this concern, but at the expense of
adding a reference to the object to the AttributeError object.

BTW, to address your speed concerns (and my own curiosity) I tried a
completely artificial benchmark test, before and after my changes.
========================================
Benchmark 1: Does nothing but raise AttributeErrors.  Never prints them.
   class E: pass
   e = E()
   for i in xrange(500000):
      try:
         e.notThere
      except:
         pass

modified python:          22.790u 0.030s
cvs python, unmodified:   32.920u 0.000s
python 1.6a2:             39.850u 0.060s
python 1.5.2:             33.380u 0.020s 

Benchmark 2: Raises a bunch of exceptions, and _does_ simulate printing them.
   class E: pass
   e = E()
   for i in xrange(500000):
      try:
         e.notThere
      except AttributeError, z:
         str(z)

modified python:         119.800u 0.080s
cvs python, unmodified:   46.530u 0.550s
python 1.6a2:             62.900u 0.050s
python 1.5.2:             51.950u 0.020s
========================================
Performance conclusions:

1) Doing all the work at print time makes it much faster to raise exceptions.
2) Doing lots of work at print time makes it much slower to print exceptions.
3) For the load you describe, the way I've written it is better, assuming
   that the reference issues can be resolved.
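
For anyone who wants to repeat the two benchmarks on a current interpreter, here is a rough Python 3 equivalent using `timeit`. The function names and the default iteration count are arbitrary choices for this sketch, and the timings it produces are not comparable with the figures above.

```python
import timeit


class E:
    pass


def raise_only(n):
    """Benchmark 1: raise AttributeErrors, never format them."""
    e = E()
    for _ in range(n):
        try:
            e.notThere
        except AttributeError:
            pass


def raise_and_format(n):
    """Benchmark 2: raise AttributeErrors and simulate printing them."""
    e = E()
    for _ in range(n):
        try:
            e.notThere
        except AttributeError as z:
            str(z)


def run(n=100000):
    """Time one pass of each benchmark; returns seconds per label."""
    return {
        "raise only": timeit.timeit(lambda: raise_only(n), number=1),
        "raise and str()": timeit.timeit(lambda: raise_and_format(n), number=1),
    }


if __name__ == "__main__":
    for label, seconds in run().items():
        print("%-16s %.3fs" % (label, seconds))
```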

-- 
Nick