Re: [Python-Dev] with_traceback

At 03:38 PM 2/26/2007 -0700, Andrew Dalke wrote:
Guido's talk at PyCon said:
Use raise E(arg).with_traceback(tb) instead of raise E, arg, tb
That seems strange to me because of the mutability. Looking through the back discussions on this list I see Guido commented: http://mail.python.org/pipermail/python-3000/2007-February/005689.html
Returning the mutated object is acceptable here because the *dominant* use case is creating and raising an exception in one go:
raise FooException(<args>).with_traceback(<tb>)
The 3 argument raise statement is rarely used, in my experience. I believe most don't even know it exists, excepting mostly advanced Python programmers and language lawyers.
My concern when I saw Guido's keynote was the worry that people do/might write code like this
    NO_END_OF_RECORD = ParserError("Cannot find end of record")

    def parse_record(input_file):
        ...
        raise NO_END_OF_RECORD
        ...

That is, create instances at the top of the module, to be used later. This code assumes that the NO_END_OF_RECORD exception instance is never modified.
If the traceback is added to its __traceback__ attribute then I see two problems if I were to write code like the above:
- the traceback stays around "forever"
- the code is no longer thread-safe.
Then don't do that, as it's bad style for Python 3.x. ;-) But do note that 3-argument raise should NOT be implemented this way in Python 2.x. 2.6 and other 2.x revisions should still retain the existing raise machinery, it's just that *catching* an exception using 3.x style ("except foo as bar:") should call with_traceback() at the time of the catch. This does mean you won't be able to port your code to 3.x style until you've gotten rid of shared exception instances from all your dependencies, but 3.x porting requires all your dependencies to be ported anyway. It should be sufficient in both 2.x and 3.x for with_traceback() to raise an error if the exception already has a traceback -- this should catch any exception instance reuse.
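For readers trying this on a released Python 3 interpreter (which postdates this thread), the "stays around forever" concern can be observed directly. The sketch below is illustrative only: `ParserError`, `traceback_length` and `raise_and_swallow` are names made up for the example, and the behaviour shown is what CPython 3 does as far as I understand it, not something specified in the discussion above.

```python
class ParserError(Exception):
    pass

# Hypothetical module-level instance, mirroring the pattern quoted above.
NO_END_OF_RECORD = ParserError("Cannot find end of record")

def parse_record():
    raise NO_END_OF_RECORD

def traceback_length(exc):
    """Count the entries in the chain hanging off exc.__traceback__."""
    count = 0
    tb = exc.__traceback__
    while tb is not None:
        count += 1
        tb = tb.tb_next
    return count

def raise_and_swallow():
    # Raise the shared instance, catch it, and report how long its
    # traceback chain is afterwards.
    try:
        parse_record()
    except ParserError:
        pass
    return traceback_length(NO_END_OF_RECORD)

before = traceback_length(NO_END_OF_RECORD)  # no traceback attached yet
first = raise_and_swallow()
second = raise_and_swallow()
print(before, first, second)
```

The shared instance keeps a reference to the traceback after the handler has finished, and the chain gets longer on each reuse, so the frames it references stay alive for as long as the module-level instance does.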
What is the correct way to rewrite this for use with "with_traceback"? Is it
    def open_file_on_path(name):
        # If nothing exists, raises an exception based on the
        # first attempt
        saved_err = None
        for dirname in _PATH:
            try:
                return open(os.path.join(dirname, name))
            except Exception, err:
                if not saved_err:
                    saved_err = err
                    saved_tb = sys.exc_info()[2]
        raise saved_err.with_traceback(saved_err.__traceback__)
No, it's more like this:

    try:
        for dirname in ...
            try:
                return ...
            except Exception as err:
                saved_err = err
        raise saved_err
    finally:
        del saved_err

I've added the outer try-finally block to minimize the GC impact of the *original* code you showed, as the `saved_tb` would otherwise have created a cycle. That is, the addition is not because of the porting, it's just something that you should've had to start with.
Anyway, the point here is that in 3.x style, most uses of 3-argument raise just disappear altogether. If you hold on to an exception instance, you have to be careful about it for GC, but no more so than in current Python.
The "save one instance and use it forever" use case is new to me - I've never seen nor written code that uses it before now. It's definitely incompatible with 3.x style, though.
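A filled-in version of that outline, as a minimal sketch for a released Python 3 interpreter. Several details are assumptions of mine rather than anything stated in the thread: the search path is passed in as a `search_path` parameter instead of a module-level `_PATH`, only `OSError` is caught, the first failure is the one kept (matching the original comment), and an empty search path raises `FileNotFoundError`.

```python
import os

def open_file_on_path(name, search_path):
    """Open `name` from the first directory in `search_path` that has it.

    If no directory has it, re-raise the error from the first attempt.
    """
    saved_err = None
    try:
        for dirname in search_path:
            try:
                return open(os.path.join(dirname, name))
            except OSError as err:
                # Keep only the first failure; `err` itself is unbound
                # automatically when the except block ends.
                if saved_err is None:
                    saved_err = err
        if saved_err is None:
            # Assumption: an empty search path is reported as "not found".
            raise FileNotFoundError(name)
        # The saved instance already carries its own __traceback__,
        # so no three-argument raise or with_traceback() is needed.
        raise saved_err
    finally:
        # Drop the local reference so the frame does not keep the
        # exception (and its traceback) alive longer than necessary.
        del saved_err

try:
    open_file_on_path("missing.txt", ["no_such_dir_a", "no_such_dir_b"])
except OSError as err:
    print(type(err).__name__, err.filename)
```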

Phillip J. Eby wrote:
At 03:38 PM 2/26/2007 -0700, Andrew Dalke wrote:
NO_END_OF_RECORD = ParserError("Cannot find end of record")
Then don't do that, as it's bad style for Python 3.x. ;-)
I don't like that answer. I can think of legitimate reasons for wanting to pre-create exceptions, e.g. if I'm intending to raise and catch a particular exception frequently and I don't want the overhead of creating a new instance each time. For me, this is casting serious doubt on the whole idea of attaching the traceback to the exception. -- Greg

On Tue, 27 Feb 2007 13:37:21 +1300, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't like that answer. I can think of legitimate reasons for wanting to pre-create exceptions, e.g. if I'm intending to raise and catch a particular exception frequently and I don't want the overhead of creating a new instance each time.
This seems like kind of a strange micro-optimization to have an impact on a language change discussion. Wouldn't it be better just to optimize instance creation overhead? Or modify __new__ on your particular heavily-optimized exception to have a free-list, so it can be both correct (you can still mutate exceptions) and efficient (you'll only get a new exception object if you really need it).
For me, this is casting serious doubt on the whole idea of attaching the traceback to the exception.
I'm sorry if this has been proposed already in this discussion (I searched around but did not find it), but has anyone thought of adding methods to Exception to handle these edge cases and *not* attempting to interact with the 'raise' keyword at all? This is a strawman, but:

    except Foo as foo:
        if foo.some_property:
            foo.reraise(modify_stack=False)

This would make the internal implementation details less important, since the 'raise' keyword machinery will have to respect some internal state of the exception object in either case, but the precise thing being raised need not be the result of the method, nor the exception itself.

Glyph:
This seems like kind of a strange micro-optimization to have an impact on a language change discussion.
Just as a reminder, my concern is that people reuse exceptions (rarely) and that the behavior of the "with_traceback()" method is ambiguous when that happens. It has nothing to do with optimization.
The two solutions of:
1. always replace an existing __traceback__
2. never replace an existing __traceback__
both seem to lead to problems. Here are minimal examples for thought:

    # I know PJE says this is bad style for 3.0. Will P3K help
    # identify this problem? If it's allowable, what will it do?
    # (Remember, I found existing code which reuses exceptions
    # so this isn't purely theoretical, only rare.)
    BAD = Exception("that was bad")
    try:
        raise BAD
    except Exception:
        pass
    raise BAD  # what traceback will be shown here?

(Oh, and what would a debugger report?)

    # 2nd example; reraise an existing exception instance.
    # It appears that any solution which reports a problem
    # for #1 will not allow one or both of the following.
    try:
        raise Exception("bad")
    except Exception as err:
        first_err = err
    try:
        raise Exception("bad")
    except Exception:
        raise first_err  # what traceback will be shown here?

    # 3rd example, like the 2nd but end it with
        raise first_err.with_traceback(first_err.__traceback__)  # what traceback will be shown here?
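For reference, the `with_traceback()` method that eventually shipped in Python 3 takes the "always replace" route: it overwrites `__traceback__` and returns the same instance. The sketch below only demonstrates that part of the released API; `capture` is a helper invented for the example, and nothing here settles what the proposals under discussion at the time would have done.

```python
def capture():
    """Raise and catch an exception, returning the instance."""
    try:
        raise Exception("bad")
    except Exception as err:
        return err

first_err = capture()
original_tb = first_err.__traceback__

# with_traceback() mutates the instance in place and returns it.
returned = first_err.with_traceback(None)
same_object = returned is first_err
cleared = first_err.__traceback__ is None

# Passing the saved traceback back restores it on the same instance.
first_err.with_traceback(original_tb)
restored = first_err.__traceback__ is original_tb

print(same_object, cleared, restored)
```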
I'm sorry if this has been proposed already in this discussion (I searched around but did not find it),
I saw references to a PEP about it but could not find the PEP. Nor could I find much discussion either. I would like to know the details. I assume that "raise" adds the __traceback__ if it's not None, hence there's no way it can tell if the __traceback__ on the instance was created with "with_traceback()" from an earlier "raise" or from the with_traceback. But in the current examples it appears that the Exception class could attach a traceback during instantiation and "with_traceback" simply replaces that. I doubt this version, but cannot be definitive.
While variant method/syntax may improve matters, I think people will write code as above -- all of which are valid Python 2.x and 3.x -- and end up with strange and hard to interpret tracebacks.
Andrew
dalke@dalkescientific.com

On Feb 28, 2007, at 1:50 AM, Andrew Dalke wrote:
Glyph:
This seems like kind of a strange micro-optimization to have an impact on a language change discussion.
Just as a reminder, my concern is that people reuse exceptions (rarely) and that the behavior of the "with_traceback()" method is ambiguous when that happens. It has nothing to do with optimization.
The two solutions of:
1. always replace an existing __traceback__
2. never replace an existing __traceback__
both seem to lead to problems.
I may be strange, or in left field, or both. Since the traceback is the object that is always created, it would seem natural to me that the traceback have a reference to the exception and not the other way around. It would also seem to be a good place to attach a nested traceback, which in turn has its own reference to its exception. I never really thought about it when they were just peer objects traveling up the stack. Just an idea from a different seat ;)
-Shane Holloway

On 2/27/07, glyph@divmod.com <glyph@divmod.com> wrote:
On Tue, 27 Feb 2007 13:37:21 +1300, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't like that answer. I can think of legitimate reasons for wanting to pre-create exceptions, e.g. if I'm intending to raise and catch a particular exception frequently and I don't want the overhead of creating a new instance each time.
This seems like kind of a strange micro-optimization to have an impact on a language change discussion. Wouldn't it be better just to optimize instance creation overhead? Or modify __new__ on your particular heavily-optimized exception to have a free-list, so it can be both correct (you can still mutate exceptions) and efficient (you'll only get a new exception object if you really need it).
It sounds like we should always copy the exception given to raise, and that not doing so is an optimization (albeit a commonly hit one). Not arguing for or against, just making an observation. On second thought, we could check that the refcount is 1 and avoid copying in the common case of "raise Foo()". Is reraising common enough that we need to optimize it? -- Adam Olsen, aka Rhamphoryncus

On 2/28/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Adam Olsen wrote:
It sounds like we should always copy the exception given to raise,
I don't like that either, for all the reasons that make it infeasible to copy an arbitrary object in a general way.
Exceptions aren't arbitrary objects though. The requirement that they inherit from BaseException is specifically to create a common interface. Copying would be an extension of that interface. I believe calling copy.copy() would be sufficient. -- Adam Olsen, aka Rhamphoryncus
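Both sides of this exchange can be tried out on a released Python 3 interpreter. The sketch below is illustrative: `RecordError`, `FieldError` and `caught` are hypothetical names, and the behaviour relies on `copy.copy()` rebuilding an exception by calling its class with `args`, which is how CPython's `BaseException` supports copying as far as I know.

```python
import copy

class RecordError(Exception):
    """Plain subclass that keeps BaseException's constructor."""

class FieldError(Exception):
    """Hypothetical subclass whose __init__ needs two arguments."""
    def __init__(self, field, reason):
        super().__init__(f"{field}: {reason}")
        self.field = field
        self.reason = reason

def caught(exc):
    """Raise and catch `exc` so that it acquires a traceback."""
    try:
        raise exc
    except Exception as err:
        return err

original = caught(RecordError("Cannot find end of record"))
clone = copy.copy(original)

# The copy is a distinct object with the same args and no traceback.
print(clone is original, clone.args == original.args, clone.__traceback__)

# The copy is rebuilt as FieldError(*args), and args holds only the
# formatted message, so the two-argument constructor rejects it.
try:
    copy.copy(FieldError("name", "missing"))
    copy_failed = False
except TypeError:
    copy_failed = True
print(copy_failed)
```

So copying works for exceptions that follow the default constructor convention and drops the traceback along the way, but it is not safe for arbitrary subclasses, which is the general-copy objection quoted above.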

On Wed, 28 Feb 2007 18:29:11 -0700, Adam Olsen <rhamph@gmail.com> wrote:
On 2/28/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Adam Olsen wrote:
It sounds like we should always copy the exception given to raise,
I don't like that either, for all the reasons that make it infeasible to copy an arbitrary object in a general way.
Exceptions aren't arbitrary objects though. The requirement that they inherit from BaseException is specifically to create a common interface. Copying would be an extension of that interface.
I believe calling copy.copy() would be sufficient.
Does copying every exception given to `raise' solve the problem being discussed here? Consider the current Python behavior: no copying is performed, most code instantiates a new exception instance for each raise statement, some code creates a single exception and re-raises it repeatedly. And the new behavior? Every raise statement copies an exception instance, some code will create a new exception instance for each raise statement, some code will create a single exception and re-raise it repeatedly. That doesn't sound like an improvement to me. Normal code will be more wasteful. Code which the author has gone out of his way to tune will be as wasteful as /average/ code currently is, and more wasteful than tuned code now is. Plus you now have the added mental burden of keeping track of which objects are copies of what (and if you throw in the refcount=1 optimization, then this burden is increased - was something accidentally relying on copying or non-copying? Did a debugger grab a reference to the exception object, thus changing the programs behavior? Did a third-party hang on to an exception for longer than the raising code expected? etc). Jean-Paul

I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
On 2/28/07, Jean-Paul Calderone <exarkun@divmod.com> wrote:
On Wed, 28 Feb 2007 18:29:11 -0700, Adam Olsen <rhamph@gmail.com> wrote:
On 2/28/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Adam Olsen wrote:
It sounds like we should always copy the exception given to raise,
I don't like that either, for all the reasons that make it infeasible to copy an arbitrary object in a general way.
Exceptions aren't arbitrary objects though. The requirement that they inherit from BaseException is specifically to create a common interface. Copying would be an extension of that interface.
I believe calling copy.copy() would be sufficient.
Does copying every exception given to `raise' solve the problem being discussed here?
Consider the current Python behavior: no copying is performed, most code instantiates a new exception instance for each raise statement, some code creates a single exception and re-raises it repeatedly.
And the new behavior? Every raise statement copies an exception instance, some code will create a new exception instance for each raise statement, some code will create a single exception and re-raise it repeatedly.
That doesn't sound like an improvement to me. Normal code will be more wasteful. Code which the author has gone out of his way to tune will be as wasteful as /average/ code currently is, and more wasteful than tuned code now is.
Plus you now have the added mental burden of keeping track of which objects are copies of what (and if you throw in the refcount=1 optimization, then this burden is increased - was something accidentally relying on copying or non-copying? Did a debugger grab a reference to the exception object, thus changing the programs behavior? Did a third-party hang on to an exception for longer than the raising code expected? etc).
Jean-Paul _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

On 2/28/07, Guido van Rossum <guido@python.org> wrote:
I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
How plausible would it be to optimize all exception instantiation? Perhaps use slots and a freelist for everything inheriting from BaseException and not inheriting from other builtin types?
On 2/28/07, Jean-Paul Calderone <exarkun@divmod.com> wrote:
On Wed, 28 Feb 2007 18:29:11 -0700, Adam Olsen <rhamph@gmail.com> wrote:
I believe calling copy.copy() would be sufficient.
That doesn't sound like an improvement to me. Normal code will be more wasteful. Code which the author has gone out of his way to tune will be as wasteful as /average/ code currently is, and more wasteful than tuned code now is.
-- Adam Olsen, aka Rhamphoryncus

Adam Olsen wrote:
How plausible would it be to optimize all exception instantiation? Perhaps use slots and a freelist for everything inheriting from BaseException and not inheriting from other builtin types?
I'm not sure a free list would help much for instances of user-defined classes, since creating one involves setting up a dict, etc. And if you use __slots__ you end up with objects of different sizes, which isn't free-list-friendly.
--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

On 2/28/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Adam Olsen wrote:
How plausible would it be to optimize all exception instantiation? Perhaps use slots and a freelist for everything inheriting from BaseException and not inheriting from other builtin types?
I'm not sure a free list would help much for instances of user-defined classes, since creating one involves setting up a dict, etc. And if you use __slots__ you end up with objects of different sizes, which isn't free-list-friendly.
Not easy, but doable. Perhaps a plan B if nobody comes up with a plan A. -- Adam Olsen, aka Rhamphoryncus

On Feb 28, 2007, at 9:10 PM, Guido van Rossum wrote:
I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
I'll say up front that I haven't been paying as much attention to the topic of exception behavior as perhaps I should before attempting to contribute to a thread about it...but... It seems to me that a stack trace should always be attached to an exception object at creation time of the exception, and never at any other time. Then, if someone pre-creates an exception object, they get consistent and easily explainable behavior (the traceback to the creation point). The traceback won't necessarily be *useful*, but presumably someone pre-creating an exception object did so to save run-time, and creating the traceback is generally very expensive, so doing that only once, too, seems like a win to me. FWIW, that's basically how exceptions work in Java. From http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Throwable.html:
Instances of two subclasses, Error and Exception, are conventionally used to indicate that exceptional situations have occurred. Typically, these instances are freshly created in the context of the exceptional situation so as to include relevant information (such as stack trace data).
A throwable contains a snapshot of the execution stack of its thread at the time it was created. It can also contain a message string that gives more information about the error. Finally, it can contain a cause: another throwable that caused this throwable to get thrown. The cause facility is new in release 1.4. It is also known as the chained exception facility, as the cause can, itself, have a cause, and so on, leading to a "chain" of exceptions, each caused by another.
There's probably a million reasons why this doesn't work for python, but they don't immediately jump out at me. :) Migration from 2.X to 3.X would consist of recommending not to create an exception outside of a raise line, unless you're okay with the traceback location changing from the raise point to the creation point. James

On 2/28/07, James Y Knight <foom@fuhm.net> wrote:
It seems to me that a stack trace should always be attached to an exception object at creation time of the exception, and never at any other time. Then, if someone pre-creates an exception object, they get consistent and easily explainable behavior (the traceback to the creation point). The traceback won't necessarily be *useful*, but presumably someone pre-creating an exception object did so to save run-time, and creating the traceback is generally very expensive, so doing that only once, too, seems like a win to me.
The only example I found in about 2 dozen packages where the exception was precreated was in pyparsing. I don't know the reason why it was done that way, but I'm certain it wasn't for performance. The exception is created as part of the format definition. In that case, if the traceback is important, then it's important to know which code was attempting the parse. The format definition was almost certainly done at module import time.
In any case, reraising the same exception instance is a rare construct in current Python code. PJE had never seen it before. It's hard to get a good intuition from zero data points. :)

Andrew Dalke wrote:
On 2/28/07, James Y Knight <foom@fuhm.net> wrote:
It seems to me that a stack trace should always be attached to an exception object at creation time of the exception, and never at any other time.
Sounds good in principle - but don't forget that normally the exception will be instantiated and *then* passed to the raise statement.
I've never seen a module level exception instance before. With the proposed changes, modules that do this would *continue* to work, surely ? So they lose nothing (compared to the current situation) by having the traceback information overwritten, they just can't take direct advantage of the new attribute. Michael Foord

Michael Foord wrote:
With the proposed changes, modules that do this would *continue* to work, surely ?
Probably, but it might mean they were no longer thread safe. An exception caught and raised in one thread would be vulnerable to having its traceback clobbered by another thread raising the same instance. There's also the possibility of a traceback unexpectedly kept alive causing GC problems -- cycles, files not closed when you expect, etc. -- Greg

Greg Ewing wrote:
Michael Foord wrote:
With the proposed changes, modules that do this would *continue* to work, surely ?
Probably, but it might mean they were no longer thread safe. An exception caught and raised in one thread would be vulnerable to having its traceback clobbered by another thread raising the same instance.
Right - but that would still be *no worse* than the current situation where that information isn't available on the instance. The current patterns would continue to work unchanged, but the new information wouldn't be available because a single instance is being reused.
There's also the possibility of a traceback unexpectedly kept alive causing GC problems -- cycles, files not closed when you expect, etc.
That *could* be a problem, although explicitly closing files is always a good practise :-) Michael Foord

Michael Foord wrote:
Greg Ewing wrote:
An exception caught and raised in one thread would be vulnerable to having its traceback clobbered by another thread raising the same instance.
Right - but that would still be *no worse* than the current situation where that information isn't available on the instance.
Um -- yes, it would, because currently you don't *expect* the traceback to be available from the exception. If that became the standard way to handle tracebacks, then you would expect it to work reliably. -- Greg

Greg Ewing wrote:
Michael Foord wrote:
Greg Ewing wrote:
An exception caught and raised in one thread would be vulnerable to having its traceback clobbered by another thread raising the same instance.
Right - but that would still be *no worse* than the current situation where that information isn't available on the instance.
Um -- yes, it would, because currently you don't *expect* the traceback to be available from the exception. If that became the standard way to handle tracebacks, then you would expect it to work reliably.
Um... except that the new attribute *obviously* means that the traceback information is not going to work where you reuse a single instance, and to expect otherwise seems naive. If the new pattern *doesn't* break existing code, but means that using a single instance for optimisation (the only justification put forward - re-raising being a slightly different case) makes that information unreliable, then I don't see that as a reason to reject the change. Michael Foord

Michael Foord wrote:
Um... except that the new attribute *obviously* means that the traceback information is not going to work where you reuse a single instance, and to expect otherwise seems naive.
Yes, but this means that the __traceback__ attribute couldn't be *the* way of handling tracebacks in Py3k. The old way would still have to be there, and the new way could only ever be a convenience feature that might not be available and might not work properly if it is. That doesn't seem like a tidy situation to me. Py3k is supposed to be making things cleaner, not more messy. -- Greg

James Y Knight wrote:
It seems to me that a stack trace should always be attached to an exception object at creation time
Um. Yes. Well, that's certainly an innovative solution...
The traceback won't necessarily be *useful*,
Almost completely use*less*, I would have thought. The traceback is mostly used to find out where something went wrong, not where it went right (i.e. successful creation of the exception).
creating the traceback is generally very expensive,
I don't think so -- isn't it just a linked list of existing stack frames? That should be very cheap to create.
From http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Throwable.html:
A throwable contains a snapshot of the execution stack of its thread at the time it was created.
This would be a major and surprising change to Python users. It would also be considerably *more* expensive to implement than the current scheme, because it would require copying the entire stack, instead of just linking stack frames together as they are unwound during the search for an exception handler. -- Greg

On Mar 1, 2007, at 3:27 AM, Greg Ewing wrote:
James Y Knight wrote:
The traceback won't necessarily be *useful*,
Almost completely use*less*, I would have thought. The traceback is mostly used to find out where something went wrong, not where it went right (i.e. successful creation of the exception).
The advantages are that it's an easily understandable and explainable behavior, and the traceback points you (the programmer) to the exact location where you went wrong: creating the exception at module level. Creating an exception with a non-exceptional stacktrace isn't always useless: sometimes you have exceptions where you know you never care about the stacktrace (internal flow control/etc).
This would be a major and surprising change to Python users.
It's only a major change if you don't raise the exception in the same place you create it. (which other people are claiming is extremely rare).
It would also be considerably *more* expensive to implement than the current scheme, because it would require copying the entire stack, instead of just linking stack frames together as they are unwound during the search for an exception handler.
Yes of course, you're right, I withdraw the proposal. I had forgotten that python doesn't currently save the entire stack, only that between the 'raise' and the 'except'. (I had forgotten, because Twisted's "Failure" objects do save and print the entire stacktrace, both above and below the catch-point). James
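The difference James describes, between what the traceback records and the full stack at the catch point, can be seen with the standard `traceback` module. This is a minimal sketch; `raiser`, `catcher` and `caller` are names invented for the example.

```python
import traceback

def raiser():
    raise RuntimeError("boom")

def catcher():
    try:
        raiser()
    except RuntimeError as err:
        # Frames recorded on the exception: from the catching frame
        # down to the raising frame.
        tb_entries = traceback.extract_tb(err.__traceback__)
        # Frames currently on the stack: from the program entry point
        # down to this frame.
        stack_entries = traceback.extract_stack()
        return ([entry.name for entry in tb_entries],
                [entry.name for entry in stack_entries])

def caller():
    return catcher()

tb_names, stack_names = caller()
print(tb_names)          # frames between the except and the raise
print(stack_names[-2:])  # innermost end of the stack at the catch point
```

`caller` shows up only in the second list: the traceback covers the frames between the `except` and the `raise`, while everything above the catch point would have to be captured separately.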

[Summary: James Knight's idea can't work unless we copy the entire stack, which is bad. Disregard my previous posts in this thread of a few minutes ago. See the end of this post for why.]
On 3/1/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
James Y Knight wrote:
It seems to me that a stack trace should always be attached to an exception object at creation time
Um. Yes. Well, that's certainly an innovative solution...
In the Python context perhaps, but given the Java precedent I would hardly call it innovative.
The traceback won't necessarily be *useful*,
Almost completely use*less*, I would have thought. The traceback is mostly used to find out where something went wrong, not where it went right (i.e. successful creation of the exception).
Which is one opcode before it is raised, in 99.99% of all cases.
creating the traceback is generally very expensive,
I don't think so -- isn't it just a linked list of existing stack frames? That should be very cheap to create.
This is a difference between Python and Java that we should preserve. Java's traceback is a string; Python's is a linked list of traceback objects.
From http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Throwable.html:
A throwable contains a snapshot of the execution stack of its thread at the time it was created.
This would be a major and surprising change to Python users.
It would also be considerably *more* expensive to implement than the current scheme, because it would require copying the entire stack, instead of just linking stack frames together as they are unwound during the search for an exception handler.
Oh bah, you're right. This sounds like a deal killer c.q. show stopper. In my earlier responses I forgot the details of how Python exceptions work.
You start with a traceback object pointing to the current frame object (traceback objects are distinct from frame objects, they are linked in the *opposite* direction, so no cycles are created); then each time the exception bubbles up a stack frame, a new traceback object pointing to the next frame object is inserted in front of the traceback. This requires updating the traceback pointer each time we bubble up a frame. Then when you catch the exception, the chain of tracebacks points to all frames between the catching and the raising frame (I forget whether the catching frame is included).
Since this can conceivably be going on in parallel in multiple threads, we really don't ever want to be sharing whatever object contains the head of the chain of tracebacks, since it mutates at every frame bubble-up.
I guess our next best option would be Glyph's suggested object to represent a caught exception, which could indeed be named "catch" per Greg's suggestion. The option after that would be to clone the exception object whenever it is raised, but that seems wasteful in the normal case.
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
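The chain described here can be walked by hand on a released Python 3 interpreter, where the head of the chain is reachable as `__traceback__` on the caught exception. A minimal sketch, with `inner`, `middle` and `outer` as made-up function names:

```python
def inner():
    raise ValueError("boom")

def middle():
    inner()

def outer():
    try:
        middle()
    except ValueError as err:
        # Follow tb_next from the head of the chain, collecting the
        # name of the function each traceback entry's frame belongs to.
        names = []
        tb = err.__traceback__
        while tb is not None:
            names.append(tb.tb_frame.f_code.co_name)
            tb = tb.tb_next
        return names

print(outer())  # ['outer', 'middle', 'inner']
```

The head of the chain is the entry for the catching frame and the tail is the raising frame, so in current CPython the catching frame is included.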

Guido van Rossum wrote:
You start with a traceback object pointing to the current frame object (traceback objects are distinct from frame objects,
Just out of curiosity, is it really necessary to have a distinct traceback object? Couldn't the traceback just be made of dead frame objects linked the opposite way through their f_next pointers? Seems to me it would be advantageous if raising an exception (once it's created) could be done without having to allocate any memory. Otherwise you could get the situation where you're trying to raise a MemoryError, but there's no memory to allocate a traceback object, so you raise a MemoryError... etc... That might be a reason for pre-allocating the MemoryError exception, too. -- Greg

On 3/1/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Guido van Rossum wrote:
You start with a traceback object pointing to the current frame object (traceback objects are distinct from frame objects,
Just out of curiosity, is it really necessary to have a distinct traceback object? Couldn't the traceback just be made of dead frame objects linked the opposite way through their f_next pointers?
That would be rather fragile; there's tons of code that follows f_next pointers, without regard for whether the frame is alive or dead. Using a separate pointer would be a possibility, but it would probably still mean setting the f_next pointers to NULL to avoid hard cycles. Maybe this idea could be used for a new VM design though.
Seems to me it would be advantageous if raising an exception (once it's created) could be done without having to allocate any memory. Otherwise you could get the situation where you're trying to raise a MemoryError, but there's no memory to allocate a traceback object, so you raise a MemoryError... etc...
That might be a reason for pre-allocating the MemoryError exception, too.
I think it is preallocated. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

On 2/28/07, James Y Knight <foom@fuhm.net> wrote:
On Feb 28, 2007, at 9:10 PM, Guido van Rossum wrote:
I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
I'll say up front that I haven't been paying as much attention to the topic of exception behavior as perhaps I should before attempting to contribute to a thread about it...but...
It seems to me that a stack trace should always be attached to an exception object at creation time of the exception, and never at any other time. Then, if someone pre-creates an exception object, they get consistent and easily explainable behavior (the traceback to the creation point). The traceback won't necessarily be *useful*, but presumably someone pre-creating an exception object did so to save run-time, and creating the traceback is generally very expensive, so doing that only once, too, seems like a win to me.
I agree. Since by far the most common use case is to create the exception in the raise statement, the behavior there won't be any different than it is today; the traceback on precreated objects will be useless, but folks who precreate them for performance reasons presumably won't care; and those that create global exception instances by mistakenly copying the wrong idiom, well, they'll learn quickly (and a lot more quickly than when we try to override the exception).
FWIW, that's basically how exceptions work in Java. From http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Throwable.html:
Instances of two subclasses, Error and Exception, are conventionally used to indicate that exceptional situations have occurred. Typically, these instances are freshly created in the context of the exceptional situation so as to include relevant information (such as stack trace data).
A throwable contains a snapshot of the execution stack of its thread at the time it was created. It can also contain a message string that gives more information about the error. Finally, it can contain a cause: another throwable that caused this throwable to get thrown. The cause facility is new in release 1.4. It is also known as the chained exception facility, as the cause can, itself, have a cause, and so on, leading to a "chain" of exceptions, each caused by another.
There's probably a million reasons why this doesn't work for python, but they don't immediately jump out at me. :)
Not at me either. Java exceptions weren't around when Python's exceptions were first invented!
Migration from 2.X to 3.X would consist of recommending not to create an exception outside of a raise line, unless you're okay with the traceback location changing from the raise point to the creation point.
Sounds fine with me. (But I haven't digested Glyph's response.) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
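To make the creation-time proposal concrete, here is a rough sketch of an exception that snapshots the stack when it is constructed, in the spirit of the Throwable description quoted above. It assumes a current CPython 3 interpreter; SnapshotError, make_error and raise_elsewhere are hypothetical names, not anything proposed in this thread:

```python
import traceback


class SnapshotError(Exception):
    """Hypothetical exception that records the stack at creation time."""

    def __init__(self, *args):
        super().__init__(*args)
        # extract_stack() ends with this __init__ call itself; drop it so
        # the last entry is the code that created the instance.
        self.creation_stack = traceback.extract_stack()[:-1]


def make_error():
    return SnapshotError("created here")


def raise_elsewhere(err):
    raise err


err = make_error()
try:
    raise_elsewhere(err)
except SnapshotError as caught:
    created_in = caught.creation_stack[-1].name
    raised_in = caught.__traceback__.tb_next.tb_frame.f_code.co_name

print(created_in, raised_in)  # make_error raise_elsewhere
```

The snapshot points at the creation site and the ordinary traceback at the raise site, which is exactly the difference the migration note above warns about.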

On 3/1/07, Guido van Rossum <guido@python.org> wrote:
Since by far the most common use case is to create the exception in the raise statement, the behavior there won't be any different than it is today; the traceback on precreated objects will be useless, but folks who precreate them for performance reasons presumably won't care; and those that create global exception instances by mistakenly copying the wrong idiom, well, they'll learn quickly (and a lot more quickly than when we try to override the exception).
Here are a few more examples of code which doesn't follow the idiom

    raise ExceptionClass(args)

Zope's ZConfig/cmdline.py

    def addOption(self, spec, pos=None):
        if pos is None:
            pos = "<command-line option>", -1, -1
        if "=" not in spec:
            e = ZConfig.ConfigurationSyntaxError(
                "invalid configuration specifier", *pos)
            e.specifier = spec
            raise e

The current xml.sax.handler.ErrorHandler includes

    def error(self, exception):
        "Handle a recoverable error."
        raise exception

    def fatalError(self, exception):
        "Handle a non-recoverable error."
        raise exception

and is used like this in xml.sax.expatreader.ExpatParser.feed

    try:
        # The isFinal parameter is internal to the expat reader.
        # If it is set to true, expat will check validity of the entire
        # document. When feeding chunks, they are not normally final -
        # except when invoked from close.
        self._parser.Parse(data, isFinal)
    except expat.error, e:
        exc = SAXParseException(expat.ErrorString(e.code), e, self)
        # FIXME: when to invoke error()?
        self._err_handler.fatalError(exc)

Note that the handler may decide to ignore the exception, based on which error occurred. The traceback should show where in the handler the exception was raised, and not the point at which the exception was created.

ZODB/Connection.py:

    ...
    if isinstance(store_return, str):
        assert oid is not None
        self._handle_one_serial(oid, store_return, change)
    else:
        for oid, serial in store_return:
            self._handle_one_serial(oid, serial, change)

    def _handle_one_serial(self, oid, serial, change):
        if not isinstance(serial, str):
            raise serial
        ...

Andrew
dalke@dalkescientific.com

On Wed, 28 Feb 2007 18:10:21 -0800, Guido van Rossum <guido@python.org> wrote:
I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
In Twisted, to deal with asynchronous exceptions, we needed an object to specifically represent a "raised exception", i.e. an Exception instance with its attached traceback and methods to manipulate it. You can find its API here: http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failur... Perhaps the use-cases for attaching the traceback object to the exception would be better satisfied by simply having sys.exc_info() return an object with methods like Failure? Reading the "motivation" section of PEP 344, it describes "passing these three things in parallel" as "tedious and error-prone". Having one object one could call methods on instead of a 3-tuple which needed to be selectively passed on would simplify things. For example, chaining could be accomplished by doing something like this: sys.current_exc_thingy().chain() I can't think of a good name for the new object type, since "traceback", "error", "exception" and "stack" all already mean things in Python.

glyph@divmod.com wrote:
Perhaps the use-cases for attaching the traceback object to the exception would be better satisfied by simply having sys.exc_info() return an object with methods like Failure?
I can't think of a good name for the new object type,
Maybe we could call it a 'catch' (used as a noun, as when a fisherman might say "That's a good catch!") -- Greg
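For illustration only, an object of the kind being discussed might look roughly like the sketch below. The class name Catch follows Greg's suggestion, but its methods are assumptions made up for this example; they are neither Twisted's Failure API nor anything specified in this thread:

```python
import sys
import traceback


class Catch:
    """Hypothetical bundle of the three values returned by sys.exc_info()."""

    def __init__(self, exc_type, value, tb):
        self.type = exc_type
        self.value = value
        self.tb = tb

    @classmethod
    def current(cls):
        """Capture the exception currently being handled."""
        exc_type, value, tb = sys.exc_info()
        if exc_type is None:
            raise RuntimeError("no exception is being handled")
        return cls(exc_type, value, tb)

    def format(self):
        """Render the caught exception as a traceback string."""
        return "".join(
            traceback.format_exception(self.type, self.value, self.tb))

    def reraise(self):
        """Raise the stored exception again with its stored traceback."""
        raise self.value.with_traceback(self.tb)


try:
    1 / 0
except ZeroDivisionError:
    caught = Catch.current()

print(caught.format())
```

The point of the design is that callers pass one object around instead of selectively forwarding a 3-tuple, and the exception instance itself is never used as the container for per-raise state.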

On 2/28/07, glyph@divmod.com <glyph@divmod.com> wrote:
On Wed, 28 Feb 2007 18:10:21 -0800, Guido van Rossum <guido@python.org> wrote:
I am beginning to think that there are serious problems with attaching the traceback to the exception; I really don't like the answer that pre-creating an exception is unpythonic in Py3k.
In Twisted, to deal with asynchronous exceptions, we needed an object to specifically represent a "raised exception", i.e. an Exception instance with its attached traceback and methods to manipulate it.
You can find its API here:
http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failur...
Perhaps the use-cases for attaching the traceback object to the exception would be better satisfied by simply having sys.exc_info() return an object with methods like Failure? Reading the "motivation" section of PEP 344, it describes "passing these three things in parallel" as "tedious and error-prone". Having one object one could call methods on instead of a 3-tuple which needed to be selectively passed on would simplify things.
For example, chaining could be accomplished by doing something like this:
sys.current_exc_thingy().chain()
I can't think of a good name for the new object type, since "traceback", "error", "exception" and "stack" all already mean things in Python.
I'm guessing you didn't see James Knight's proposal. If we can agree on more Java-esque exception semantics, the exception object could serve this purpose just fine. I'm thinking that in that case an explicit with_traceback(tb) should perhaps clone the exception; the clone could be fairly simple by constructing a new uninitialized instance (the way unpickling does) and filling its dict with a copy of the original's dict, overwriting the traceback. (Also, if Brett's exception reform is accepted, we should call this attribute just traceback, not __traceback__.) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
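The cloning idea can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: it targets a current CPython 3 interpreter, clone_with_traceback and ParserError are hypothetical names, and args is copied separately because in CPython 3 it is not stored in the instance dict:

```python
def clone_with_traceback(exc, tb):
    """Return a shallow copy of exc carrying tb; exc itself is not mutated."""
    cls = type(exc)
    clone = cls.__new__(cls)             # uninitialized instance, as unpickling does
    clone.args = exc.args                # args lives outside __dict__ in CPython 3
    clone.__dict__.update(exc.__dict__)  # copy any extra attributes
    clone.__traceback__ = tb             # overwrite the traceback on the copy only
    return clone


class ParserError(Exception):
    pass


NO_END_OF_RECORD = ParserError("Cannot find end of record")
NO_END_OF_RECORD.specifier = "record"

# Obtain some real traceback object to attach to the clone.
try:
    raise ValueError("source of a traceback")
except ValueError as exc:
    tb = exc.__traceback__

clone = clone_with_traceback(NO_END_OF_RECORD, tb)
print(clone is NO_END_OF_RECORD, NO_END_OF_RECORD.__traceback__)  # False None
```

Because only the clone receives the traceback, a shared module-level instance stays unmodified, which is the property the thread is after; the cost is one extra allocation per call.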

Jean-Paul Calderone wrote:
And the new behavior? Every raise statement copies an exception instance, some code will create a new exception instance for each raise statement, some code will create a single exception and re-raise it repeatedly.
Make that "most code will create a new exception instance and then make a copy of it", unless this can be optimised away somehow, and it's not necessarily obvious that the refcount == 1 trick will work (it depends on the exact details of how the references flow through the exception raising machinery).

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

Adam Olsen wrote:
Exceptions aren't arbitrary objects though. The requirement that they inherit from BaseException is specifically to create a common interface.
But that doesn't tell you enough. If the exception references some other object, should you copy it? You can't tell just from the fact that it inherits from BaseException. Besides, making a copy of the exception seems just as expensive as creating a new instance, so it does nothing to address the efficiency issue. Maybe it's not as important as I feel it is, but I like the way that exception raising is lightweight enough to use for flow control. When used that way, creating a new instance each time seems wasteful. I accept the overhead because I know that if it were ever a problem I could eliminate it by pre-creating the instance. I'd be disappointed to lose that ability.
I believe calling copy.copy() would be sufficient.
I never use that, because I have no confidence that it would DWIM. I'd be unhappy if the system started relying on it anywhere fundamental. -- Greg

glyph@divmod.com wrote:
Or modify __new__ on your particular heavily-optimized exception to have a free-list,
Doing that in Python is likely to have as much overhead as creating an instance. The simple and obvious optimisation is to pre-create the instance, but we're proposing to make the obvious way wrong for subtle reasons. That doesn't seem pythonic to me. -- Greg

On 2/26/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Phillip J. Eby wrote:
At 03:38 PM 2/26/2007 -0700, Andrew Dalke wrote:
NO_END_OF_RECORD = ParserError("Cannot find end of record")
Then don't do that, as it's bad style for Python 3.x. ;-)
I don't like that answer. I can think of legitimate reasons for wanting to pre-create exceptions, e.g. if I'm intending to raise and catch a particular exception frequently and I don't want the overhead of creating a new instance each time.
Is this really the problem it's being made out to be? I'm guessing the use-case you're suggesting is where certain exceptions are raised and caught inside a library or application, places where the exceptions will never reach the user. If that's the case, does it really matter what the traceback looks like?
For me, this is casting serious doubt on the whole idea of attaching the traceback to the exception.
If attaching the traceback to the exception is bothering you, you should take a look at the other attributes PEP 344 introduces: __cause__ and __context__. I'd say what needs another look is the idea of pre-creating a single exception instance and repeatedly raising it. Collin Winter

PJE:
Then don't do that, as it's bad style for Python 3.x. ;-)
It's bad style for 3.x only if Python goes with this interface. If it stays with the 2.x style then there's no problem. There may also be solutions which are cleaner and which don't mutate the exception instance. I am not proposing such a syntax. I have ideas, but I am not a language designer and have long given up the idea that I might be good at it.
This does mean you won't be able to port your code to 3.x style until you've gotten rid of shared exception instances from all your dependencies, but 3.x porting requires all your dependencies to be ported anyway.
What can be done to minimize the number of dependencies which need to be changed?
It should be sufficient in both 2.x and 3.x for with_traceback() to raise an error if the exception already has a traceback -- this should catch any exception instance reuse.
That would cause a problem in my example where I save then reraise the exception, as raise saved_err.with_traceback(saved_err.__traceback__)
What is the correct way to rewrite this for use with "with_traceback"? Is it [...]
No, it's more like this:
    try:
        for dirname in ...
            try:
                return ...
            except Exception as err:
                saved_err = err
        raise saved_err
    finally:
        del saved_err
I don't get it. The "saved_err" has a __traceback__ attached to it, and is reraised. Hence it gets the old stack, right? Suppose I wrote

    ERR = Exception("Do not do that")
    try:
        f(x)
    except Exception:
        raise ERR
    try:
        f(x*2)
    except Exception:
        raise ERR

Yes it's bad style, but people will write it. The ERR gets the traceback from the first time there's an error, and that traceback is locked in ... since raise won't change the __traceback__ if one exists. (Based on what you said it does.)
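As a point of comparison, the shared-instance scenario can be tried on a current CPython 3 interpreter. The sketch below only shows what that implementation does today and makes no claim about the semantics being proposed in this thread; tb_depth, fail and raise_and_measure are hypothetical helper names:

```python
ERR = Exception("Do not do that")


def tb_depth(tb):
    """Count the entries in a traceback chain."""
    depth = 0
    while tb is not None:
        depth += 1
        tb = tb.tb_next
    return depth


def fail():
    raise ERR  # always raises the same shared instance


def raise_and_measure():
    try:
        fail()
    except Exception as exc:
        return tb_depth(exc.__traceback__)


# Raise the shared instance three times and record the chain length each time.
depths = [raise_and_measure() for _ in range(3)]
print(depths)

# Clearing the traceback explicitly starts the chain afresh.
ERR.with_traceback(None)
depth_after_reset = raise_and_measure()
```

On CPython 3 the recorded depths grow with every raise: the old traceback is neither locked in nor discarded, and new entries are put in front of it. That is the "stays around forever" concern in practice, and it is why a shared instance needs an explicit reset such as with_traceback(None).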
I've added the outer try-finally block to minimize the GC impact of the *original* code you showed, as the `saved_tb` would otherwise have created a cycle. That is, the addition is not because of the porting, it's just something that you should've had to start with.
Like I said, I used code based on os._execvpe. Here's the code

    saved_exc = None
    saved_tb = None
    for dir in PATH:
        fullname = path.join(dir, file)
        try:
            func(fullname, *argrest)
        except error, e:
            tb = sys.exc_info()[2]
            if (e.errno != ENOENT and e.errno != ENOTDIR
                and saved_exc is None):
                saved_exc = e
                saved_tb = tb
    if saved_exc:
        raise error, saved_exc, saved_tb
    raise error, e, tb

I see similar use in atexit._run_exitfuncs, though as Python is about to exit it won't make a real difference. doctest shows code like

    >>> exc_info = failure.exc_info
    >>> raise exc_info[0], exc_info[1], exc_info[2]

SimpleXMLRPCServer does things like

    except:
        # report exception back to server
        exc_type, exc_value, exc_tb = sys.exc_info()
        response = xmlrpclib.dumps(
            xmlrpclib.Fault(1, "%s:%s" % (exc_type, exc_value)),
            encoding=self.encoding, allow_none=self.allow_none,
            )

I see threading.py gets it correctly. My point here is that most Python code which uses the traceback term doesn't break the cycle, so must be caught by the gc. While there might be a more correct way to do it, it's too complicated for most to get it right.
Anyway, the point here is that in 3.x style, most uses of 3-argument raise just disappear altogether. If you hold on to an exception instance, you have to be careful about it for GC, but no more so than in current Python.
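A rough sketch of how the save-and-reraise pattern from os._execvpe might look without 3-argument raise, assuming a current CPython 3 interpreter where the traceback travels with the exception. try_each and fake_exec are hypothetical names, and this is not the actual os module code:

```python
import errno


def try_each(func, candidates):
    """Call func on each candidate; re-raise the most informative failure."""
    saved_exc = None
    last_exc = None
    for candidate in candidates:
        try:
            return func(candidate)
        except OSError as exc:
            last_exc = exc  # 'exc' itself is unbound when the except block ends
            if (exc.errno not in (errno.ENOENT, errno.ENOTDIR)
                    and saved_exc is None):
                saved_exc = exc
    if saved_exc is not None:
        raise saved_exc  # its traceback is already attached to the instance
    if last_exc is None:
        raise ValueError("no candidates given")
    raise last_exc


def fake_exec(path):
    """Stand-in for the real call: only '/b' exists, and it is not executable."""
    if path == "/b":
        raise PermissionError(errno.EACCES, "denied", path)
    raise FileNotFoundError(errno.ENOENT, "missing", path)


try:
    try_each(fake_exec, ["/a", "/b", "/c"])
except OSError as exc:
    reported = type(exc).__name__

print(reported)  # PermissionError
```

No separate saved_tb variable is needed in this form. While the function runs, the saved exception and its frame still reference each other, so the GC caveat mentioned above continues to apply.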
Where people already make a lot of mistakes. But my concern is not in the gc, it's in the mutability of the exception causing hard to track down problems in code which is written by beginning to intermediate users.
The "save one instance and use it forever" use case is new to me - I've never seen nor written code that uses it before now. It's definitely incompatible with 3.x style, though.
I pointed out an example in pyparsing. Thomas W. says he's seen other code. I've been looking for another real example but as this is relatively uncommon code, I don't have a wide enough corpus for the search. I also don't know of a good tool for searching for this sort of thing. (Eg, www.koders.com doesn't help.) It's a low probability occurrence. So is the use of the 3 arg raise. Hence it's hard to get good intuition about problems which might arise.

Andrew
dalke@dalkescientific.com
participants (12)
- Adam Olsen
- Andrew Dalke
- Collin Winter
- glyph@divmod.com
- Greg Ewing
- Guido van Rossum
- James Y Knight
- Jean-Paul Calderone
- Michael Foord
- Michael Foord
- Phillip J. Eby
- Shane Holloway