[Twisted-Python] how to write a safe catch-all
Hi All,

The attached .py file demonstrates a problem I'm having with a little scheduler I need to maintain.

The loop() function is supposed to be a "catch all and report" error handler. However, when the async code initiated by doStuff throws the AttributeError, what actually gets logged is:

2010-09-28 14:41:15,706 ERROR : log (14757|7f691ba676e0): Unhandled Error
Traceback (most recent call last):
  File "test_looping.py", line 41, in <module>
    reactor.run()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1128, in run
    self.mainLoop()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1137, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 757, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "test_looping.py", line 24, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'

2010-09-28 14:41:15,707 ERROR : log (14757|7f691ba676e0): Unhandled scheduled exception
Traceback (most recent call last):
  File "test_looping.py", line 41, in <module>
    reactor.run()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1128, in run
    self.mainLoop()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1137, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "test_looping.py", line 35, in loop
    yield doStuff()
exceptions.GeneratorExit:

But, more crucially, the looping call then appears to stop.

What can I do to get the exception logged and then everything handled sanely, such that the looping call can continue and my loop function will keep getting called once every second rather than stopping?

cheers,

Chris
On 01:48 pm, chris@simplistix.co.uk wrote:
Hi All,
The attached .py file demonstrates a problem I'm having with a little scheduler I need to maintain.
The loop() function is supposed to be a "catch all and report" error handler. However, when the async code initiated by doStuff throws the AttributeError, what actually gets logged is:
2010-09-28 14:41:15,706 ERROR : log (14757|7f691ba676e0): Unhandled Error
  File "test_looping.py", line 24, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'
2010-09-28 14:41:15,707 ERROR : log (14757|7f691ba676e0): Unhandled scheduled exception
  File "test_looping.py", line 35, in loop
    yield doStuff()
exceptions.GeneratorExit:
But, more crucially, the looping call then appears to stop.
The function you're looping over returns a Deferred that never fires. The LoopingCall isn't stopped, it's waiting for the Deferred.
What can I do to get the exception logged and then everything handled sanely such that the looping call can continue and my loop function will keep getting called once every second rather than stopping?
When you do reactor.callLater(n, f), you put f into an execution context where the only thing that will happen to exceptions it raises is that they will be logged. If you want something else to happen to the exception, then you have to arrange for that. You can do this by wrapping f with another function that handles the exception and sends it where you want. Or you can use twisted.internet.task.deferLater, which probably does something pretty close to what you want.

Jean-Paul
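For illustration, a minimal sketch of the wrapping approach described above (the names wrapped and f are assumptions, not taken from the thread's attached script):

from twisted.python import log

def wrapped(f, *args, **kwargs):
    # Run f, but catch and report its exceptions ourselves instead of
    # letting the reactor log and discard them.
    try:
        return f(*args, **kwargs)
    except Exception:
        log.err(None, 'Unhandled scheduled exception')

# then: reactor.callLater(n, wrapped, f) instead of reactor.callLater(n, f)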
On 28/09/2010 15:21, exarkun@twistedmatrix.com wrote:
But, more crucially, the looping call then appears to stop.
The function you're looping over returns a Deferred that never fires. The LoopingCall isn't stopped, it's waiting for the Deferred.
So, inlineCallbacks/generator things will only process errbacks, not actual exceptions raised inside asynchronous code called from them?!
What can I do to get the exception logged and then everything handled sanely such that the looping call can continue and my loop function will keep getting called once every second rather than stopping?
When you do reactor.callLater(n, f), you put f into an execution context where the only thing that will happen to exceptions it raises is that they will be logged.
Okay, the script was a simplification of the "real problem", to give the list a smallest-failing-example to help with. The real logging looks like this:

2010-09-27 15:30:16,340 ERROR : log (24331|7f2e47b4d6e0): Unhandled exception sending schedule transmission
Traceback (most recent call last):
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/selectreactor.py", line 146, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 631, in doConnect
    self.failIfNotConnected(error.getConnectError((err, strerror(err))))
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 588, in failIfNotConnected
    del self.connector
--- <exception caught here> ---
  File "ourcode.py", line 180, in checkSchedule
    yield self.sendTransmissions(...)
exceptions.GeneratorExit:

2010-09-27 15:30:28,428 ERROR : log (24331|7f2e47b4d6e0): Unhandled error in Deferred:
2010-09-27 15:30:28,584 ERROR : log (24331|7f2e47b4d6e0): Unhandled Error
Traceback (most recent call last):
Failure: twisted.protocols.ftp.FTPError: ('Connection Failed', <twisted.python.failure.Failure <class 'twisted.internet.error.ConnectError'>>)

I don't quite follow what the above is trying to tell me, other than that an FTP connection failed. However, I don't understand why that results in a GeneratorExit rather than an errback of the original exception being caught by the top-level handler in the loop() function (switching back to the example terminology for simplicity). I also don't understand why an unhandled deferred is being logged rather than fed back into the handler I built for it!
exception, then you have to arrange for that. You can do this by wrapping f with another function that handles the exception and sends it where you want.
Well, as far as I can tell, that's what I'm trying to do. However, the thing failing in the real world is in code I don't "own" (looks like Twisted's innards...) and I'd like to be able to cater for any failure, unforeseen or not (barring, say, a SyntaxError ;-)) and still have the loop() call keep doing its thing.

How can I do that?

Chris
On 29/09/10 16:56, Chris Withers wrote:
On 28/09/2010 15:21, exarkun@twistedmatrix.com wrote:
But, more crucially, the looping call then appears to stop.
The function you're looping over returns a Deferred that never fires. The LoopingCall isn't stopped, it's waiting for the Deferred.
So, inlineCallbacks/generator things will only process errbacks, not actual exceptions raised inside asynchronous code called from them?!
No. The problem is that your example is malformed. You do this:

1. Create a deferred on the "Break" class instance
2. Return it from doStuff
3. Your "callLater" doesn't touch that deferred, so the generator/doStuff call a) never calls back and b) discards any errors that happen inside callLater.

You want something like this:

class Break:
    def __init__(self):
        self.deferred = Deferred()
    def __call__(self):
        try:
            del self.connector
        except:
            self.deferred.errback()
        else:
            self.deferred.callback(1)
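For illustration, a sketch of how doStuff might then wire this up, keeping the callLater from the attached script (the 2-second delay is an assumption):

from twisted.internet import reactor

def doStuff():
    breaker = Break()
    reactor.callLater(2, breaker)  # __call__ now always fires breaker.deferred
    return breaker.deferred        # so the caller's yield sees either a result or a Failure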
On 29/09/2010 17:16, Phil Mayers wrote:
No.
The problem is that your example is malformed.
Well, it's not, it's the reality of the situation and one I'd like to protect against; "the scheduler must not die" is the rule I need to make work...
You do this:
1. Create a deferred on the "Break" class instance
The "Break" class, in reality, is the bowels of twisted.protocols.ftp.FTPClient, copying from my previous mail: self.failIfNotConnected(error.getConnectError((err, strerror(err)))) File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 588, in failIfNotConnected del self.connector --- <exception caught here> --- File "ourcode.py", line 180, in checkSchedule yield self.sendTransmissions(...) exceptions.GeneratorExit: How can I protect my scheduler against code which doesn't catch an exception when it should? cheers, Chris
On 05:16 pm, chris@simplistix.co.uk wrote:
On 29/09/2010 17:16, Phil Mayers wrote:
No.
The problem is that your example is malformed.
Well, it's not, it's the reality of the situation and one I'd like to protect against; "the scheduler must not die" is the rule I need to make work...
You do this:
1. Create a deferred on the "Break" class instance
The "Break" class, in reality, is the bowels of twisted.protocols.ftp.FTPClient, copying from my previous mail:
    self.failIfNotConnected(error.getConnectError((err, strerror(err))))
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 588, in failIfNotConnected
    del self.connector
--- <exception caught here> ---
  File "ourcode.py", line 180, in checkSchedule
    yield self.sendTransmissions(...)
exceptions.GeneratorExit:
How can I protect my scheduler against code which doesn't catch an exception when it should?
Then you're talking about an API in Twisted which returns a Deferred that sometimes doesn't errback when the implementation encounters an error. Also, `failIfNotConnected` should never raise an exception. These sound like bugs. File a couple of tickets. With unit tests, please. :)

Jean-Paul
On 29/09/2010 18:31, exarkun@twistedmatrix.com wrote:
Then you're talking about an API in Twisted which returns a Deferred that sometimes doesn't errback when the implementation encounters an error.
Also, `failIfNotConnected` should never raise an exception.
These sound like bugs.
File a couple tickets. With a unit tests please. :)
That's one side of things, sure, but how can I write a scheduler which handles the current situation?

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
On 08:27 pm, chris@simplistix.co.uk wrote:
On 29/09/2010 18:31, exarkun@twistedmatrix.com wrote:
Then you're talking about an API in Twisted which returns a Deferred that sometimes doesn't errback when the implementation encounters an error.
Also, `failIfNotConnected` should never raise an exception.
These sound like bugs.
File a couple tickets. With a unit tests please. :)
That's one side of things, sure, but how can I write a scheduler which handles the current situation?
Beats me. I don't know exactly what bug you're hitting or how it's triggered, so I can't suggest a fix. A unit test would clarify the situation.

Jean-Paul
On 29/09/2010 22:06, exarkun@twistedmatrix.com wrote:
That's one side of things, sure, but how can I write a scheduler which handles the current situation?
Beats me.
So, if some buggy code that should be doing a deferred callback/errback instead raises an exception, you're basically screwed?

There's really no way to write a "safety belt" handler that will record the problem but then keep going?!

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
On Sep 29, 2010, at 6:56 PM, Chris Withers wrote:
On 29/09/2010 22:06, exarkun@twistedmatrix.com wrote:
That's one side of things, sure, but how can I write a scheduler which handles the current situation?
Beats me.
So, if some buggy code that should be doing a deferred callback/errback instead raises an exception, you're basically screwed?
There's really no way to write a "safety belt" handler that will record the problem but then keep going?!
Chris
There are so many ways.

You can add a log observer that handles isError=True log messages.

You can write a 'safetyBelt' function that catches the exception and does something with it.

You can always invoke your code with maybeDeferred, which will turn exceptions into failures for you.

You can use inlineCallbacks, where in 'x = yield y()', y() raising an exception and y() returning a failed Deferred are basically indistinguishable to the caller.

And there are hundreds of subtle variations on each of these approaches. Why would you say there's no way?
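For illustration, one possible reading of the 'safetyBelt' idea above, combined with maybeDeferred (the function name comes from the message; the body is an assumption):

from twisted.internet import defer
from twisted.python import log

def safetyBelt(f, *args, **kwargs):
    # maybeDeferred turns a raised exception into a failed Deferred, so a
    # synchronous raise and a failing Deferred land in the same errback.
    d = defer.maybeDeferred(f, *args, **kwargs)
    d.addErrback(log.err, 'Unhandled scheduled exception')
    return d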
On 30/09/2010 00:31, Glyph Lefkowitz wrote:
There are so many ways.
Yes, that's what I thought, which is why I was confused to keep on getting "fix the code" responses when I'd already pointed out it wasn't my code to fix, and I'd like to build a solution that caters even for this eventuality...
You can add a log observer that handles isError=True log messages.
Okay, and this would mean that my `loop` function from the example I posted would keep getting called by LoopingCall?
You can write a 'safetyBelt' function that catches the exception and does something with it.
From my original example, I'm pretty sure that's what my `loop` function is; I'm confused as to why it catches a GeneratorExit when the AttributeError is indirectly raised in the `doStuff` function, and more confused that even though the GeneratorExit has been caught, the calls to `loop` from LoopingCall cease...
You can always invoke your code with maybeDeferred, which will turn exceptions into failures for you.
In "the real app", we have another layer or two between doStuff and Break, one of which does call down with maybeDeferred, doesn't make any difference...
You can use inlineCallbacks, where in 'x = yield y()', y() raising an exception and y() returning a failed Deferred are basically indistinguishable to the caller.
Yes, as you can see, this is what `loop` does. In "the real app", doStuff is actually @inlineCallbacks as well; the level below that calls with maybeDeferred, and so yes, doing this, same result...
Why would you say there's no way?
Because I haven't found any permutation that doesn't result in the LoopingCall's calls to `loop` stopping after the exception. I would be more than ecstatic to be proved wrong ;-)

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
On 12:52 am, chris@simplistix.co.uk wrote:
Because I haven't found any permutation that doesn't result in the LoopingCall's calls to `loop` stopping after the exception.
I would be more than ecstatic to be proved wrong ;-)
You keep saying the LoopingCall stops. It does not stop. It is waiting for a result. If you provide it with a result, it will proceed. Glyph suggested a number of (hackfully) mechanisms you might use to provide it with a result. I suggested another way (fix the broken code, or at least supply enough information for someone else to fix it).

Jean-Paul
On 30/09/2010 04:45, exarkun@twistedmatrix.com wrote:
On 12:52 am, chris@simplistix.co.uk wrote:
Because I haven't found any permutation that doesn't result in the LoopingCall's calls to `loop` stopping after the exception.
I would be more than ecstatic to be proved wrong ;-)
You keep saying the LoopingCall stops. It does not stop. It is waiting for a result.
What result is it waiting for and how do I pass that from the `loop` function in my first example when the exception is caught?
If you provide it with a result, it will proceed.
See above: how do I do that?
Glyph suggested a number of (hackfully) mechanisms you might use to provide it with a result.
Why do you say hackfully? I'm trying to build a scheduler that is resilient to failure in a scheduled task. Yes, of course I want to fix that failure, but I don't want a failure in one scheduled task to prevent any further scheduled tasks from being executed...
I suggested another way (fix the broken code, or at least supply enough information for someone else to fix it).
For all I know, it's already fixed in a newer version of Twisted (unfortunately, I can't move this project to a newer version of Twisted), but once I have a robust scheduler, I'll certainly do my best to report the underlying failure to you guys properly...

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
On 30/09/10 11:48, Chris Withers wrote:
On 30/09/2010 04:45, exarkun@twistedmatrix.com wrote:
On 12:52 am, chris@simplistix.co.uk wrote:
Because I haven't found any permutation that doesn't result in the LoopingCall's calls to `loop` stopping after the exception.
I would be more than ecstatic to be proved wrong ;-)
You keep saying the LoopingCall stops. It does not stop. It is waiting for a result.
What result is it waiting for and how do I pass that from the `loop` function in my first example when the exception is caught?
It is waiting for the deferred you return to callback/errback. This, I found, was initially non-obvious. I assumed that LoopingCall would call the function every N seconds regardless, but it won't. If you return a deferred, the LoopingCall will stop until the deferred callbacks (or errbacks), then it will be restarted. This is very useful in some circumstances.
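A small self-contained demonstration of that pausing behaviour (a sketch in the Python 2 style of the thread; the one-second interval and five-second delay are arbitrary):

from twisted.internet import defer, reactor, task

def poll():
    print 'polled'
    d = defer.Deferred()
    reactor.callLater(5, d.callback, None)  # fire the Deferred five seconds later
    return d  # the LoopingCall now waits, so calls land ~5s apart, not 1s

task.LoopingCall(poll).start(1.0)
reactor.run()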
If you provide it with a result, it will proceed.
See above: how do I do that?
Glyph suggested a number of (hackfully) mechanisms you might use to provide it with a result.
Why do you say hackfully? I'm trying to build a scheduler that is
Because this is (apparently) a bug in the underlying Twisted code, so anything you do to fix that is a hack. The quickest way I can describe it is this: as I understand it, any Twisted function which returns a deferred should never raise an exception. The exception should instead flow down the errback. This is because (as you've discovered) there are circumstances when a function is called in Twisted where the exception can't propagate "up" to the caller, because the caller is no longer higher in the call stack. Most prominent is reactor.callLater.
resilient to failure in a scheduled task. Yes, of course I want to fix that failure, but I don't want a failure in one scheduled task to prevent any further scheduled tasks from being executed...
I suggested another way (fix the broken code, or at least supply enough information for someone else to fix it).
For all I know, it's already fixed in a newer version of Twisted (unfortunately, I can't move this project to a newer version of Twisted) but once I have a robust scheduler, I'll certainly do my best to report the underlying failure to you guys properly...
If you can produce a short example that uses the real Twisted API which fails, people might a) be able to find/fix the bug and b) suggest a less hacky way of ensuring you see the error.
On 30/09/10 13:44, Phil Mayers wrote:
This is because (as you've discovered) there are circumstances when a function is called in Twisted where the exception can't propagate "up" to the caller, because the caller is no longer higher in the call stack. Most prominent is reactor.callLater.
One quick thing you might be able to do is use t.i.task.deferLater instead of reactor.callLater, like so:

def doStuff():
    breakobj = Break()
    d = task.deferLater(reactor, 2, breakobj)
    return d

task.deferLater will schedule the call, but runs the call in a context which means any exceptions raised flow down the deferred errback. This:

@inlineCallbacks
def loop():
    try:
        yield doStuff()
    except Exception, e:
        log.err(e)

...will then work, and "see" the exception.
On 30/09/2010 14:03, Phil Mayers wrote:
On 30/09/10 13:44, Phil Mayers wrote:
This is because (as you've discovered) there are circumstances when a function is called in Twisted where the exception can't propagate "up" to the caller, because the caller is no longer higher in the call stack.
Indeed, but, as I keep saying, I need to build a scheduler that's resilient to errors in the tasks it's running ;-)
Most prominent is reactor.callLater.
...which I'm not actually using; it was just a small-as-possible way I could simulate the failure mode (rather than the specific failure) I need to protect against.
@inlineCallbacks
def loop():
    try:
        yield doStuff()
    except Exception, e:
        log.err(e)
...will then work, and "see" the exception.
Actually, what appears to work is simply changing `loop` to not be an async function:

def loop():
    try:
        doStuff()
    except Exception, e:
        log.err(None, 'Unhandled scheduled exception')

looper = task.LoopingCall(loop)
looper.start(1.0)
reactor.run()

This appears to solve most of the problems:

- the schedule now keeps running regardless
- exceptions in doStuff and below get logged

However, it now has the problem that if doStuff *does* return a deferred and it errbacks, then I get the ugly:

2010-09-30 14:31:12,478 ERROR : log (22159|7fc6c5c356e0): Unhandled error in Deferred:
2010-09-30 14:31:12,479 ERROR : log (22159|7fc6c5c356e0): Unhandled Error
Traceback (most recent call last):
Failure: twisted.python.failure.DefaultException: Break!

...rather than having my exception handler log with a meaningful message. I've attached the full example as I have it again...

Is there any way I can get both errbacks *and* exceptions handled nicely in my `loop` call?

cheers,

Chris
On 30/09/2010 14:39, Jonathan Lange wrote:
On Thu, Sep 30, 2010 at 2:36 PM, Chris Withers <chris@simplistix.co.uk> wrote: ...
Is there any way I can get both errbacks *and* exceptions handled nicely in my `loop` call?
You know about defer.maybeDeferred, right?
Yep, the problem is with `loop` implemented like so:

def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))

...the logging is odd:

2010-09-30 15:07:03,161 ERROR : log (22194|7f41910b26e0): Unhandled Error
Traceback (most recent call last):
  File "test_looping.py", line 47, in <module>
    reactor.run()
  File "/twisted/internet/base.py", line 1166, in run
    self.mainLoop()
  File "/twisted/internet/base.py", line 1175, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/twisted/internet/base.py", line 779, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "test_looping.py", line 30, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'

called 4
called 5

/twisted/internet/defer.py:262: DeprecationWarning: Don't pass strings (like 'Break!') to failure.Failure (replacing with a DefaultException).
  fail = failure.Failure(fail)
2010-09-30 15:07:05,167 ERROR : log (22194|7f41910b26e0): Unhandled scheduled exception
Traceback (most recent call last):
Failure: twisted.python.failure.DefaultException: Break!

So, how come my log.err doesn't get used for the AttributeError on connector?

cheers,

Chris
On 30/09/10 15:13, Chris Withers wrote:
On 30/09/2010 14:39, Jonathan Lange wrote:
On Thu, Sep 30, 2010 at 2:36 PM, Chris Withers <chris@simplistix.co.uk> wrote: ...
Is there any way I can get both errbacks *and* exceptions handled nicely in my `loop` call?
You know about defer.maybeDeferred, right?
Yep, the problem is with `loop` implemented like so:
def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))
So, how come my log.err doesn't get used for the AttributeError on connector?
If you mean in your most recent "test_looping.py" example, it still uses reactor.callLater. The call stack has gone away by the time the call is made, so the exception just propagates up to the top level, where it's logged. I honestly think a more complete example showing the real Twisted API that's causing you unhandled errors would help here.
On 30/09/2010 15:23, Phil Mayers wrote:
def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))
So, how come my log.err doesn't get used for the AttributeError on connector?
If you mean in your most recent "test_looping.py" example, it still uses reactor.callLater. The call stack has gone away by the time the call is made, so the exception just propagates up to the top level, where it's logged.
So, I appear to be back to the case where I can either gracefully handle the exception *or* gracefully handle the errback, but not both?
I honestly think a more complete example showing the real Twisted API that's causing you unhandled errors would help here.
Unfortunately, you neither want to see, nor am I allowed to publish to a mailing list, the several hundred lines of proprietary code I'd need to post...

As far as the original error goes, we hit the problem using twisted.protocols.ftp.FTPClient to ftp a file up to a remote host. That remote host only accepts active ftp transfers. As a result of either or both of our firewall and FTPClient not handling active ftp and only handling passive ftp, we end up seeing the following logging as the scheduler dies:

2010-09-27 15:30:16,340 ERROR : log (24331|7f2e47b4d6e0): Unhandled exception sending schedule transmission
Traceback (most recent call last):
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/selectreactor.py", line 146, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 631, in doConnect
    self.failIfNotConnected(error.getConnectError((err, strerror(err))))
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/tcp.py", line 588, in failIfNotConnected
    del self.connector
--- <exception caught here> ---
  File "ourcode.py", line 180, in checkSchedule
    yield self.sendTransmissions(...)
exceptions.GeneratorExit:

2010-09-27 15:30:28,428 ERROR : log (24331|7f2e47b4d6e0): Unhandled error in Deferred:
2010-09-27 15:30:28,584 ERROR : log (24331|7f2e47b4d6e0): Unhandled Error
Traceback (most recent call last):
Failure: twisted.protocols.ftp.FTPError: ('Connection Failed', <twisted.python.failure.Failure <class 'twisted.internet.error.ConnectError'>>)

I can't quite make sense of the above, which is why I distilled it down to as-small-as-possible a piece of code that shows the type of exceptions and errbacks I need to deal with...

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
On 02:35 pm, chris@simplistix.co.uk wrote:
On 30/09/2010 15:23, Phil Mayers wrote:
def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))
So, how come my log.err doesn't get used for the AttributeError on connector?
If you mean in your most recent "test_looping.py" example, it still uses reactor.callLater. The call stack has gone away by the time the call is made, so the exception just propagates up to the top level, where it's logged.
So, I appear to be back to the case where I can either gracefully handle the exception *or* gracefully handle the errback, but not both?
No. Absolutely, categorically wrong.

Jean-Paul
On 30/09/10 15:35, Chris Withers wrote:
On 30/09/2010 15:23, Phil Mayers wrote:
def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))
So, how come my log.err doesn't get used for the AttributeError on connector?
If you mean in your most recent "test_looping.py" example, it still uses reactor.callLater. The call stack has gone away by the time the call is made, so the exception just propagates up to the top level, where it's logged.
So, I appear to be back to the case where I can either gracefully handle the exception *or* gracefully handle the errback, but not both?
It should be possible, if using Twisted APIs correctly, to reliably capture errors. If it is not, that is a Twisted bug (or possibly an API limitation).

However, your "test_looping.py" is not using the API correctly; the reactor.callLater throws away the call stack, so the error has nowhere to go. Since you're not using reactor.callLater in your real code, the example is not valid.

Hopefully I'm being more clear here!
I honestly think a more complete example showing the real Twisted API that's causing you unhandled errors would help here.
Unfortunately, you neither want to see, nor am I allowed to publish to a mailing list, that several hundred lines of proprietary code I'd need to post...
I suspected as much. I don't think I can help, I'm afraid; maybe other, more experienced people can spot the error from the tracebacks. Sorry I couldn't be more useful.
On 30/09/2010 15:53, Phil Mayers wrote:
So, I appear to be back to the case where I can either gracefully handle the exception *or* gracefully handle the errback, but not both?
It should be possible, if using Twisted APIs correctly, to reliably capture errors.
If it is not, that is a Twisted bug (or possibly an API limitation).
However, your "test_looping.py" is not using the API correctly; the reactor.callLater throws away the call stack, so the error has nowhere to go.
Right, but that's what appears to be happening with the "real code", and I get the mysterious GeneratorExit...
Since you're not using reactor.callLater in your real code, the example is not valid. Hopefully I'm being more clear here!
Yes, but I do feel a bit like I'm banging my head against a wall too. The effect is the same as the real live observed problem, only I can't replicate the real problem without an ftp server that only happens to accept active connections. I don't have one of those around, and *even* if I did, I want to make the scheduler bulletproof, not *just* go and fix the one error that has cropped up so far...

cheers,

Chris
On 30/09/10 16:10, Chris Withers wrote:
On 30/09/2010 15:53, Phil Mayers wrote:
So, I appear to be back to the case where I can either gracefully handle the exception *or* gracefully handle the errback, but not both?
It should be possible, if using Twisted APIs correctly, to reliably capture errors.
If it is not, that is a Twisted bug (or possibly an API limitation).
However, your "test_looping.py" is not using the API correctly; the reactor.callLater throws away the call stack, so the error has nowhere to go.
Right, but that's what appears to be happening with the "real code", and I get the mysterious GeneratorExit...
It is more than a little confusing, and I'm sure frustrating. I've tried to produce something like this locally, but cannot. I suspect the various @inlineCallbacks generators are the cause of the specific issue.

Let me see if I understand the problem in full.

You have an @inlineCallbacks-decorated generator which is the target of a LoopingCall, like so:

@inlineCallbacks
def loop():
    try:
        somework()
    except:
        log.err()

lc = task.LoopingCall(loop)

You want this loop function to catch & log all exceptions resulting from work it initiates.

Your "somework" function calls, amongst other things, an @inlineCallbacks-decorated worker function:

@inlineCallbacks
def sendTransmission(...):
    try:
        yield maybeDeferred(feed.initiateTransmission)
    except:
        ...some handling

You are seeing two errors:

1. A GeneratorExit exception. This appears (if I'm reading your logging right) to be caught by your logging?

2. A ConnectionLost exception. This is not caught by your logging, and is propagating up to the reactor, and giving "Unhandled Error"

Is this an accurate summary?
On 30/09/2010 18:01, Phil Mayers wrote:
It is more than a little confusing, and I'm sure frustrating.
I've tried to produce something like this locally, but cannot.
The very first message in this thread (28th Sept, 14:48) and my email of 30th Sept, 14:36 both had scripts attached which do exactly this...
Let me see if I understand the problem in full.
You have an @inlineCallbacks-decorated generator which is the target of a LoopingCall, like so:
@inlineCallbacks
def loop():
    try:
        somework()
    except:
        log.err()
lc = task.LoopingCall(loop)
Almost:

@inlineCallbacks
def loop(self):
    # Check for files to send on schedule
    yield self.checkSchedule()

@inlineCallbacks
def checkSchedule(self):
    try:
        yield somework()
    except Exception:
        log.err(None, 'Unhandled exception ...')

...although I'm currently changing the loop function to be:

def loop(self):
    self.checkSchedule()

...as this appears to give me what I want, until something proves otherwise...
You want this loop function to catch & log all exceptions resulting from work it initiates.
Yep, errbacks, internal twisted bugs, whatever. As long as they're logged by something, preferably the try-except above, I don't mind. What absolutely cannot ever happen is for the scheduler to die ;-)
Your "somework" function calls, amongst other things, an @inlineCallbacks-decorated worker function:
@inlineCallbacks
def sendTransmission(...):
    try:
        yield maybeDeferred(feed.initiateTransmission)
    except:
        ...some handling
Yep.
You are seeing two errors:
1. A GeneratorExit exception. This appears (if I'm reading your logging right) to be caught by your logging?
Yep.
2. A ConnectionLost exception. This is not caught by your logging, and is propagating up to the reactor, and giving "Unhandled Error"
Correct.
On 01/10/10 14:19, Chris Withers wrote:
On 30/09/2010 18:01, Phil Mayers wrote:
It is more than a little confusing, and I'm sure frustrating.
I've tried to produce something like this locally, but cannot.
The very first message in this thread (28th Sept, 14:48) and my email of 30th Sept, 14:36 both had scripts attached which do exactly this...
If your examples scripts genuinely reflect the problem (a bit of code, outside your control, is raising an exception in a context where the error doesn't flow back to user code) then Glyph is (of course!) right - you're not going to be able to capture and log the error "nicely". It will be propagated upwards until it hits the reactor.
You want this loop function to catch& log all exceptions resulting from work it initiates.
Yep, errbacks, internal twisted bugs, whatever. As long as they're logged by something, preferabyl the try-except above, I don't mind. What absolutely must not cannot ever happen is for the scheduler to die ;-)
Now *this* you should be able to do. A LoopingCall will only die (i.e. stop permanently) if you:

1. Raise an exception
2. Return a deferred which errbacks

(The former gets turned into the latter.)

If you ensure your loop never ever does this, it'll run forever.

HOWEVER: as JP has mentioned, a LoopingCall *will* pause if you return a deferred, until that deferred callbacks (or errbacks, then it dies).

So this:

def loop():
    try:
        somework()
    except:
        pass

...should never ever die (because it can't raise an exception) or stop (because it doesn't return a deferred).
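Putting those two rules together with maybeDeferred gives a loop that also logs failures instead of silently discarding them; a sketch, assuming a somework that may either raise or return a failing Deferred:

from twisted.internet import defer
from twisted.python import log

def loop():
    # maybeDeferred turns a raised exception into a failed Deferred, and
    # the errback logs it. Deliberately returning None rather than d keeps
    # the LoopingCall on its fixed cadence instead of waiting on the Deferred.
    d = defer.maybeDeferred(somework)
    d.addErrback(log.err, 'Unhandled scheduled exception')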
On 02:13 pm, chris@simplistix.co.uk wrote:
On 30/09/2010 14:39, Jonathan Lange wrote:
On Thu, Sep 30, 2010 at 2:36 PM, Chris Withers<chris@simplistix.co.uk> wrote: ...
Is there any way I can get both errbacks *and* exceptions handled nicely in my `loop` call?
You know about defer.maybeDeferred, right?
Yep, the problem is with `loop` implemented like so:
def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(partial(log.err, _why='Unhandled scheduled exception'))
Names that begin with an underscore are private. Also, partial is a pointless complexification here.

def loop():
    d = maybeDeferred(doStuff)
    d.addErrback(log.err, 'Unhandled scheduled exception')
...the logging is odd:
2010-09-30 15:07:03,161 ERROR : log (22194|7f41910b26e0): Unhandled Error
Traceback (most recent call last):
  File "test_looping.py", line 47, in <module>
    reactor.run()
  File "/twisted/internet/base.py", line 1166, in run
    self.mainLoop()
  File "/twisted/internet/base.py", line 1175, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/twisted/internet/base.py", line 779, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "test_looping.py", line 30, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'
This is not logged by your code. Do you recognize that?
called 4
called 5

/twisted/internet/defer.py:262: DeprecationWarning: Don't pass strings (like 'Break!') to failure.Failure (replacing with a DefaultException).
  fail = failure.Failure(fail)
2010-09-30 15:07:05,167 ERROR : log (22194|7f41910b26e0): Unhandled scheduled exception
Traceback (most recent call last):
Failure: twisted.python.failure.DefaultException: Break!
This comes from some code not included in the code you posted. It looks like you're using Failure wrong though. Quoting from above: Don't pass strings (like 'Break!') to failure.Failure.
So, how come my log.err doesn't get used for the AttributeError on connector?
I'm sure it doesn't help for me to simply repeat myself, since if it didn't make sense the first time, it probably won't make sense the third or fourth time. I don't know what else to say, though.

log.err is an errback on the Deferred you created in loop. An errback is called (roughly) when a Deferred fires with a Failure. Your Deferred *never* fires with a Failure corresponding to that AttributeError.

This is the most important thing. If you don't understand this, say so and we can talk about it some more. Everything else is just confusing particulars.

Unfortunately maybeDeferred cannot help here. It solves a slightly related problem and does nothing about this one. You may well want to use it, but it's not going to address your `AttributeError` issue in any way, because of what I said above, and will repeat here for emphasis: the AttributeError never becomes an error result on any Deferred. It is caught inside the reactor implementation, logged there by Twisted itself, and then thrown away forever.

Jean-Paul
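The log-observer route Glyph mentioned is the one hook that does see errors like this, since the reactor logs them before discarding them. A minimal sketch, where recordProblem is a hypothetical application hook:

from twisted.python import log

def reportErrors(event):
    # Called for every log event; isError marks failures, including ones
    # the reactor caught and threw away after logging them.
    if event.get('isError') and 'failure' in event:
        recordProblem(event['failure'])  # hypothetical application hook

log.addObserver(reportErrors)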
On 30/09/2010 15:33, exarkun@twistedmatrix.com wrote:
2010-09-30 15:07:03,161 ERROR : log (22194|7f41910b26e0): Unhandled Error
Traceback (most recent call last):
  File "test_looping.py", line 47, in <module>
    reactor.run()
  File "/twisted/internet/base.py", line 1166, in run
    self.mainLoop()
  File "/twisted/internet/base.py", line 1175, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "/twisted/internet/base.py", line 779, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "test_looping.py", line 30, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'
This is not logged by your code. Do you recognize that?
Yes, this is what I'm complaining about ;-)
/twisted/internet/defer.py:262: DeprecationWarning: Don't pass strings (like 'Break!') to failure.Failure (replacing with a DefaultException).
  fail = failure.Failure(fail)
2010-09-30 15:07:05,167 ERROR : log (22194|7f41910b26e0): Unhandled scheduled exception
Traceback (most recent call last):
Failure: twisted.python.failure.DefaultException: Break!
This comes from some code not included in the code you posted.
Sure it is, it was attached to the message I sent at 14:36.
It looks like you're using Failure wrong though.
Sure, but that's hardly the issue at hand here... What should I be passing to errback?
So, how come my log.err doesn't get used for the AttributeError on connector?
Your Deferred *never* fires with a Failure corresponding to that AttributeError. This is the most important thing. If you don't understand this, say so and we can talk about it some more. Everything else is just confusing particulars.
Yes, I understand this, and this is what I'm talking about when I say "I cannot gracefully handle the exception." Reading back, yes, it appears I was mistaken at some stage that my `loop` function was handling the exception, but I do understand now that it was not... which is frustrating...
The AttributeError never becomes an error result on any Deferred. It is caught inside the reactor implementation, logged there by Twisted itself, and then thrown away forever.
:-(

Chris
On 30/09/10 14:36, Chris Withers wrote:
Indeed, but, as I keep saying, I need to build a scheduler that's resilient to errors in the tasks it's running ;-)
Sure.
Most prominent is reactor.callLater.
...which I'm not actually using; it was just a small-as-possible way I could simulate the failure mode (rather than the specific failure) I need to protect against.
Ah.
Actually, what appears to work is simply changing `loop` to not be an async function:
def loop():
    try:
        doStuff()
    except Exception, e:
        log.err(None, 'Unhandled scheduled exception')
looper = task.LoopingCall(loop)
looper.start(1.0)
reactor.run()
This appears to solve most of the problems:

- the schedule now keeps running regardless
- exceptions in doStuff and below get logged
However, it now has the problem that if doStuff *does* return a deferred and it errbacks, then I get the ugly:
Well, you could do this, which is more Twist-y and which Glyph suggested:

def loop():
    d = defer.maybeDeferred(doStuff, *args)
    d.addErrback(log.err)

This will call doStuff, converting return/exception into callback/errback, or if the call returns a deferred itself, just return that.

FWIW, you say "if doStuff does return a deferred"; I presume you don't really have a single call returning both normal and deferred values, and it's actually several different function calls to different pieces of work.
Is there any way I can get both errbacks *and* exceptions handled nicely in my `loop` call?
I think you want maybeDeferred, as above.
On 30/09/2010 14:46, Phil Mayers wrote:
On 30/09/10 14:36, Chris Withers wrote:
Indeed, but, as I keep saying, I need to build a scheduler that's resilient to errors in the tasks it's running ;-)
Sure.
Most prominent is reactor.callLater.
...which I'm not actually using; it was just a small-as-possible way I could simulate the failure mode (rather than the specific failure) I need to protect against.
Ah.
Actually, what appears to work is simply changing `loop` to not be an async function:
def loop():
    try:
        doStuff()
    except Exception, e:
        log.err(None, 'Unhandled scheduled exception')
looper = task.LoopingCall(loop)
looper.start(1.0)
reactor.run()
This appears to solve most of the problems:

- the schedule now keeps running regardless
- exceptions in doStuff and below get logged
However, it now has the problem that if doStuff *does* return a deferred and it errbacks, then I get the ugly:
Well, you could do this, which is more Twist-y, and Glyph suggested:
def loop():
    d = defer.maybeDeferred(doStuff, *args)
    d.addErrback(log.err)
Okay, but further down in the call stack of the "real app", I already have:

@inlineCallbacks
def sendTransmission(self...):
    ...
    try:
        yield maybeDeferred(feed.initiateTransmission, ...)
    except Exception, ex:
        if isinstance(ex, GeneratorExit):
            raise
        ...

Is that not doing what you're talking about? If it is, it didn't help...
FWIW, you say "if doStuff does return a deferred"; I presume you don't really have a single call returning both normal and deferred values,
Correct, doStuff *should* always return a deferred; the problem is that the deferred returned never fires (errback or callback) when an exception is raised in the code that should be firing it.

Chris
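One defensive option for exactly this case (a sketch, not from the thread's code) is to put a deadline on any Deferred being waited on, so that a never-firing one still errbacks; the helper name and timeout value here are assumptions:

from twisted.internet import defer, reactor

def withDeadline(d, timeout):
    # If nothing fires d within `timeout` seconds, errback it ourselves so
    # a caller blocked on it (e.g. via inlineCallbacks) cannot hang forever.
    call = reactor.callLater(timeout, d.errback,
                             defer.TimeoutError('no result after %ss' % timeout))
    def cancelDeadline(result):
        if call.active():
            call.cancel()
        return result
    d.addBoth(cancelDeadline)
    return d

If the buggy code does eventually fire the Deferred after the deadline, it will hit an AlreadyCalledError, but by then the scheduler has moved on.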
Chris,

My original reply jumped into the middle of the thread and I didn't completely understand what you were asking. Let's get back to your original message:

On Sep 28, 2010, at 9:48 AM, Chris Withers wrote:
The loop() function is supposed to be a "catch all and report" error handler. However, when the async code initiated by doStuff throws the AttributeError, what actually gets logged is:
2010-09-28 14:41:15,706 ERROR : log (14757|7f691ba676e0): Unhandled Error
Traceback (most recent call last):
  File "test_looping.py", line 41, in <module>
    reactor.run()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1128, in run
    self.mainLoop()
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 1137, in mainLoop
    self.runUntilCurrent()
--- <exception caught here> ---
  File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 757, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "test_looping.py", line 24, in __call__
    del self.connector
exceptions.AttributeError: Break instance has no attribute 'connector'
This traceback indicates a bug in Twisted. This is an error that other people have very occasionally spotted, and we have never been able to reliably reproduce it. Over the years we have tried to diagnose it in various ways and we have always failed.

It would be really valuable to everyone if you could write up a bug and provide detailed reproduction instructions, ideally with some python code that triggers this error, so that we can address the bug. It would be super useful if you could write an example that reproduces the same bug on a recent Twisted version (8.2 is pretty old), especially the 10.2 prerelease. But, if your example reproduces on 8.2 and not 10.0, that tells us something too. A unit test would be best, but any complete script which reproduces the problem reliably would be extremely useful; we can transform it into a unit test ourselves.

To reiterate, as this pertains to your problem: there is no way for your application code to fix this exception, because it is happening in the guts of the reactor and entirely out of your control. It is going to screw things up in some pretty arbitrary ways. The only way to deal with it would be to figure out why it happens and come up with a proper fix for the bug. And we can't do that without a way to reproduce the bug.

Thanks for your help,

-glyph
Hi Glyph,

On 30/09/2010 19:52, Glyph Lefkowitz wrote:
File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 757, in runUntilCurrent call.func(*call.args, **call.kw) File "test_looping.py", line 24, in __call__ del self.connector exceptions.AttributeError: Break instance has no attribute 'connector'
This traceback indicates a bug in Twisted.
This is an error that other people have very occasionally spotted, and we have never been able to reliably reproduce it.
While I'm able to reproduce this, I can't really, as it involves a customer's ftp server also generating error messages, and they've already had enough of them ;-)
Over the years we have tried to diagnose it in various ways and we have always failed.
It would be really valuable to everyone if you could write up a bug and
Is there a bug already tracking this?
provide detailed reproduction instructions, ideally with some python code that triggers this error, so that we can address the bug. It would be super useful if you could write an example that reproduces the same bug on a recent Twisted version (8.2 is pretty old), especially the 10.2 prerelease. But, if your example reproduces on 8.2 and not 10.0, that tells us something too.
Well, unfortunately I can't do any of the above :-(

I've only seen this, as I said previously, when we use twisted.protocols.ftp.FTPClient to try to send a file to an ftp server that was only able to handle active ftp transfers when our setup only allowed passive ftp transfers.

It was happening in a steady, reproducible fashion (i.e. every time we tried), but unfortunately we've had to bail and now send via sftp using a pexpect-based wrapper around the real sftp command client (we had buggy behaviour with twisted's sftp client elsewhere).

I hope this helps, do let me know if I should add it to an issue somewhere...

cheers,

Chris
On 01:30 pm, chris@simplistix.co.uk wrote:
Hi Glyph,
On 30/09/2010 19:52, Glyph Lefkowitz wrote:
File "Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/internet/base.py", line 757, in runUntilCurrent call.func(*call.args, **call.kw) File "test_looping.py", line 24, in __call__ del self.connector exceptions.AttributeError: Break instance has no attribute 'connector'
This traceback indicates a bug in Twisted.
This is an error that other people have very occasionally spotted, and we have never been able to reliably reproduce it.
While I'm able to reproduce this, I can't really, as it involves a customer's ftp server also generating error messages, and they've already had enough of them ;-)
Over the years we have tried to diagnose it in various ways and we have always failed.
It would be really valuable to everyone if you could write up a bug and
Is there a bug already tracking this?
provide detailed reproduction instructions, ideally with some python code that triggers this error, so that we can address the bug. It would be super useful if you could write an example that reproduces the same bug on a recent Twisted version (8.2 is pretty old), especially the 10.2 prerelease. But, if your example reproduces on 8.2 and not 10.0, that tells us something too.
Well, unfortunately I can't do any of the above :-( I've only seen this, as I said previously, when we use twisted.protocols.ftp.FTPClient to try to send a file to an ftp server that was only able to handle active ftp transfers when our setup only allowed passive ftp transfers.
It was happening in a steady, reproducible fashion (i.e. every time we tried), but unfortunately we've had to bail and now send via sftp using a pexpect-based wrapper around the real sftp command client (we had buggy behaviour with twisted's sftp client elsewhere).
I hope this helps, do let me know if I should add it to an issue somewhere...
Can you at least provide a traffic capture recording an instance of this happening?

Jean-Paul
On 01/10/2010 15:00, exarkun@twistedmatrix.com wrote:
I hope this helps, do let me know if I should add it to an issue somewhere...
Can you at least provide a traffic capture recording an instance of this happening?
I'm afraid not, I didn't have the foresight to do this at the time, and as I've said, the customer has had enough of us causing their ftp server to generate error messages :-S

Chris
participants (5)

- Chris Withers
- exarkun@twistedmatrix.com
- Glyph Lefkowitz
- Jonathan Lange
- Phil Mayers