[Twisted-Python] Failure is O(state) vs Exception
I'm doing some performance testing of one of our Twisted applications. And what I came across was a surprising amount of time being spent in twisted.python.failure.Failure.__getstate__. We're doing a fair amount with exceptions, and under cProfile I found this:

    29      0  0.4081  0.2940   twisted.python.failure:416(__getstate__)
    +34492  0  0.1141  0.0235  +twisted.python.reflect:557(safe_repr)

So under profiling, we spent 408ms in __getstate__. I then changed Failure.cleanFailure to just 'return', and I saw a real-world improvement of ~480ms down to about 240ms. I then restored cleanFailure, but changed Failure.__init__ to always set 'tb=None' before it does anything. And in that case the time went down to 180ms. (And when I dug into it, 150ms of that is time spent waiting for an XMLRPC response.)

I'm wondering if there is a tasteful way to disable Traceback processing in a production machine. I realize you never know when you are going to need the state in order to figure out what went wrong. But it is costing 2-5x runtime speed. (The other answer is to write all of our code to avoid Deferred.addErrback...)

In our case, there are many exceptions that are used as 'signaling' rather than indicating a real failure. I suppose one option would be a whitelist/blacklist that would indicate whether a given exception class is worthy of a traceback.

Thoughts?

John =:->
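[Editor's note: the cost described above comes from Failure snapshotting every frame of the traceback, locals included. The mechanism can be sketched in plain Python; this is an illustration of the kind of per-frame work involved, not Twisted's actual code:]

```python
import sys

def capture_frames(tb):
    """Walk a traceback and snapshot each frame's locals: roughly the
    kind of per-frame work that makes Failure capture O(state) rather
    than O(1). Illustrative sketch only, not Twisted's implementation."""
    frames = []
    while tb is not None:
        frame = tb.tb_frame
        # Copying (and later repr()-ing) every local in every frame is
        # the expensive part the subject line calls "O(state)".
        frames.append((frame.f_code.co_name, dict(frame.f_locals)))
        tb = tb.tb_next
    return frames

def inner():
    payload = "x" * 1000  # large locals make the capture proportionally slower
    raise ValueError("boom")

def run():
    try:
        inner()
    except ValueError:
        return capture_frames(sys.exc_info()[2])
```

Calling run() yields one entry per frame (run and inner here), each carrying a full copy of that frame's locals, including the 1000-character payload string.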
On Mon, Mar 28, 2011 at 9:44 AM, John Arbash Meinel <john@arbash-meinel.com> wrote:
> I'm doing some performance testing of one of our Twisted applications. And what I came across was a surprising amount of time being spent in twisted.python.failure.Failure.__getstate__
> So under profiling, we spent 408ms in __getstate__. I then changed Failure.cleanFailure to just 'return', and I saw a real-world improvement of ~480ms down to about 240ms. I then restored cleanFailure, but changed Failure.__init__ to always set 'tb=None' before it does anything. And in that case the time went down to 180ms. (And when I dug into it, 150ms of that is time spent waiting for an XMLRPC response.)
Just for the record, changing cleanFailure to not do its operation is likely to lead to big memory leaks. The second thing you did, setting self.tb to None, makes more sense to speed things up.
> I'm wondering if there is a tasteful way to disable Traceback processing in a production machine. I realize you never know when you are going to need the state in order to figure out what went wrong. But it is costing 2-5x runtime speed. (The other answer is to write all of our code to avoid Deferred.addErrback...)
I hate to mention it (since it's unlikely I'd be able to work on it), but I wonder how much a C extension would improve this without going the full step of throwing out the debug data.

--
Christopher Armstrong
http://radix.twistedmatrix.com/
http://planet-if.com/
On 03/28/2011 05:02 PM, Christopher Armstrong wrote:
> On Mon, Mar 28, 2011 at 9:44 AM, John Arbash Meinel <john@arbash-meinel.com> wrote:
>> I'm doing some performance testing of one of our Twisted applications. And what I came across was a surprising amount of time being spent in twisted.python.failure.Failure.__getstate__
>> So under profiling, we spent 408ms in __getstate__. I then changed Failure.cleanFailure to just 'return', and I saw a real-world improvement of ~480ms down to about 240ms. I then restored cleanFailure, but changed Failure.__init__ to always set 'tb=None' before it does anything. And in that case the time went down to 180ms. (And when I dug into it, 150ms of that is time spent waiting for an XMLRPC response.)
> Just for the record, changing cleanFailure to not do its operation is likely to lead to big memory leaks. The second thing you did, setting self.tb to None, makes more sense to speed things up.
Right. I agree that it is a better solution. I was wondering about having a whitelist of Exception classes that aren't considered interesting enough to keep a traceback.
>> I'm wondering if there is a tasteful way to disable Traceback processing in a production machine. I realize you never know when you are going to need the state in order to figure out what went wrong. But it is costing 2-5x runtime speed. (The other answer is to write all of our code to avoid Deferred.addErrback...)
> I hate to mention it (since it's unlikely I'd be able to work on it), but I wonder how much a C extension would improve this without going the full step of throwing out the debug data.
A large portion of the time spent is in "safe_repr", which presumably is turning all 30k objects into 'str' representations. I would imagine that a lot of this is redundant across state, meaning that the objects up at the top frame probably don't change very often.

I think part of the issue is that when you have an unhandled exception, that's when you really want the traceback, but you only know that long after the traceback is gone.

John =:->
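[Editor's note: a helper along the lines of twisted.python.reflect.safe_repr can be sketched as below. The point is that it shields against broken __repr__ implementations but still pays the full stringification cost for every object it touches; the fallback format here is illustrative, not Twisted's exact output:]

```python
def safe_repr(obj):
    """Return repr(obj), falling back to a placeholder if __repr__
    itself raises. Sketch of the idea only; the real
    twisted.python.reflect.safe_repr produces richer fallback detail."""
    try:
        return repr(obj)
    except Exception:
        # Even the fallback never raises, so logging/pickling can proceed.
        return "<unrepresentable %s at 0x%x>" % (type(obj).__name__, id(obj))

class Broken:
    def __repr__(self):
        raise RuntimeError("repr exploded")
```

Note that safe_repr(some_1GB_string) still builds a second gigabyte-scale string, which is the pathological case mentioned later in the thread.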
John Arbash Meinel wrote: […]
>> Just for the record, changing cleanFailure to not do its operation is likely to lead to big memory leaks. The second thing you did, setting self.tb to None, makes more sense to speed things up.
> Right. I agree that it is a better solution. I was wondering about having a whitelist of Exception classes that aren't considered interesting enough to keep a traceback.
I've filed <http://twistedmatrix.com/trac/ticket/5011> about this issue, and I have a patch that takes an alternative approach: don't capture tracebacks when Deferred callbacks raise exceptions unless defer.setDebugging(True) has been called. Another possibility might be to stop Failure from capturing all the frame details (i.e. the locals and globals) by default (or at least when invoked by Deferred._runCallbacks), as that's probably the bulk of the expense and would be very rarely used. -Andrew.
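[Editor's note: the shape of that patch can be sketched as a module-level toggle gating the expensive capture, in the spirit of defer.setDebugging(). All names below are illustrative, not Twisted's actual internals:]

```python
import traceback

_debug = False

def setDebugging(on):
    """Module-level switch, analogous in spirit to defer.setDebugging()."""
    global _debug
    _debug = bool(on)

def capture_failure(exc_value, exc_tb):
    """Record an exception; only keep the formatted traceback when
    debugging is enabled, so the common production path stays cheap."""
    if _debug:
        tb_text = "".join(traceback.format_tb(exc_tb))
    else:
        tb_text = None
    return {"value": exc_value, "traceback": tb_text}
```

The exception value is always kept (it is cheap and needed for errback dispatch); only the frame-walking work is made opt-in.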
On 03/30/2011 10:39 AM, Andrew Bennetts wrote:
> John Arbash Meinel wrote: […]
>>> Just for the record, changing cleanFailure to not do its operation is likely to lead to big memory leaks. The second thing you did, setting self.tb to None, makes more sense to speed things up.
>> Right. I agree that it is a better solution. I was wondering about having a whitelist of Exception classes that aren't considered interesting enough to keep a traceback.
> I've filed <http://twistedmatrix.com/trac/ticket/5011> about this issue, and I have a patch that takes an alternative approach: don't capture tracebacks when Deferred callbacks raise exceptions unless defer.setDebugging(True) has been called.
> Another possibility might be to stop Failure from capturing all the frame details (i.e. the locals and globals) by default (or at least when invoked by Deferred._runCallbacks), as that's probably the bulk of the expense and would be very rarely used.
> -Andrew.
I think walking the frames and copying the dicts is also expensive. That is what the bug you linked me to was about. (First, walking everything and using __dict__.copy() was a bit expensive, and second, the safe_repr() calls were turning 1GB strings into a new 1GB+ string.)

The one other step that I think we need is that 'maybeDeferred' also always traps exceptions into a Failure object, and we'd want that to check Deferred.debug first.

I do wonder if Failure should just be checking Deferred.debug before automatically including a traceback. I'm not really sure about the logical layering of twisted modules, though. Certainly 'twisted.python.failure' seems a lower layer than 'twisted.internet.defer'.

John =:->
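[Editor's note: maybeDeferred matters here because it is another choke point where ordinary exceptions get converted wholesale. Conceptually its error path behaves like this simplified stand-in; the real twisted.internet.defer.maybeDeferred returns a Deferred and wraps the exception in a Failure:]

```python
def maybe_call(f, *args, **kwargs):
    """Simplified stand-in for maybeDeferred's error path: call f and,
    if it raises, hand the exception back as a value instead of letting
    it propagate. In Twisted this is the moment a Failure (and thus,
    today, a full traceback capture) gets created."""
    try:
        return ("success", f(*args, **kwargs))
    except Exception as e:
        return ("failure", e)
```

Every "signaling" exception raised under such a wrapper pays the capture cost, even when no errback ever inspects the traceback.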
John Arbash Meinel wrote: […]
> I think walking the frames and copying the dicts is also expensive. That is what the bug you linked to me was about. (First, walking everything and using __dict__.copy() was a bit expensive, and second that the safe_repr() calls were turning 1GB strings into a new 1GB+ string.)
Well, if we take my proposal to not (by default at least) capture the frames in the first place, this isn't an issue, is it? The only other copy done in Failure is the copy of self.__dict__ in __getstate__, which is just a shallow copy of one dict, so fairly cheap.
> The one other step that I think we need, is that 'maybeDeferred' also always traps into a Failure object, and we'd want that to check Deferred.debug first.
Good point!
> I do wonder if Failure should just be checking Deferred.debug before automatically including a traceback. I'm not really sure about logical layering of twisted modules, though. Certainly 'twisted.python.failure' seems a lower layer than 'twisted.internet.defer'.
Yes, checking Deferred.debug in twisted.python.failure would be bad layering. twisted.internet.defer only invokes Failure in a couple of places, so it's not so onerous to make sure it invokes it to avoid capturing tracebacks unless it means to.

That said, it might be a good idea to change Failure to *not* capture tracebacks by default when invoked as Failure(). Perhaps add a setDebugging toggle to twisted.python.failure too. If an explicit traceback is passed to the constructor it would still be captured, and perhaps we could provide an alternative constructor for the current “capture everything” behaviour (perhaps via a new optional flag for __init__, perhaps via a new name DetailedFailure(), perhaps via a classmethod, except that Failure is not a new-style class and can't be one without changing it to subclass Exception…) for the rare cases when people want that. It's a pity you can't just use Failure(*sys.exc_info()), because the parameters are in the wrong order.

Certainly the current behaviour of doing a costly capture of traceback and frame contents seems like the wrong default, given how rarely I've seen anyone use Failure.printDetailedTraceback.

-Andrew.
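[Editor's note: the parameter-order wrinkle is easy to see side by side. sys.exc_info() yields (type, value, traceback), while, per the discussion above, Failure's constructor takes the value first:]

```python
import sys

try:
    raise RuntimeError("boom")
except RuntimeError:
    etype, evalue, etb = sys.exc_info()

# sys.exc_info() order:    (etype, evalue, etb)
# Failure.__init__ order:  (exc_value, exc_type, exc_tb)
# So Failure(*sys.exc_info()) would bind the class where the instance
# belongs; the working spelling has to reorder by hand, e.g.:
#     f = failure.Failure(evalue, etype, etb)
```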
On Mar 30, 2011, at 5:14 AM, Andrew Bennetts wrote:
> John Arbash Meinel wrote: […]
>> I think walking the frames and copying the dicts is also expensive. That is what the bug you linked to me was about. (First, walking everything and using __dict__.copy() was a bit expensive, and second that the safe_repr() calls were turning 1GB strings into a new 1GB+ string.)
> Well, if we take my proposal to not (by default at least) capture the frames in the first place this isn't an issue is it? The only other copy done in Failure is the copy of self.__dict__ in __getstate__, which is just a shallow copy of one dict, so fairly cheap.
>> The one other step that I think we need, is that 'maybeDeferred' also always traps into a Failure object, and we'd want that to check Deferred.debug first.
> Good point!
>> I do wonder if Failure should just be checking Deferred.debug before automatically including a traceback. I'm not really sure about logical layering of twisted modules, though. Certainly 'twisted.python.failure' seems a lower layer than 'twisted.internet.defer'.
> Yes, checking Deferred.debug in twisted.python.failure would be bad layering. twisted.internet.defer only invokes Failure in a couple of places, so it's not so onerous to make sure it invokes it to avoid capturing tracebacks unless it means to.
> That said, it might be a good idea to change Failure to *not* capture tracebacks by default when invoked as Failure(). Perhaps add a setDebugging toggle to twisted.python.failure too. If an explicit traceback is passed to the constructor it would still be captured, and perhaps provide an alternative constructor for the current “capture everything” behaviour (perhaps via a new optional flag for __init__, perhaps via a new name DetailedFailure(), perhaps via a classmethod except failure is not a new-style class and can't be one without changing to subclass Exception…) for the rare cases when people want that. It's a pity you can't just use Failure(*sys.exc_info()) because the parameters are in the wrong order.
> Certainly the current behaviour of doing a costly capturing of traceback and frame contents seems like the wrong default, given how rarely I've seen anyone use Failure.printDetailedTraceback.
Agreed. I've seen this behavior crop up shockingly close to the top of profiles for Calendar Server as well, so I'm very glad to hear you've undertaken this work. The current default is almost certainly wrong; the only code which has practically ever used it was the HTML traceback stuff, which most serious users disable for security reasons anyway.

Debugging is kind of a process-global thing, most of the time. I think maybe we should have 'twisted.python.debug' which is the main thing that all these features use, and then a setDebugging for each system (Failure, Deferred, reactor; and ideally, eventually, stuff like web, mail, conch too) to turn on these expensive-but-occasionally-worthwhile features.

But I'd be happy if this change did _nothing_ but make Failure simply default to not capturing globals and locals, and add a flag to explicitly request this behavior. Like you say: the information is always captured but almost never used.

-glyph
Glyph Lefkowitz wrote: […]
> Agreed. I've seen this behavior crop up shockingly close to the top of profiles for Calendar Server as well, so I'm very glad to hear you've undertaken this work. The current default is almost certainly wrong; the only code which has practically ever used it was the HTML traceback stuff, which most serious users disable for security reasons anyway.
Good to know we're not the only ones :) The fact that it speeds up 'trial twisted' by ~10-15% is a nice bonus too!
> Debugging is kind of a process-global thing, most of the time. I think maybe we should have 'twisted.python.debug' which is the main thing that all these features use, and then a setDebugging for each system (Failure, Deferred, reactor; and ideally, eventually stuff like web, mail, conch too) to turn on these expensive-but-occasionally-worthwhile features.
A similar scheme has worked well for bzrlib. +1
> But I'd be happy if this change did _nothing_ but make Failure simply default to not capturing globals and locals, and add a flag to explicitly request this behavior. Like you say: the information is always captured but almost never used.
That's fine with me! I'll update the patch to add captureVars=False to Failure.__init__ instead of what it currently does, and happily leave defer.py unmodified from trunk. Thanks for the feedback, -Andrew.
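[Editor's note: a captureVars flag of that shape might look roughly like the following sketch. This is an illustration of the proposed default, not the actual patch; the real Failure class carries far more machinery:]

```python
import sys

class SketchFailure:
    """Toy model of a Failure whose expensive frame-variable snapshot
    is opt-in via captureVars, as proposed above."""
    def __init__(self, captureVars=False):
        self.type, self.value, tb = sys.exc_info()
        self.frames = []
        while tb is not None:
            frame = tb.tb_frame
            # Frame names are always cheap to keep; copying every
            # frame's locals is done only when explicitly requested.
            local_vars = dict(frame.f_locals) if captureVars else None
            self.frames.append((frame.f_code.co_name, local_vars))
            tb = tb.tb_next

def fail(captureVars):
    try:
        secret = 42  # a local that survives only under captureVars=True
        raise ValueError("oops")
    except ValueError:
        return SketchFailure(captureVars=captureVars)
```

With captureVars=False (the proposed default), a traceback of frame names is still available for ordinary error reporting; only a detailed-traceback feature like printDetailedTraceback would need the opt-in.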
participants (4)
- Andrew Bennetts
- Christopher Armstrong
- Glyph Lefkowitz
- John Arbash Meinel