[Twisted-Python] Deferreds vs sys.getrecursionlimit()

About once every six months, I wind up debugging a Python stack overflow in my Deferred-using code. The symptom is usually a log message that ends with:

{{{
  File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 344, in _runCallbacks
    self.result = failure.Failure()
  File "/usr/lib/python2.5/site-packages/twisted/python/failure.py", line 265, in __init__
    parentCs = reflect.allYourBase(self.type)
  File "/usr/lib/python2.5/site-packages/twisted/python/reflect.py", line 542, in allYourBase
    accumulateBases(classObj, l, baseClass)
  File "/usr/lib/python2.5/site-packages/twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "/usr/lib/python2.5/site-packages/twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "/usr/lib/python2.5/site-packages/twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "/usr/lib/python2.5/site-packages/twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
exceptions.RuntimeError: maximum recursion depth exceeded
}}}

It's always something weird. This time, I took notes. I offer these hints to help future searchers find a starting point in their own debugging efforts.

The executive summary: certain patterns of using Deferreds work fine while testing, but will fail in mysterious ways later on as the application's workload grows larger.

These notes are formatted for Trac, as they were originally written as a comment for http://allmydata.org/trac/tahoe/ticket/237 . Also, the "solutions" suggested require the eventual-send operator as provided by Foolscap, and until/unless reactor.eventually() makes it into Twisted proper, these solutions may not be convenient for projects that aren't already using Foolscap.

cheers,
 -Brian

== Problem One: long list of callbacks, all of them are ready ==

Each Deferred (we'll call the first one Deferred A) has a list of callback functions. Each time you do d.addCallback(), this list grows by one element. When Deferred A fires, the list is executed in a 'while' loop, in Deferred._runCallbacks. If the callbacks all return either a normal value or a Failure, then the list is completely consumed during the one call to _runCallbacks, and everything is fine.

However, when a callback returns another Deferred B (chaining), the first Deferred A must wait for the second to finish. The code that does this looks like:

{{{
    if isinstance(self.result, Deferred):
        self.pause()
        self.result.addBoth(self._continue)
        break
}}}

The second Deferred B might have already been fired by this point, either because it was born ready (created with defer.succeed, or defer.maybeDeferred), or because whatever was being waited upon has already occurred. If this occurs, the subsequent callback in Deferred A's chain will fire (with B's result), but it will fire through a 6-frame recursive loop instead of firing on the next pass of the 'while' loop. As a result, each such ready-to-fire Deferred will add 6 stack frames. 166 such loops are enough to put more than 1000 frames on the stack, which will exceed Python's default sys.getrecursionlimit().
The 6-frame cycle is:

{{{
  File "twisted/internet/defer.py", line 214, in addBoth
    callbackKeywords=kw, errbackKeywords=kw)
  File "twisted/internet/defer.py", line 186, in addCallbacks
    self._runCallbacks()
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
  File "twisted/internet/defer.py", line 341, in _runCallbacks
    self.result.addBoth(self._continue)
}}}

The following sample code will cause this situation:

{{{
import traceback
from twisted.internet import defer

def fire(res, which):
    #print "FIRE", which, "stack:", len(traceback.extract_stack())
    #if which == 2:
    #    traceback.print_stack()
    return defer.succeed(None)

d = defer.Deferred()
for i in range(170):
    d.addCallback(fire, i)
d.callback("go")
}}}

The exception that this provokes is caught by the Deferred's Failure mechanisms, but then Twisted has an internal failure while trying to capture it. The actual "Unhandled error in Deferred" that gets put into the logs is:

{{{
Unhandled error in Deferred:
Traceback (most recent call last):
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
  File "twisted/internet/defer.py", line 341, in _runCallbacks
    self.result.addBoth(self._continue)
  File "twisted/internet/defer.py", line 214, in addBoth
    callbackKeywords=kw, errbackKeywords=kw)
  File "twisted/internet/defer.py", line 186, in addCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
  File "twisted/internet/defer.py", line 344, in _runCallbacks
    self.result = failure.Failure()
  File "twisted/python/failure.py", line 265, in __init__
    parentCs = reflect.allYourBase(self.type)
  File "twisted/python/reflect.py", line 542, in allYourBase
    accumulateBases(classObj, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
exceptions.RuntimeError: maximum recursion depth exceeded
}}}

This problem frequently shows up in code which returns a Deferred for generality (i.e. some day it might be async), but is using defer.succeed() or defer.maybeDeferred(some_immediate_call) in the meanwhile.

== Problem Two: deep chain of callbacks, e.g. recursive delayed polling ==

The other kind of recursion-limit-violation failure that occurs with Deferreds involves a long chain that finally fires. The most common way to generate such a chain is with a recursive method that separates each call with a Deferred, such as a polling function that returns a Deferred:

{{{
    def wait_until_done(self, ignored=None):
        if self.done:
            return True
        else:
            d = Deferred()
            reactor.callLater(1.0, d.callback, None)
            d.addCallback(self.wait_until_done)
            return d
}}}

If this function must poll more than 331 times, the reactor tick which notices the expired timer and fires d.callback will see a recursion-depth-exceeded exception.
The last Deferred fires, which triggers the _continue callback on the next-to-last Deferred, which allows it to fire, which triggers the {{{[-2]}}} Deferred, etc. This recursive cycle is of length 3 and has the following frames:

{{{
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
}}}

This one is trickier to find, because the root of the stack will be some internal reactor call rather than application code. In particular, the bottom of the stack will look like:

{{{
  File "/tmp/t.py", line 26, in <module>
    reactor.run()
  File "twisted/internet/base.py", line 1048, in run
    self.mainLoop()
  File "twisted/internet/base.py", line 1057, in mainLoop
    self.runUntilCurrent()
  File "twisted/internet/base.py", line 705, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "twisted/internet/defer.py", line 243, in callback
    self._startRunCallbacks(result)
  File "twisted/internet/defer.py", line 312, in _startRunCallbacks
    self._runCallbacks()
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
}}}

The other tricky thing about this failure is that the application code is sitting on the end of the stack: any callback that is attached to the Deferred that {{{wait_until_done}}} returns will run in a low-stack environment. As a result, recursion-depth-exceeded exceptions will be triggered by seemingly innocent application code. Note how the "DONE" number changes as you modify the self.count comparison value in this example:

{{{
#! /usr/bin/python

import traceback
from twisted.internet import reactor
from twisted.internet.defer import Deferred

class Poller:
    count = 0
    def wait_until_done(self, ignored=None):
        self.count += 1
        if self.count > 301: # 331 works, 332 fails.
            return True
        else:
            d = Deferred()
            reactor.callLater(0.0, d.callback, None)
            d.addCallback(self.wait_until_done)
            return d

p = Poller()
def done(res):
    #traceback.print_stack()
    print "DONE", len(traceback.extract_stack())
d = p.wait_until_done()
d.addCallback(done)
reactor.run()
}}}

When this fails, the traceback that shows up in the logs looks like:

{{{
Unhandled error in Deferred:
Traceback (most recent call last):
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
--- <exception caught here> ---
  File "twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
  File "twisted/internet/defer.py", line 344, in _runCallbacks
    self.result = failure.Failure()
  File "twisted/python/failure.py", line 265, in __init__
    parentCs = reflect.allYourBase(self.type)
  File "twisted/python/reflect.py", line 542, in allYourBase
    accumulateBases(classObj, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
  File "twisted/python/reflect.py", line 550, in accumulateBases
    accumulateBases(base, l, baseClass)
exceptions.RuntimeError: maximum recursion depth exceeded
}}}

== Combinations ==

Note that these two problems can interact. Each ready-to-fire callback attached to a single Deferred uses 6 stack frames, and each chained callback uses 3 stack frames. If X*6 + Y*3 > 1000 (where X is the number of ready-to-fire callbacks and Y is the length of the chain), the code will fail.

== Solutions ==

For problem one, the requirement is that Deferreds never wind up with more than 166 callbacks that are ready to fire. In other words, there must be at least one not-ready-to-fire Deferred in each span of 166 callbacks. One way to accomplish this is to have every 100th call return {{{foolscap.eventual.fireEventually(result)}}} instead of {{{defer.succeed(result)}}}. Having every call do this works too, it just slows things down a bit. (Note that the reactor must be running for fireEventually to work.)

{{{
from twisted.internet import defer
from foolscap.eventual import fireEventually

def fire(res, which):
    return fireEventually(None)

d = defer.Deferred()
for i in range(170):
    d.addCallback(fire, i)
}}}

For problem two, the requirement is that the depth of the tail-recursion chain not exceed 331 cycles, minus some room for the code you're eventually going to attach to the end. One way to accomplish this is to have every 300th call (or every single call, if you are willing to accept the slowdown) add an additional {{{fireEventually}}} to break up the stack.

{{{
    def wait_until_done(self, ignored=None):
        self.count += 1
        if self.count > 301: # 331 works, 332 fails.
            return True
        else:
            d = Deferred()
            reactor.callLater(0.0, d.callback, None)
            d.addCallback(self.wait_until_done)
            d.addCallback(lambda res: fireEventually(res))
            return d
}}}
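
(For projects that aren't already using Foolscap, roughly the same effect can be sketched on top of reactor.callLater(0, ...). The helpers below are illustrative stand-ins, not Foolscap's actual implementation, and callLater(0) lacks the strict ordering guarantees of a real eventual-send queue:)

{{{
from twisted.internet import defer, reactor

def fire_eventually(result=None):
    # return a Deferred that fires with 'result' on a later reactor turn,
    # so the current call stack fully unwinds before downstream callbacks run
    d = defer.Deferred()
    reactor.callLater(0, d.callback, result)
    return d

def eventually(f, *args, **kw):
    # run f(*args, **kw) on a later reactor turn instead of synchronously
    reactor.callLater(0, f, *args, **kw)
}}}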

It's always something weird. This time, I took notes. I offer these hints to help future searchers find a starting point in their own debugging efforts.
And as a followup (since the problem I encountered today happened to be a third case):

The first step to tracking down these problems is to temporarily apply the following patch to your twisted/internet/defer.py:

{{{
Index: twisted/internet/defer.py
===================================================================
--- twisted/internet/defer.py	(revision 24958)
+++ twisted/internet/defer.py	(working copy)
@@ -325,6 +325,12 @@
                 try:
                     self._runningCallbacks = True
                     try:
+                        if len(traceback.extract_stack()) > 900:
+                            print "running", len(traceback.extract_stack())
+                            traceback.print_stack()
+                            print "running", len(traceback.extract_stack())
+                            import os
+                            os.abort()
                         self.result = callback(self.result, *args, **kw)
                     finally:
                         self._runningCallbacks = False
@@ -337,6 +343,12 @@
                         # self.callbacks until it is empty, then return here,
                         # where there is no more work to be done, so this call
                         # will return as well.
+                        if len(traceback.extract_stack()) > 900:
+                            print "chaining", len(traceback.extract_stack())
+                            traceback.print_stack()
+                            print "chaining", len(traceback.extract_stack())
+                            import os
+                            os.abort()
                         self.pause()
                         self.result.addBoth(self._continue)
                         break
}}}

That will let you know when the stack is getting close to exhaustion. By looking at the trace that it prints out, you can find out what other code to investigate. It is then useful to add the same traceback.extract_stack()-using instrumentation to that code.

The two problems I described in my previous message were confined to the methods of Deferred: even though the problems were set up by my application code, the actual cycle/loop was entirely inside defer.py. The third problem (that I just finished debugging) had a cycle that passed through my own application code. In this case, the troublesome class looked like:

{{{
class ConcurrencyLimiter:
    """I implement a basic concurrency limiter. Add work to it in the form
    of (callable, args, kwargs) tuples. No more than LIMIT callables will be
    outstanding at any one time.
    """

    def __init__(self, limit=10):
        self.limit = limit
        self.pending = []
        self.active = 0

    def add(self, cb, *args, **kwargs):
        d = defer.Deferred()
        task = (cb, args, kwargs, d)
        self.pending.append(task)
        self.maybe_start_task()
        return d

    def maybe_start_task(self):
        if self.active >= self.limit:
            return
        if not self.pending:
            return
        (cb, args, kwargs, done_d) = self.pending.pop(0)
        self.active += 1
        d = defer.maybeDeferred(cb, *args, **kwargs)
        d.addBoth(self._done, done_d)

    def _done(self, res, done_d):
        self.active -= 1
        eventually(done_d.callback, res)
        self.maybe_start_task()
}}}

(you can safely ignore the eventually() call there.. that done_d callback was not involved in this problem)

In this case, I had a Limiter instance with somewhere around 200 items in the self.pending queue. All of those items were immediate functions: the call to defer.maybeDeferred returns a Deferred that was already in the 'fired' state. That means the d.addBoth() fires the callback right away, synchronously, leading to a recursive cycle that looked like:

{{{
  self.maybe_start_task()
  d.addBoth(self._done, done_d)
  Deferred.addCallbacks(self._done, self._done)
  Deferred._continue
  self._done()
  self.maybe_start_task()
}}}

Giving 5 frames per cycle, so 200 items is enough to hit the 1000-frame default recursion limit.

As before, the fix was to break up the stack by using Foolscap's eventual-send operation:

{{{
    def _done(self, res, done_d):
        self.active -= 1
        eventually(done_d.callback, res)
        eventually(self.maybe_start_task)
}}}

hope someone eventually (hah!) finds this useful,
 -Brian
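
(An aside for readers without Foolscap: roughly the same stack break can be sketched with reactor.callLater(0, ...). This is an illustration of the idea rather than the fix actually deployed, and callLater(0) lacks the ordering guarantees of a real eventual-send:)

{{{
from twisted.internet import reactor

class ConcurrencyLimiter:
    # __init__, add, and maybe_start_task as above

    def _done(self, res, done_d):
        self.active -= 1
        # push both the completion callback and the next task start onto a
        # later reactor turn, so an already-fired maybeDeferred result cannot
        # re-enter maybe_start_task synchronously on this stack
        reactor.callLater(0, done_d.callback, res)
        reactor.callLater(0, self.maybe_start_task)
}}}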

Hi,
Your writeup is a very clear entry into the "Twisted documenter of the year award". IMHO the whole writeup should be added to the Twisted documentation right away. Great work!

Regards,
Tarjei

On 04:35 pm, tarjei@nu.no wrote:
While this is an excellent writeup of a problem, and Brian definitely deserves much praise for doing it with such thoroughness and depth, I don't think we should do that ;). I've reopened an old ticket about this problem which was closed because the specific proposed fix didn't really work. Ideally, Deferred just shouldn't have this problem. If we can't eliminate the problem entirely, then we can at least add a more useful error message which can explain how you can start debugging. The ticket in question (and my comment on it) is here: http://twistedmatrix.com/trac/ticket/411#comment:12 If you'd like to add a link from the FAQ, or some other more informal resource, please feel free. However, this is not something that we should have permanently enshrined as official documentation. It's an unfortunate workaround for a problem which should really just be fixed.

Yeah, when I last looked into this (a couple years ago), I figured that the Deferred doesn't have enough information to safely optimize out the tail-call. You never know who else might have a handle on the Deferred and might add a new callback to it. It once occurred to me that it might be easier to do this safely if Deferred were broken up into two pieces (like E's Promise/Resolver pair: basically one side would get .callback and .errback, while the other side would get .addCallbacks/etc), but I didn't pursue that thought very far.

Using an eventual-send is unfortunate but correct (in that it will reliably avoid the problem, but it's probably a noticeable performance hit to blow away the entire stack for each call). An unwise-but-less-unfortunate approach would be to use an eventual-send only when it appears necessary, as in the following strawman:

{{{
class Deferred:
    def _continue(self, result):
        self.result = result
        if len(traceback.extract_stack()) > sys.getrecursionlimit() - 100:
            eventually(self.unpause)
        else:
            self.unpause()
}}}

It might be enough to have Deferred._runCallbacks() look to see if the result of callback() is a recursion-depth-exceeded RuntimeError and do something special in response to it. Zooko showed me some code that would temporarily raise sys.setrecursionlimit() so that the error could be Failure-ized properly.. maybe that would be enough. A lot of the frustration caused by this sort of problem is that the Failure-rendering code runs out of stack space too.

cheers,
 -Brian
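
(For illustration, the limit-raising idea mentioned above might look something like this sketch; the helper name and the amount of extra headroom are invented:)

{{{
import sys

def with_extra_stack(f, *args, **kw):
    # temporarily raise the recursion limit so that Failure-rendering code
    # has headroom to run even when the stack is nearly exhausted, then
    # restore the old limit afterwards
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(old + 200)
    try:
        return f(*args, **kw)
    finally:
        sys.setrecursionlimit(old)
}}}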

On Mon, 17 Nov 2008 15:49:41 -0800, Brian Warner <warner@lothar.com> wrote:
Of course, the best response to this would be an implementation of the iterative version of _runCallbacks. However, I do think it is possible to get rid of this recursion. It doesn't really matter who else might have a reference to either Deferred involved. The new frame going onto the stack is just another Deferred._runCallbacks (unless a subclass overrides it, but DeferredList is the only subclass in Twisted, and we should really deprecate it, and continue to discourage people from subclassing Deferred, and _runCallbacks is private anyway so there). The recurser (ie, the Deferred._runCallbacks doing the {{{self.result.addBoth(self._continue)}}}) knows how the recursee (ie, the Deferred having addBoth called on it) behaves - just like itself. The obvious transformation (inlining a bunch of code from outside of _runCallbacks into _runCallbacks) will result in something that's really ugly, but it should work. And I think there is probably an approach that's less ugly, too.

This addresses only one of the problems you raised, but it's the one I think Glyph was talking about eliminating by changing the implementation of Deferred. It's possible there's a way to remove the other one with an implementation change to Deferred as well (but it's not as clear to me what that change is yet). However, it's much easier to avoid that one by writing code in a slightly different way. eventual-send is one different way, but there are also other more efficient approaches which are also always correct. These generally take the form of avoiding creating a giant stack of Deferreds in the first place, by only changing each Deferred which would have been "interior" on that stack to the one immediately beneath it and chaining the bottom directly to the top.

Jean-Paul
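
(As one concrete illustration of that last point, the polling example from the first message can avoid building a chain entirely by handing out a single Deferred and firing it directly when polling finishes. This is a sketch of the application-level restructuring, not a change to Deferred itself:)

{{{
from twisted.internet import reactor
from twisted.internet.defer import Deferred

class Poller:
    done = False

    def wait_until_done(self):
        # hand out one Deferred; the poll loop fires it directly when done,
        # so no stack of chained Deferreds accumulates
        d = Deferred()
        self._poll(d)
        return d

    def _poll(self, d):
        if self.done:
            d.callback(True)
        else:
            reactor.callLater(1.0, self._poll, d)
}}}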

Just to add my 2cts: a quick solution to the problem is using Stackless and setting sys.setrecursionlimit(sys.maxint). Your test code runs with no problem with

{{{
for i in range(100000):
    d.addCallback(fire, i)
}}}

on a lousy Athlon dualcore with Stackless compiled in:

{{{
Python 2.5.2 Stackless 3.1b3 060516 (release25-maint, Sep 26 2008, 10:22:13) [MSC v.1310 32 bit (Intel)] on win32
}}}

For more than a year now I've been using Stackless for all my Python projects without resorting to tasklets and the stuff Stackless is really aiming at. I do so because getting rid of the C stack is a major point in modern language design and implementation, and in my opinion Python lags somewhat in this particular area. It is of course true that one must be able to rebuild all the C-based stuff you're using in a project, but doing so is and was always at the center of my projects.

HTH,
Werner
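
(Spelled out, the test described above was presumably along these lines; on stock CPython, raising the limit this far just trades the RuntimeError for the risk of exhausting the C stack, which is the constraint Stackless is said to remove:)

{{{
import sys
sys.setrecursionlimit(sys.maxint)  # viable under Stackless, which decouples
                                   # Python frames from the C stack

from twisted.internet import defer

def fire(res, which):
    return defer.succeed(None)

d = defer.Deferred()
for i in range(100000):
    d.addCallback(fire, i)
d.callback("go")
}}}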

participants (5)

- Brian Warner
- glyph@divmod.com
- Jean-Paul Calderone
- tarjei
- Werner Thie