Re: [Twisted-Python] A pseudo-deferred class that can be canceled
Hi Glyph

It's very late here, so I'll limit myself to a few thousand lines of reply.

"Glyph" == Glyph Lefkowitz <glyph@twistedmatrix.com> writes:

Glyph> On Jan 6, 2010, at 7:09 AM, Terry Jones wrote:
Glyph> What I mean is, there are a lot of weird little edge-cases in how
Glyph> multiple layers of the stack interact when they're dealing with a
Glyph> shared Deferred, and if we're

The above was truncated.

Glyph> However, upon further inspection I think that the key distinction
Glyph> between what you've proposed and what I'm talking about is the
Glyph> distinction between cancelling *one* layer of the callback chain and
Glyph> cancelling *all* layers of the callback chain.

Yes, that's right. I nearly made a diagram for people today, but didn't know if anyone would be interested. But here's one way to look at it. In today's deferred world, you have (in general) situations like this:

  func makes d -> c1 -> c2 -> c3 -> c4 -> c5 -> c6 -> c7 -> client -> c8 -> c9

I.e., the client makes a call, gets its hands on a deferred (which already has zero or more call/errbacks on its chain) and adds its own callbacks. At that point cancellation is very hard. Neither the client, nor the deferred itself, nor the original function can know how to cancel the operation. From the POV of the client and the deferred, that callback chain is just a bunch of indistinguishable functions.

My ControllableDeferred class, if used by just the client, makes it possible for the client to cut the link between c7 and c8, either by itself (the client) calling the deferred it receives, which causes c8 to fire/err, or by deactivating it, thereby arranging that c8 is never called. So the ControllableDeferred in some sense introduces a cut point between two callbacks in the original chain. And the cut point is in a sensible logical position because the client added c8 and c9 and can be expected to know what to do to clean them up if it decides it's done waiting for the original d to fire.

But the above picture is more uniform than the reality: it hides the fact that the callbacks were added to the deferred in groups (of zero or more).
That is, the chain really looks like this:

  func makes d -> w1 -> w2 -> x1 -> x2 -> y1 -> y2 -> y3 -> client -> c8 -> c9

which is to say that the client in fact called function Y. Y called X. X called W, and W called something that returned a deferred. W then adds w1 and w2 to d and returns it to X. X adds x1 and x2 to d and returns it to Y. Y adds y1-3 and returns it to C.

So you can imagine now that we insert a cut point at each logical boundary, and then the cancel information can flow back up the chain and each logical unit presumably knows how to discard / abort etc., whatever it may have in progress.

That picture is mainly for clarity. I'm sure you're miles ahead already...

Glyph> Your description (elided for brevity's sake) was very helpful.
Glyph> You've got resources which your callbacks are consuming by way of
Glyph> being "currently outstanding", and you want to be able to free
Glyph> *those* resources, without necessarily worrying about

However you were going to finish that sentence, I agree :-) I want to free the resources, and I want to be able to get on with whatever it is I'm supposed to be doing.
Yes, agreed. I like the fact that the class is simple and that it deals with the client-side issues, allowing ignoring, timing out, early firing, etc. As you say, the much harder problem remains. But the harder problem is a bit less messy now (at least in my mind): it's "just" cancellation. Responsibilities are cleanly divided by my class - the client takes care of itself, and cancellation has *only* to deal with callbacks placed on a deferred that was generated by what the client called.
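A minimal sketch of the client-side cut point described above, assuming nothing about the real ControllableDeferred beyond what this thread says. MiniDeferred is a tiny stand-in written for this example (it is not Twisted's Deferred), and ClientCut, fireEarly and mine are hypothetical names.

```python
class MiniDeferred:
    """Tiny stand-in for a Deferred (NOT Twisted's class): callbacks only."""

    def __init__(self):
        self.called = False
        self._callbacks = []
        self._result = None

    def addCallback(self, fn):
        if self.called:
            # Already fired: run the new callback immediately.
            self._result = fn(self._result)
        else:
            self._callbacks.append(fn)
        return self

    def callback(self, result):
        if self.called:
            raise RuntimeError("already called")
        self.called = True
        self._result = result
        for fn in self._callbacks:
            self._result = fn(self._result)
        self._callbacks = []


class ClientCut:
    """Hypothetical cut point between the upstream chain and the client's callbacks."""

    def __init__(self, upstream):
        self.mine = MiniDeferred()  # the client adds c8, c9 here
        self._done = False
        upstream.addCallback(self._fromUpstream)

    def _fromUpstream(self, result):
        # Forward the upstream result unless the client already cut the link.
        if not self._done:
            self._done = True
            self.mine.callback(result)
        return result

    def fireEarly(self, result):
        # The client supplies its own result; a later upstream result is dropped.
        if not self._done:
            self._done = True
            self.mine.callback(result)

    def deactivate(self):
        # The client's callbacks will never run.
        self._done = True


# Usage: the client gives up waiting and fires with a default value.
seen = []
d = MiniDeferred()
cut = ClientCut(d)
cut.mine.addCallback(seen.append)
cut.fireEarly("timeout default")
d.callback("late result")  # arrives after the cut, so the client never sees it
```

Note that nothing here tells the upstream code to stop working; that is the harder cancellation problem discussed in the rest of this message.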
Glyph> I don't think that you can completely separate the problems. You
Glyph> seem to have a reasonable solution to the problem of one layer of
Glyph> the Deferred stack, but once you're trying to deal with multiple
Glyph> layers of the stack at once, interactions occur which can be
Glyph> difficult to reconcile with the same API, many of which are already
Glyph> documented in the ticket's discussion.

It may be that there are interactions between W and Y (for example) in my above (2nd) diagram, but I expect that would be infrequent. E.g., W might decide to attach a callback to d after it has been returned to (and added to by) X, Y, etc. That seems to be a problem, but if W were to add those extra callbacks within another logical unit of the callback chain, it would be alerted of the cancellation in the normal fashion (twice). Make sense?
Looked at from this POV, an approach to cancellation would be for code that is able to cancel operations it has begun to also provide a cancel method. One way to think about doing this would be to have the cancel method take a deferred as an argument.
Glyph> This is a *very* interesting idea, although I don't like the API
Glyph> that you propose for it. By separating the cancel method from the
Glyph> Deferred itself, you remove the ability for a trivial client of that
Glyph> Deferred to say "forget about it" without also maintaining a
Glyph> reference to the thing that gave it the Deferred in the first place.

I agree that's less desirable, but I'm not sure it's a necessary consequence of the approach. Or maybe it is. Today I modified my ControllableDeferred class to allow a cancelFunc argument. The __init__ is a tiny bit more clunky, but it has methods just like the old class, e.g.

    def callback(self, result):
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc(self._calld)
            defer.Deferred.callback(self, result)

    def deactivate(self):
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc(self._calld)

This is just what you suggest - the cancel function is inside the deferred class (my ControllableDeferred is a subclass of Deferred, so that's literally true). The client, receiving an instance of this class, can just say "forget about it" and the cancel goes back to wherever it should go, if anywhere.

So you can imagine writing a getPage function (or class) that returns ControllableDeferred instances. Calling the deferred or deactivating it would then result in an HTTPClientFactory instance calling transport.loseConnection.
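The getPage idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: FakeTransport, ControllableSketch, deliver and fakeGetPage are hypothetical names, nothing is imported from Twisted, and the cancel function takes no argument here (the methods quoted above pass it one).

```python
class FakeTransport:
    """Hypothetical stand-in for a connection's transport."""

    def __init__(self):
        self.connected = True

    def loseConnection(self):
        self.connected = False


class ControllableSketch:
    """Stand-in for a ControllableDeferred-like object (not the real class)."""

    def __init__(self, cancelFunc=None):
        self.cancelFunc = cancelFunc
        self._called = False
        self._callbacks = []

    def addCallback(self, fn):
        self._callbacks.append(fn)
        return self

    def _run(self, result):
        for fn in self._callbacks:
            result = fn(result)

    def callback(self, result):
        # Client fires early: tell the producer to stop, then run callbacks.
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc()
            self._run(result)

    def deactivate(self):
        # Client says "forget about it": stop the producer, run nothing.
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc()

    def deliver(self, result):
        # Producer side: the real result arrived, so no cancel is needed.
        if not self._called:
            self._called = True
            self._run(result)


def fakeGetPage(transport):
    # Hypothetical getPage-like function: cancelling drops the connection.
    return ControllableSketch(cancelFunc=transport.loseConnection)


# Usage: the client loses interest before the page arrives.
transport = FakeTransport()
pages = []
d = fakeGetPage(transport)
d.addCallback(pages.append)
d.deactivate()          # the connection is dropped
d.deliver("<html>")     # a late result is ignored
```

The client never needs a reference to the transport or the factory; the object it was handed carries the cancel function with it.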
Something like my class could then hand the deferred back, effectively saying "my client is no longer interested in this deferred. You can call/errback it, or not, it makes no difference to us". If you've done that once, you can do it multiple times - by which I mean that I might write code that's a client of getPage, and getPage is a client of XXX, and XXX is a client of YYY, etc. Each could in turn pass the deferred it got back to the thing that created it.
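One way to picture this hand-back, as a rough sketch only: each layer remembers the deferred it returned and exposes a cancel method that takes that deferred, then passes it to the layer it got it from. Layer and Source are hypothetical names, and a plain object() stands in for the shared deferred.

```python
class Source:
    """Bottom of the stack: creates the shared token and has no cancel method."""

    def start(self):
        return object()  # stands in for the deferred


class Layer:
    """Hypothetical layer (W, X, Y, ...) that is a client of the layer below."""

    def __init__(self, name, below, log):
        self.name = name
        self.below = below
        self.log = log
        self.outstanding = set()

    def start(self):
        d = self.below.start()
        # A real layer would add its callbacks to d here.
        self.outstanding.add(d)
        return d

    def cancel(self, d):
        if d in self.outstanding:
            self.outstanding.discard(d)
            self.log.append(self.name)  # clean up this layer's own resources
            cancel = getattr(self.below, "cancel", None)
            if cancel is not None:
                cancel(d)  # hand the deferred back to whoever created it


# Usage: the client called Y, which called X, which called W.
log = []
w = Layer("W", Source(), log)
x = Layer("X", w, log)
y = Layer("Y", x, log)
d = y.start()
y.cancel(d)
```

After the cancel, log holds the layer names in the order Y, X, W, and the walk stops at Source because it offers no cancel method.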
Glyph> This implies, to me, that the cancellation callback would be better
Glyph> passed to addCallbacks(): effectively creating a third callback
Glyph> chain going from invoker to responder rather than the other way
Glyph> 'round as callbacks and errbacks do.

Yes, I like that a lot, at least in a 5:30am superficial kinda way. A key difference between what I'd imagined and what you're suggesting is that in my approach, the cancel call goes directly to the thing (it would need to be a class instance, I suspect) that got the deferred. I.e. from my 2nd diagram, if the client calls cancel (or deactivate, as in the code), then the thing that added y1 to the chain is going to have its cancel method called (or some method that it asked to have called). So the control in a sense jumps back over y3, y2 and y1 to the root of the logical Y section.

Your approach passes the signal back up the chain. Most secondary steps, like y3 and y2, will pass the call along without taking any action. But they don't have to, which is good. And the first callback of a logical unit can always do exactly what would have been done in my approach above. I think your approach is better.

Glyph> I have stumbled in the direction of this thought a few times already
Glyph> but this is the first time I've had a really clear grasp of how it
Glyph> would work. Now I can see that each layer of the stack may have its
Glyph> own resources that it might want to clean up... previously I thought
Glyph> this could be done entirely with errbacks, but in this version, it
Glyph> doesn't matter if the base deferred doesn't know how to kick off the
Glyph> errback chain: all the resources on the *rest* of the callback chain
Glyph> can be cleaned up.

Yes. And the logical divisions of the call/errback chain are going to ignore each other in any case. Once a further-down-the-chain function has either called or deactivated the deferred (to put it simply - it's actually not just one deferred, at least in my implementation), it doesn't matter at all what the upstream (earlier) functions do - the result, if any, is not going through.

Glyph> I'm going to need to figure out some good values for XXX and YYY
Glyph> here in order to truly dispel the fog, though.

I'm a bit foggy too. That's why I started playing with getPage to try to use a common example with at least a few levels of processing. But I didn't have time to think about it clearly. I wrote some foggy code, which I won't inflict on you. I'm pretty sure there's a clean solution in here though, that we can get to with a bit more back & forth.
If there's no cancel method, then that's as far as can be gone with canceling.
Glyph> This is one of the really tricky issues that has faced this feature
Glyph> all along: what happens when some part of the chain involved doesn't
Glyph> know what to do with a canceller? And your solution here seems like
Glyph> it may be a very elegant hack: do exactly the same thing as other
Glyph> parts of the callback chain. What I mean is: currently, if a
Glyph> particular callback pair doesn't have a callback or an errback, the
Glyph> behavior is to do nothing and pass the result through. Cancellation
Glyph> could do exactly the same thing!

Yes, that's great. That's *your* solution, btw :-)
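As a rough sketch of that pass-through behaviour, under the assumption that each step may optionally register a canceller alongside its callback: the cancel signal walks from the most recently added step back toward the origin, and steps without a canceller simply pass it along. ChainSketch and addStep are hypothetical names chosen to avoid suggesting this is the signature of Twisted's addCallbacks().

```python
class ChainSketch:
    """Illustrative only: records (callback, canceller) pairs; not a Deferred."""

    def __init__(self):
        self._steps = []
        self.cancelled = False

    def addStep(self, callback, canceller=None):
        self._steps.append((callback, canceller))
        return self

    def cancel(self, reason="cancelled"):
        self.cancelled = True
        # Walk from the most recently added step back toward the origin.
        for _callback, canceller in reversed(self._steps):
            if canceller is not None:
                reason = canceller(reason)
            # No canceller: the reason passes through unchanged.
        return reason


def cleanup(name, log):
    # Build a canceller that records which logical unit cleaned up.
    def canceller(reason):
        log.append(name)
        return reason
    return canceller


# Usage: w1 and y1 own resources; the middle step knows nothing of cancelling.
log = []
chain = ChainSketch()
chain.addStep(str.upper, cleanup("w1", log))
chain.addStep(str.strip)  # not cancellation-aware
chain.addStep(len, cleanup("y1", log))
chain.cancel()
```

Here y1 cleans up first and w1 second, and the step in between neither blocks nor alters the signal.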
At that point the result is no longer passed because the first ControllableDeferred instance that's involved will effectively snip the link (or send an early result) in the sequence of steps that would originally have been done.
Glyph> Severing the link seems like a problem though; if we do that, then
Glyph> introducing any non-cancellation-aware Deferred - or callback, for
Glyph> that matter - into a cancellation-aware pipeline will prevent
Glyph> cancellations from propagating further up, and there should be no
Glyph> reason to do that.

Yes, agreed.
And it keeps all code for doing cancellation out of the Deferred class.
Glyph> Why is it that you want to keep the cancellation code out of
Glyph> Deferred? It seems very useful to me to have one object that you
Glyph> can say "stop" to, without necessarily knowing what's going on above
Glyph> it or where it came from.

Yes, I guess I didn't want to keep it out of there - especially since I already put it in today.... I guess what I really meant was that I wanted it to be clean / simple, because Deferreds are that way already (once you've spent a couple of years thinking about them).
OK, sorry for so many words. I hope this seems like it's heading in a useful direction. It does to me.
Glyph> Yes, this has been very useful. I hope we can distill this into
Glyph> some useful conclusions soon. :)

I think we can / will. It should be fairly easy to build an example based on getPage. I badly wanted to today, but we have a ton of stuff going on right now and I forced myself to put this aside for some hours.

Terry