[Twisted-Python] What to do when a service fails to start, also, deferred and startService
I've another of my pesky beginner questions. Note that this subject is somewhat covered in the thread started by Matt Goodall in Jan 2006: http://twistedmatrix.com/pipermail/twisted-python/2006-January/012380.html
I imagine that it must be common that people write services that don't just simply launch things listening on sockets, but instead need to do a couple of things, one after another, in order to get going and to be ready to provide their service (or multiservice).
If you do need to write something like that, it seems the chances are pretty high you're going to be calling code somewhere along the way that returns a deferred. And because the twisted/application/service.py code that calls startService doesn't handle deferreds being returned, this creates a real problem. At least as far as I understand things - which, as usual, may not be very far.
If nothing goes wrong with the deferreds that startService is creating (via whatever it's calling), then you'll probably get away with things even though your service will not really be up until after the deferreds fire, which can be some time after the code calling startService gets its deferred back (and ignores it).
But if something does go wrong, you've got a failure propagating its way down an errback chain, eventually (unless an errback switches you back to the callback chain) popping out the end and causing the reactor to issue an Unhandled Error message. So you can't indicate that the service has failed to start by throwing, because the exception is going to pop harmlessly out the end of the deferred chain as a generic unhandled error and will not cause Twisted to know that the service couldn't start.
This all feels quite ironic :-) Twisted leads you coyly into the dark and powerful world of working with and heavily depending on Deferreds. But then, right when you expect it to be there for you, covering your back, it throws up its hands as if to say "What!!? You expect me to deal with you returning a Deferred? You gotta be kidding, sucker."
I could follow Moof's approach (last poster in the above thread), but that seems to just pass the problem on to a higher level, where something else is calling startService (or something earlier) and so on up until we reach the topmost point at which something is not allowing/expecting a deferred to come back. Should I track down and subclass all these things? That would seem cruel and unusual punishment to the faithful Deferred user, having to go in and subclass core classes because they don't deal with Deferreds.
I could do something dramatic, like call reactor.stop or sys.exit in my errback chain, but those seem completely wrong. Apart from the (remote?) possibility that something other than Twisted plugin code is trying to start my service, it's also anachronistic because it will happen at some unpredictable time after startService has gotten back (and ignored) the deferred and Twisted has moved on (perhaps even to start other services).
Terry
On 27 Nov, 05:16 pm, terry@jon.es wrote:
I imagine that it must be common that people write services that don't just simply launch things listening on sockets, but instead need to do a couple of things, one after another, in order to get going and to be ready to provide their service (or multiservice).
I can't speak to how common it is, but I don't do it and I've actually seen it fairly rarely; although I have heard people asking about it a number of times. For me, baroque and elaborate start-up dances are a code smell. Services should be as independent as possible. Of course, sometimes some kind of initialization conversation is unavoidable, but I do like to try to keep it as short as possible.
If you do need to write something like that, it seems the chances are pretty high you're going to be calling code somewhere along the way that returns a deferred. And because the twisted/application/service.py code that calls startService doesn't handle deferreds being returned, this creates a real problem. At least as far as I understand things - which, as usual, may not be very far.
I think you're misunderstanding what a "service" is. The word is, perhaps, a bit too lofty for its humble job. A service is just an event notification mechanism that tells you when it's time to start up, and when it's time to shut down. I can understand why it would be attractive to misunderstand in this way, though: IService doesn't do very much, you have requirements that it doesn't cover, and if it were the thing you understand it to be then it would cover those requirements. I'm sure that would be nicer for you :).
This might seem a bit inconsistent, since stopService uses the return of a Deferred. However, this is for a very specific reason, not a generalized error-handling case: you may need to prevent the *rest* of the system (specifically, the reactor) from completely shutting down until you've managed to cleanly shut down whatever you're trying to shut down on potentially remote systems. startService has no such problem though; the service subsystem has told you "It's time to start up!" - its job is done, and the reactor isn't going away as part of service startup, so it's your responsibility as an application author to make sure your other dependencies are properly initialized.
But if something does go wrong, you've got a failure propagating its way down an errback chain, eventually (unless an errback switches you back to the callback chain) popping out the end and causing the reactor to issue an Unhandled Error message. So you can't indicate that the service has failed to start by throwing, because the exception is going to pop harmlessly out the end of the deferred chain as a generic unhandled error and will not cause Twisted to know that the service couldn't start.
The key question here is: indicate to whom? If you want to indicate it to some other object, well, try:except: or addErrback and call a method on that object. Nothing magic about it. There is no general-purpose object in Twisted who would be interested in any and all kinds of failures. Except, of course, the logging system, which, as you say, has already been told about this.
This all feels quite ironic :-) Twisted leads you coyly into the dark and powerful world of working with and heavily depending on Deferreds. But then, right when you expect it to be there for you, covering your back, it throws up its hands as if to say "What!!? You expect me to deal with you returning a Deferred? You gotta be kidding, sucker."
This begs the question, again, of what does it mean to "deal with" returning a Deferred? Pause the service startup chain? As exarkun noted in the thread you referenced, we *can't* stop and do that in privilegedStartService, so it would be a bit asymmetric to do so in startService. In what way would you expect the service mechanism to "deal with" returning a Deferred? Stop starting other services? Print out some different log message? The options I can come up with are generally undesirable. Service order is somewhat arbitrary. If you have a debugging service (like manhole) that happens to start up after your failed startup, then you won't be able to log in and inspect your failed-to-start service if it fails. If you amend the log message in some way, chances are good that you will remove information (stack frames) that would be useful for debugging. These are equally good reasons not to pause the service startup chain in MultiService, too: one service should be able to inspect another to see why it's hung.
I could follow Moof's approach (last poster in the above thread), but that seems to just pass the problem on to a higher level, where something else is calling startService (or something earlier) and so on up until we reach the topmost point at which something is not allowing/expecting a deferred to come back. Should I track down and subclass all these things? That would seem cruel and unusual punishment to the faithful Deferred user, having to go in and subclass core classes because they don't deal with Deferreds.
Indeed. This problem is left to a higher level because it is a higher level problem. There is certainly a case to be made that the higher level should be somewhere in Twisted itself, but let's not complicate IService further. IService is a very, very simple interface. If you want to respond to failures from startService (deferred failures, exceptions, or whatever else) in a useful way, then you can write your own implementation of it which manages startup order, keeps track of dependencies, and maintains a state machine that handles stopService appropriately if called in mid-startup. I don't think that having to implement an interface with 6 methods on it could be considered "cruel and unusual". If you think so you may want to investigate options other than Twisted: you will frequently be expected to implement interfaces with methods on them ;-).
There's no need to "track down and subclass" lots of things. Your IService wants the things that it contains to have a richer interface which allows for error handling, dependencies, and propagation, so simply write a single wrapper for simpler IService objects that expands the interface to do the other things that you're interested in. This all strikes me as totally straightforward and easy, and I don't think I'm any kind of super-genius for being able to write a few Python classes that call a few simple start/stop methods in the order that I want them to run in :).
I could do something dramatic, like call reactor.stop or sys.exit in my errback chain, but those seem completely wrong. Apart from the (remote?) possibility that something other than Twisted plugin code is trying to start my service, it's also anachronistic because it will happen at some unpredictable time after startService has gotten back (and ignored) the deferred and Twisted has moved on (perhaps even to start other services).
Doing either of those things would definitely be wrong. There's no reason to sys.exit or reactor.stop if your application can't start up, unless your management system specifically calls for such a thing. In the future, even the Twisted plugin code might be starting some things in addition to your application. As I mentioned above, a good reason to do that is to perform diagnostics on failed startups :).
participants (2): glyph@divmod.com, Terry Jones