[Twisted-Python] Twisted, twistd and logging

Hi, all. I'm working on a new system to be deployed in our production environment, and having already decided on using Twisted, I'm now trying to figure out how to cleanly and reliably deploy it. I have many questions, but right now I just want to work through the issue of logging. In our existing systems, our code writes messages to syslog, with 'facility' set to LOG_LOCAL0 and 'priority' set according to the severity of the event - debug, info, warn or alert. Anything logged with 'alert' severity causes our monitoring to contact the operator on duty, so they can look into it. For the sake of our operators and system administrators, I'd prefer to use as much of the existing monitoring infrastructure as possible. On the other hand, the logging code we use at the moment is crufty old home-grown code, and I'd much prefer to use a logging system maintained by somebody else. My first thought was to use the 'logging' module in the Python 2.3 standard library. It covers all the syslog functionality I want (and more), and it's pretty easy to use. Unfortunately, using Python's logging module with Twisted is hard for two reasons: twistd and trial. Both go to great lengths to Do The Right Thing with logging, and both totally ignore the Python logging mechanism. My second thought was to use twisted.python.log, but that brings its own set of problems: 1. twisted.python.syslog isn't very... complete. For starters, it doesn't let you specify the syslog facility, and even if it did, twistd would need to be extended to support it. 2. As has been discussed in the list archives, t.p.log doesn't support log-levels. While I can see glyph's point about the arbitraryness of fixed log-levels, I really do want to distinguish between "forensic record of ordinary operation" and "Danger, Will Robinson!" I think the best way of solving my problems would be to: a) update t.p.syslog so that it supports a 'facility' parameter. b) update t.p.syslog so that log-messages with 'isError' true are logged with the LOG_ALERT priority rather than the default. c) update the Unix version of twistd to support a 'syslogFacility' command-line parameter. Would such a patch be accepted? If necessary, I could replace (b) with a syslog-based LogObserver in my own code, but I'm keen to have (a) and (c) in the Twisted code-base.

On Jun 19, 2008, at 2:21 AM, Tim Allen wrote:
My second thought was to use twisted.python.log, but that brings its own set of problems:
1. twisted.python.syslog isn't very... complete. For starters, it doesn't let you specify the syslog facility, and even if it did, twistd would need to be extended to support it.
It's true that the syslog support is very experimental at this point, but what would need to be changed about twistd to support the necessary development? I think the most likely scenario is that the standard logging code assumes logging will return immediately, but I feel like this could be worked around by having some kind of log queue that gets sent to syslog by some background thread (or better yet, a callLater pattern).
2. As has been discussed in the list archives, t.p.log doesn't support log-levels. While I can see glyph's point about the arbitraryness of fixed log-levels, I really do want to distinguish between "forensic record of ordinary operation" and "Danger, Will Robinson!"
t.p.log doesn't implicitly support log levels, but you can add arbitrary keywords to your call to log.msg. Then your observer can take care of routing log messages to the right places.
I think the best way of solving my problems would be to:
a) update t.p.syslog so that it supports a 'facility' parameter. b) update t.p.syslog so that log-messages with 'isError' true are logged with the LOG_ALERT priority rather than the default. c) update the Unix version of twistd to support a 'syslogFacility' command-line parameter.
Would such a patch be accepted? If necessary, I could replace (b) with a syslog-based LogObserver in my own code, but I'm keen to have (a) and (c) in the Twisted code-base.
I can't speak to whether such a patch would be accepted, but I know I have a couple of projects I've wanted to use syslogging for, and it would be great to have a stable solution in Twisted. However, the core dev team is fairly well inundated, so while you're working on a patch/ review, you might consider investigating another option in parallel. If you package your code separately and join the Twisted Community Code project on Launchpad, you can get your code out there and in front of eyeballs while you work on getting a patch accepted into Twisted. If you can draw a reasonable amount of user attention, this may help with both testing and review, and make it easier to get your patch into Twisted proper. -phil

Phil Christensen <phil@bubblehouse.org> wrote:
On Jun 19, 2008, at 2:21 AM, Tim Allen wrote:
My second thought was to use twisted.python.log, but that brings its own set of problems:
1. twisted.python.syslog isn't very... complete. For starters, it doesn't let you specify the syslog facility, and even if it did, twistd would need to be extended to support it.
It's true that the syslog support is very experimental at this point, but what would need to be changed about twistd to support the necessary development?
As I understood it, twistd's philosophy was that logging-configuration was a deployment issue and hence ought to be configured by tweaking twistd command-line parameters rather than pre-configuring the application to be deployed. twistd already has very simple logging configuration options in the form of "--logfile" and "--syslog"; I intended to add a "--syslog-facility=" command-line option [0]. Having skim-read ticket 638 that JP pointed to in another reply, now I'm not so sure. It's kind of difficult to for me to understand the current state of play, especially since that ticket predates the merge of seemingly-related ticket 3052, but it *seems* to be a battle between the camps of "twistd's logging is limited, it should be infinitely extensible" and "twistd's logging is limited, applications should be allowed to override it". Personally, I think I lean more toward "it should be infinitely extensible" side - but right now I don't really care about the general case of extensibility, I just want more flexible syslog support. At any rate, ticket 638 and other related tickets seem to be high-level conceptual arguments, and I don't feel I have the domain-knowledge required to comment intelligibly on them.
I think the most likely scenario is that the standard logging code assumes logging will return immediately, but I feel like this could be worked around by having some kind of log queue that gets sent to syslog by some background thread (or better yet, a callLater pattern).
I'm not sure what you're getting at here. Python's syslog module calls syslog(3), which talks to syslogd(8). I don't know whether either or any of those block waiting for the log message to be delivered; I would hazard a guess that they don't.
I think the best way of solving my problems would be to:
a) update t.p.syslog so that it supports a 'facility' parameter. b) update t.p.syslog so that log-messages with 'isError' true are logged with the LOG_ALERT priority rather than the default. c) update the Unix version of twistd to support a 'syslogFacility' command-line parameter.
Would such a patch be accepted? If necessary, I could replace (b) with a syslog-based LogObserver in my own code, but I'm keen to have (a) and (c) in the Twisted code-base.
I can't speak to whether such a patch would be accepted, but I know I have a couple of projects I've wanted to use syslogging for, and it would be great to have a stable solution in Twisted. However, the core dev team is fairly well inundated, so while you're working on a patch/ review, you might consider investigating another option in parallel.
If you package your code separately and join the Twisted Community Code project on Launchpad, you can get your code out there and in front of eyeballs while you work on getting a patch accepted into Twisted. If you can draw a reasonable amount of user attention, this may help with both testing and review, and make it easier to get your patch into Twisted proper.
Thanks for the advice - I'm not sure that patches to twistd could be packaged separately, but I'll keep the possibility in mind in case I think of anything. Tim Allen [0] I'm not familiar with t.p.usage; if it were possible to support syntax like "--syslog=LOG_LOCAL0" while still supporting the "--syslog" syntax in existing releases, that'd be ideal.

On 02:13 am, tim@commsecure.com.au wrote:
Having skim-read ticket 638 that JP pointed to in another reply, now I'm not so sure. It's kind of difficult to for me to understand the current state of play, especially since that ticket predates the merge of seemingly-related ticket 3052, but it *seems* to be a battle between the camps of "twistd's logging is limited, it should be infinitely extensible" and "twistd's logging is limited, applications should be allowed to override it". Personally, I think I lean more toward "it should be infinitely extensible" side - but right now I don't really care about the general case of extensibility, I just want more flexible syslog support.
Well, let me first say that your understanding of the issues here is remarkably deep and nuanced, despite your modesty :). I hope that you'll stick around to help us take the whole logging system in Twisted to the next level... I don't believe there are actually different "camps" here, and there isn't a lot of disagreement. At least, I believe all of the core developers have essentially the same perspective, with differences in emphasis, not content. There are two systems here. Twisted's logging (i.e. twisted.python.log), and twistd's logging (twisted.internet._twistd_unix, twisted.internet._twistw, twisted.python.syslog, among other things). Twisted's logging, while it could certainly be improved, is in reasonable shape. It's extremely simple, and there is a clear and (mostly) well-specified "do anything you want" integration point: addObserver. While there are some tweaks I'd like to see here, I don't think it's what we're talking about. By contrast, twistd's logging, as the module names I mentioned above may indicate, is a squirrely mess of private and undocumented code, ad-hoc relationships, and so on. The ticket which I believe most clearly explains how to fix this issue is #2751. The question of "how do we tell where twistd thinks the logs go" is really just a tiny microcosm of "how do we tell what twistd thinks is going on around here". There is no clear, uniform surface presented to affect twistd's configuration from anywhere, not even internally; let alone in user code. The problem with the way you phrased the disagreement is the extreme vagueness of the word "application". It really just means "some code". The reason you have picked up on opposition to the "applications should be allowed to override it" idea is that there is a tendency, when creating applications, to customize every aspect of the framework just because you can; to create a monolithic, non-cooperative chunk of code that just has a mainpoint and sets everything up for you. An application, as in some code that uses Twisted to provide a network service, should be written in such a way that it is relatively agnostic about logging. Some users might want to set it up to use syslog, some might want to get it in the logging module; some might want something really weird like Growl notifications or dbus messages. Ergo, applications should not generally touch logging, no matter what our logging APIs look like. This is what I'm talking about with the "it invites abuse" comment near the bottom of #638. However, some "applications" *are* logging. For example, if you wanted to do write some crazy log monitor that integrates with twistd process, you need two things: the "application" needs to allow the framework to handle logging, not muck with it itself (i.e. the thing I just said) and the framework needs to provide an interface for *your* application (i.e. the log monitor) to get in there and call some nice, well-documented APIs to reconfigure the way twistd has set up its logging. The first application that needs to do this is twistd itself. The internal APIs of twistd need to be organized such that it is easy to maintain and extend - for example, so that you could do the extension that you are proposing with syslog. Once these APIs are tested, documented, and clearly factored, adding a plugin system to allow users to replace components of the logging system should be relatively straightforward. Work is underway to do all of this; many of the tickets that we're talking about here have already have work underway.
At any rate, ticket 638 and other related tickets seem to be high-level conceptual arguments, and I don't feel I have the domain-knowledge required to comment intelligibly on them.
I hope that this summary has improved your understanding somewhat. Don't let the depth of the issues here scare you; the best way to resolve major architectural headaches like this is one step at a time, responding to one specific use-case at a time. So, I would recommend that you file a ticket if you haven't already, and simply add syslog facility support to twistd by adding a --syslog-facility command-line option (no, usage.Options won't let you have --foo and --foo=bar options at the same time, sorry...) and a _syslogFacility attribute to the UnixAppLogger object added by the fix to #3052. You would of course need to add this to twisted.python.syslog as well, but I think the way to do that should be pretty straightforward. If you feel the need to get fancy and add more features to t.p.syslog, make the factoring of the log configuration object finer-grained, or add a plugin system for twistd for log sinks so you can write your own syslog support, feel free! However, ambitious and possibly controversial refactorings like that are a lot easier to talk about (and get done) when we can look at how they will improve the implementation of some existing feature, not for a speculative benefit. So in any event you probably want to do the simple syslog facility addition first. Also, congratulations on volunteering to maintain Twisted's log subsystem! If you would like to decline, please see David Reid about fighting a bear on your way out. ;-).

glyph@divmod.com wrote:
On 02:13 am, tim@commsecure.com.au wrote:
Having skim-read ticket 638 that JP pointed to in another reply, now I'm not so sure. It's kind of difficult to for me to understand the current state of play, especially since that ticket predates the merge of seemingly-related ticket 3052, but it *seems* to be a battle between the camps of "twistd's logging is limited, it should be infinitely extensible" and "twistd's logging is limited, applications should be allowed to override it". Personally, I think I lean more toward "it should be infinitely extensible" side - but right now I don't really care about the general case of extensibility, I just want more flexible syslog support.
Well, let me first say that your understanding of the issues here is remarkably deep and nuanced, despite your modesty :).
Thank you. :)
I don't believe there are actually different "camps" here, and there isn't a lot of disagreement. At least, I believe all of the core developers have essentially the same perspective, with differences in emphasis, not content.
There are two systems here. Twisted's logging (i.e. twisted.python.log), and twistd's logging (twisted.internet._twistd_unix, twisted.internet._twistw, twisted.python.syslog, among other things). Twisted's logging, while it could certainly be improved, is in reasonable shape. It's extremely simple, and there is a clear and (mostly) well-specified "do anything you want" integration point: addObserver. While there are some tweaks I'd like to see here, I don't think it's what we're talking about.
True; if I had to make a complaint about t.p.log, it would be that this system for arbitrary event-sources to communicate with arbitrary event-observers needs a well-defined interface to facilitate communication. Currently, the only things an event-observer can rely on are the 'isError' and 'system' fields being present, and probably 'message' and 'time'. Any observer winds up either depending on extra fields set by a specific event-source (and hoping that no other event-source happens to use the same field-names for different purposes), or having to treat every event as a raw text event - defeating the point of logging complicated events in the first place. However! That's probably an entirely separate can of worms from the actual problem I'm currently supposed to be working on, so I shall leave it there for now. :)
The ticket which I believe most clearly explains how to fix this issue is #2751. The question of "how do we tell where twistd thinks the logs go" is really just a tiny microcosm of "how do we tell what twistd thinks is going on around here". There is no clear, uniform surface presented to affect twistd's configuration from anywhere, not even internally; let alone in user code.
A kind of document-object-model of application deployment configuration, then?
The first application that needs to do this is twistd itself. The internal APIs of twistd need to be organized such that it is easy to maintain and extend - for example, so that you could do the extension that you are proposing with syslog. Once these APIs are tested, documented, and clearly factored, adding a plugin system to allow users to replace components of the logging system should be relatively straightforward.
Work is underway to do all of this; many of the tickets that we're talking about here have already have work underway.
Your most recent comment on #2571 is a lot more optimistic, it sounds like twistd has most of the stuff it needs already. If I have some free time, I might poke through the twistd code to see if I can understand the architectural issues better.
At any rate, ticket 638 and other related tickets seem to be high-level conceptual arguments, and I don't feel I have the domain-knowledge required to comment intelligibly on them.
I hope that this summary has improved your understanding somewhat. Don't let the depth of the issues here scare you; the best way to resolve major architectural headaches like this is one step at a time, responding to one specific use-case at a time. So, I would recommend that you file a ticket if you haven't already, and simply add syslog facility support to twistd by adding a --syslog-facility command-line option (no, usage.Options won't let you have --foo and --foo=bar options at the same time, sorry...) and a _syslogFacility attribute to the UnixAppLogger object added by the fix to #3052. You would of course need to add this to twisted.python.syslog as well, but I think the way to do that should be pretty straightforward.
Filed as #3300, with patches and explanations.
Also, congratulations on volunteering to maintain Twisted's log subsystem! If you would like to decline, please see David Reid about fighting a bear on your way out. ;-).
Well, we'll see how I go. ;)

On Fri, 2008-06-20 at 18:32 +1000, Tim Allen wrote:
True; if I had to make a complaint about t.p.log, it would be that this system for arbitrary event-sources to communicate with arbitrary event-observers needs a well-defined interface to facilitate communication. Currently, the only things an event-observer can rely on are the 'isError' and 'system' fields being present, and probably 'message' and 'time'. Any observer winds up either depending on extra fields set by a specific event-source (and hoping that no other event-source happens to use the same field-names for different purposes), or having to treat every event as a raw text event - defeating the point of logging complicated events in the first place.
The plan was (is) to switch to requiring a interface=IFooMessageType keyword. Some day :(

On Fri, 20 Jun 2008 09:40:25 -0400, Itamar Shtull-Trauring <itamar@itamarst.org> wrote:
On Fri, 2008-06-20 at 18:32 +1000, Tim Allen wrote:
True; if I had to make a complaint about t.p.log, it would be that this system for arbitrary event-sources to communicate with arbitrary event-observers needs a well-defined interface to facilitate communication. Currently, the only things an event-observer can rely on are the 'isError' and 'system' fields being present, and probably 'message' and 'time'. Any observer winds up either depending on extra fields set by a specific event-source (and hoping that no other event-source happens to use the same field-names for different purposes), or having to treat every event as a raw text event - defeating the point of logging complicated events in the first place.
The plan was (is) to switch to requiring a interface=IFooMessageType keyword. Some day :(
No changes to twisted.python.log are required for this. Anyone can log messages with an interface key and use that in their log observer to unambiguously interpret the rest of the keys. It might be nice if Twisted's own log events included an interface, but it's not really a very big deal. Jean-Paul

On Thu, 19 Jun 2008 16:21:20 +1000, Tim Allen <tim@commsecure.com.au> wrote:
Hi, all.
I'm working on a new system to be deployed in our production environment, and having already decided on using Twisted, I'm now trying to figure out how to cleanly and reliably deploy it. I have many questions, but right now I just want to work through the issue of logging.
There's a ticket open in the issue tracker for log observer customization, <http://twistedmatrix.com/trac/ticket/638>, which has had some work done on it. If you have time to work on this, you might direct your efforts in the same direction as that work. Jean-Paul
participants (5)
-
glyph@divmod.com
-
Itamar Shtull-Trauring
-
Jean-Paul Calderone
-
Phil Christensen
-
Tim Allen