On Thu, Jun 13, 2013 at 5:14 PM, Glyph <
glyph@twistedmatrix.com> wrote:
Hello everybody. Today I'd like to talk about everyone's favorite subsystem within Twisted: logging.
There are a number of problems with Twisted's logging system.
Thanks for writing up this email. I'd like to help out with this effort.
For a long time, there's been consensus among at least a few core Twisted developers that log messages ought to be specified in terms of a format message and several relevant keyword arguments, so that tools can extract the values of the keyword arguments without parsing ad-hoc messages or using regexes. In other words, like this:
log.msg(format="Foo happened. Relevant information includes: "
"bar=%(bar)s baz=%(baz)s boz=%(boz)s",
bar=bar, baz=baz, boz=boz)
Basically, the conclusion I've come to is that English isn't very
useful in logs. It's usually badly written, uninformative, and hard to
filter on. The main thing that I find really important is having what
I call an "event name", which has the following properties:
So, I confess that one of my long-term goals here is to de-emphasize the english-language log file and eventually get Twisted using a structured log file as standard. However, to just do that immediately would probably break everybody's workflow.
1. human readable
2. computer readable
3. mandatory
4. unique across a codebase, and specific to the physical logging statement that it's being passed to.
I agree with some of these properties and disagree with others, but you didn't really say *why* these properties are useful. So let me see if I can explain the properties I think are good, starting with the use-case rather than the other way 'round:
There are several activities that one performs with logging:
- while one's writing the code, inserting a log message
- while one's administering a system, reading the history of all logged messages
- while one's monitoring a system, selecting criteria for future relevant log messages to alert on
- while one's debugging a system, selecting criteria for past relevant log messages to display
- while one's documenting a system, explaining what various log messages mean and how they may be used for the other use-cases
English (by which I mean "localizable human-readable text") is useful for use-cases 1. and 2. When you're writing the log message, selecting a unique identifier is tedious; you just want to type in what happened and move on. When you're reading the messages, it's easier to puzzle out some english text than to go to the documentation for each and every message.
Globally unique identifiers are useful for use-cases 3 and 4. Filtering messages by random snippets of English text is tedious, error-prone and fragile. (It doesn't help if a message is unique "across a codebase" if that codebase is a library and it conflicts with another library that an application uses.)
Stable identifiers are particularly useful for use-cases 3 and 5. While human-readability is useful for all cases, that generally conflicts with the long-term stability of an identifier. If you write a monitoring system that looks at log messages.
I think that we can satisfy all of these requirements by expanding the notion of an "identifier" a bit.
In the interests of not spending ten hours working on each message to this list, I'll leave this one on a cliffhanger, so you can eagerly await the next episode!
So, for example:
def frag_message_received(self, fragger, fragee, weapon):
self.log("frag", fragger=fragger, fraggee=fragee, weapon=weapon)
...
Now, your smart log filterer can just filter by event="frag" instead
of message="Frag for \w+ received with weapon \w+" or whatever.
This is just how I've done things in practice, and it has worked very
well for fairly large projects, but there are improvements that I can
think of. For example, the event name should probably be based on a
hierarchy of composable logger objects (kinda like the one David Reid
wrote) so that it's easier to maintain uniqueness in a large codebase.
(Hey David, you should write an email about that logging class :)