[stdlib-sig] logging

Fri Sep 18 17:59:35 CEST 2009

Jesse Noller <jnoller at ...> writes:

> I've kept my mouth shut (well, on this subject) simply due to the fact
> I tend to feel API design is a bit of a "smell" thing. First off;
> thank you for the package. As much as I might not like the API - I use
> the logging package an *insane* amount. I also know you're responsive,
> and you care strongly about it. I myself might come off as slightly
> defensive for my own module (multiprocessing) if someone were to just
> say "lol it sux hahaha" (in fact, I have).
> 
> You are right; it is flexible, and it is meant for a wide-range of use
> cases, and when you really "get it" it can be wildly powerful (I think
> David Beazley wrote a twitter handler!).

It was only a matter of time. Next, a Google Wave handler. ;-)

> However, my particular gut feeling when dealing with it stems from
> something I can't quite communicate properly. I *want* something
> "simpler" - for example, something which logs messages at a certain
> level to stderr (but not stdout) and stdout to stdout (but not stderr)
> - but also has a file logger.
> 
> I don't want to have to write more than a few lines of code to do this
> - in *my* mind this is something so fundamental to
> unixy-scripts/daemons that it should be as simple as:
> 
> import logging
> 
> log = logging.get_log('mylog')
> log.warning('hay guys')
> 
> What I end up doing in most of my projects (sorry, not public ones) is
> wrapping this in a "jesse.log" module that offers that API. The user
> does not see the complexity of the underlying logging module's APIs.
> In fact, I have a nasty tendency to create one "log" object which also
> has a fair amount of the logging module's API pushed into it, e.g.:
> 
>     from jesse import log
> 
>     log = log()
>     log.critical('yay!')
>     log.set_level(log.WARNING)  # I loathe BouncyNames
>     log.add_handler(log.file_handler(level=log.CRITICAL))

One reason why people think logging is complicated is that they typically
have to contend with two things - loggers *and* handlers - whereas they have
perhaps in the past just used a logger, which either logs to file, or to console, or
whatever. For simple applications, this might work. However, for more
sophisticated requirements, it really helps to keep the two concepts
separate: "Where did it happen?" (the logger) and "Who wants to know?" (the
handler). In my experience, these are in general not mapped one-to-one. In a
typical enterprise use case, a critical error might page an on-call member of
the support team, write a detailed message to a log, print a simplified message
to console, and email the incident to a developer team mailbox.
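
As a rough sketch of that separation, here is one way the request quoted above (stderr for warnings and above, stdout for everything else, plus a file) might be met with one logger and three handlers. The names BelowLevel and configure are made up for illustration and are not part of the logging package; everything they call is.

```python
import logging
import sys


class BelowLevel(logging.Filter):
    """Hypothetical helper: pass only records strictly below a level."""

    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno < self.level


def configure(logger_name, log_path):
    """Attach three handlers (stdout, stderr, file) to one logger."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # Records below WARNING go to stdout only.
    out = logging.StreamHandler(sys.stdout)
    out.addFilter(BelowLevel(logging.WARNING))

    # WARNING and above go to stderr only.
    err = logging.StreamHandler(sys.stderr)
    err.setLevel(logging.WARNING)

    # Everything goes to the file, with a more detailed format.
    to_file = logging.FileHandler(log_path)
    to_file.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )

    for handler in (out, err, to_file):
        logger.addHandler(handler)
    return logger
```

The calling code only ever sees the logger: after configure('myapp', 'myapp.log'), a call such as logger.info(...) or logger.error(...) is routed by the handlers, and the routing can change without touching the call sites.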

The logger is the domain of the developer. The handler is the domain of the
deployer. In many scenarios, of course, they are the same constituency - but
in many other scenarios, it's not so.

On the BouncyName thing - yes I know it's a religious thing for some people.
I've worked in Python, Java and C# amongst others and I've seen it all. My
cardinal rule is to fit in with whatever's already there, and although PEP 8
mandates names with no bounce, the logging package had already been released
and was in use before being proposed for inclusion in Python, and was using
camelCaseNames. Furthermore, the stdlib didn't at the time religiously stick
to PEP 8 in this regard. As I'm agnostic on this point, I've no problem in
providing all the method names using unix_style_underscore_notation, though
of course the other names will still be there for backwards compatibility :-)

Where you say "smell", I say "taste". It has a more positive connotation, but
means more or less the same thing in this context. For instance, I would not
naturally use "log" to name a logger, as in my mind, a logger is something
that facilitates getting some information into a log, and the log is the
final substrate for that information - the file, the console screen, the
email or whatever. So IMO there are elements of personal taste which we just
have to pass over when considering API design in general and naming in
particular, and when working with others who may not think exactly as we do.

I mentioned that the handler is the domain of the deployer, by which I mean
that using hard-coded set_level and add_handler calls in code is not always
the right thing to do. (To be sure, it is the right thing to do on many
occasions.) Of course your simple basic example is doable as

import logging, sys

logging.basicConfig(level=logging.DEBUG, stream=sys.stderr)
logger = logging.getLogger('myapp')
logger.warning('hay guys')

which is only one line longer than your snippet - but in general, it's better
if the verbosity of logging in an application can be turned up and down
without having to change the application. It's not uncommon for people to use
configuration files to configure levels and handlers, and in the case of
long-running daemon processes this can even be done without stopping the
process.

The Python logging package's configuration format uses ConfigParser, which
doesn't float everybody's boat. If it is a cause of extreme hatred or even
mild distaste, you can always come up with your own configuration file format
which is more to your liking, and configure programmatically from that.
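
For instance, a home-grown format could be as small as the sketch below. The JSON layout, STREAMS and configure_from_json are all invented for illustration; only the getLogger, StreamHandler, Formatter, setLevel and addHandler calls come from the logging package.

```python
import json
import logging
import sys

# Hypothetical format: one console stream, one format string, and a
# mapping of logger name to level name ("" stands for the root logger).
CONFIG_JSON = """
{
    "stream": "stderr",
    "format": "%(levelname)s %(name)s: %(message)s",
    "levels": {"": "WARNING", "myapp.db": "DEBUG"}
}
"""

STREAMS = {"stdout": sys.stdout, "stderr": sys.stderr}


def configure_from_json(text):
    """Apply a JSON description using only the public logging API."""
    spec = json.loads(text)
    handler = logging.StreamHandler(STREAMS[spec["stream"]])
    handler.setFormatter(logging.Formatter(spec["format"]))
    logging.getLogger().addHandler(handler)
    for name, level_name in spec["levels"].items():
        # setLevel accepts a level name as well as a number.
        logging.getLogger(name).setLevel(level_name)
    return handler
```

Calling configure_from_json(CONFIG_JSON) at start-up keeps levels and destinations out of the application code, which is the point, whatever the file syntax.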

Why ConfigParser? one may ask. I don't believe that introducing my own,
different, ad-hoc configuration file format would have been the right thing
to do. TOOWTDI, and all that. Also, at the time, it was important not to add
too many APIs to the system, using the guiding principle of YAGNI (You Aren't
Gonna Need It), particularly if they were things that would perhaps be more
open to subjective judgements. Feel free to search the python-dev mail
archives for YAGNI in the late 2001-early 2002 timeframe. If I come up with
a simple easy-to-use wrapper that you like, then it's quite possible that
some other smart-and-opinionated developer will complain that it doesn't do
it for him/her. So - if it's just five lines of code in a project, even if
it's for every project, then it's hardly worth secreting much bile over, is
it? (I'm not saying *you* are.)

There are also some archaic constructs which result from trying to provide
Python 1.5.2 compatibility. At the time, there was a good reason for this -
most Linux distros were shipping with 1.5.2, although 2.2 had been around for
some time. I felt at the time that it was important to support developers and
sysadmins who would have had to use the system's Python rather than upgrading
to 2.3. Of course the landscape has now changed considerably, and I am not of
the opinion that 1.5.2 support is still important. The archaic relics will
disappear over time, no doubt.

> The other aspect of this is my experience trying to explain logging to
> people who have never dealt with that module before. Recently, I was
> trying to explain it to someone who has limited python knowledge, and
> really just wanted something like what I describe above. They read the
> docs, re-read the docs, re-re-re-read the docs, and still came to me
> and said "how on earth do I do this?!". The API doesn't (and this
> again, is a smell thing) feel python-y - it feels very Java like
> (having experienced log4j, I can say it really does feel like it). I
> think this trips newbies up quite a bit.

I don't know why, particularly, apart from the conflating-handlers-and-loggers
thing. But, as I explained above, that separation of concerns is there for a
reason. I'm well aware of the Java predilection for over-complicating things,
and I have not drunk that particular Kool-Aid.

I don't blog, but following a recent discussion with Glyph Lefkowitz of
Twisted Matrix, I created a blog where I will put information related to
Python logging which perhaps doesn't belong in the documentation. The first
post is, reasonably enough, entitled "Python Logging 101" and is available at

http://plumberjack.blogspot.com/2009/09/python-logging-101.html

Perhaps you could take a look at it. It really is a bit of a 101 in terms of
going from first principles; it's distilled from an article draft I had
prepared for DDJ back in 2002, but which never got accepted :-( [In those
days, DDJ was a prestigious print magazine, not vendor-aligned, rather than a
website-with-mailshots which is hard to distinguish from a Microsoft
tentacle.] It's only a five-minute-or-so read, and perhaps might explain why
logging's broad design is as it is.

> Part of the newb-to-not-newb transition would be helped by possibly
> simplifying the docs (something I am *still* working on for
> multiprocessing) - the examples can be like drinking from a fire hose.

Well the basic example is

import logging

logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')

which is hardly challenging, and it builds up from there.
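
One detail worth knowing about that snippet: with no configuration, the root logger's threshold is WARNING, so only the last of the three calls produces output. A sketch of the next step up, assuming a format string and a logger name ('myapp.storage') chosen purely for illustration:

```python
import logging
import sys

# basicConfig attaches one StreamHandler to the root logger and sets the
# root level; it does nothing if the root logger already has handlers.
logging.basicConfig(
    level=logging.DEBUG,
    stream=sys.stderr,
    format="%(asctime)s %(levelname)-8s %(name)s: %(message)s",
)

logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')

# Named loggers pass their records up to the root logger's handlers,
# so individual modules need no setup of their own.
logger = logging.getLogger('myapp.storage')
logger.info('Opened the data file')
```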

> Doug Hellmann - the author of the Python Module of the Week is an
> *excellent* doc writer (see his logging write up here:
> http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging)
> and might be willing to help, give pointers, what have you.

I'll definitely take it up with Doug. I believe the current introductory
documentation on logging owes something to Doug's PyMOTW entry for
logging, but I'm not sure, as I didn't write that material.
(I can't remember who did, but it was in response to "logging is
complicated" feedback.)

> Like I said at the start - this is all a "smell" thing, and it
> obviously varies from person to person. This is fundamentally why I
> was interested in encouraging Mishok and others to put concrete
> ideas together (I would be interested in seeing an
> alternative implementation as a thought exercise). I know others
> besides me have written little wrappers around logging, for example:
> 
> http://pypi.python.org/pypi/easylog/
> http://pypi.python.org/pypi/autolog/
> http://pypi.python.org/pypi/sensible/
> 
> Perhaps that's a good place to start - higher level
> functions/methods/etc to "scale down" logging's perceived complexity? I
> know I'm trying to do bits of that for multiprocessing.

I'm not sure how much traction they've got. Anybody know? The very fact that
there are three seems to indicate that this is an area where subjective
judgements play a part. So, people can pick whichever one they want, and off
they go. If I incorporated one of their versions into the stdlib, presumably
the others would still be around, along with your variation on the same theme.

> Then of course, there's time for something completely different ;)
> 
> http://code.zacharyvoase.com/lumberjack/src/
> 

Is Zachary actually serious about this as an alternative to Python logging?
(I'm not dissing it - just asking if he is seriously committed to getting it
to have the same level of functionality.) From the screencast, I got the
impression (perhaps mistakenly) that it was motivated at least in part by "Oh
look! Coroutines! Shiny new Python toys. Let's see what we can do with them!"
After an initial flurry of work on it, things have gone quiet over the last
four weeks. Perhaps he's not pushed his changes to BitBucket because he's
still working on them.

Coroutines are undeniably nice for some things - but I can't say I see a
no-brainer fit with logging. I know coroutines are new in Python but I've
seen them come and go in popularity a couple of times in different
environments.

Regards,

Vinay Sajip