[Python-Dev] transitioning from % to {} formatting

Vinay Sajip vinay_sajip at yahoo.co.uk
Thu Oct 1 01:03:16 CEST 2009


Steven Bethard <steven.bethard <at> gmail.com> writes:

> There's a lot of code already out there (in the standard library and
> other places) that uses %-style formatting, when in Python 3.0 we
> should be encouraging {}-style formatting. We should really provide
> some sort of transition plan. Consider an example from the logging
> docs:
> 
> logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
> 
> We'd like to support both this style as well as the following style:
> 
> logging.Formatter("{asctime} - {name} - {levelname} - {message}")
> 

In logging at least, there are two different places where the formatting issue
crops up.

The first is creating the "message" part of the logging event, which is
made up of a format string and arguments.

The second is the one Steven's mentioned: formatting the message along with
other event data such as time of occurrence, level, logger name etc. into the
final text which is output.

Support for both % and {} forms in logging would need to be considered in
these two places. I sort of liked Martin's proposal about using different
keyword arguments, but it has problems: "dicttemplate" is ugly, "fmt" is
already used in Formatter.__init__ as a keyword argument, and two different
keyword arguments "fmt" and "format" that both refer to format strings might
confuse some users.

Benjamin's suggestion of providing a flag to Formatter seems slightly better,
as it doesn't change what existing positional or keyword parameters do, and
just adds an additional, optional parameter which can start off with a default
of False and transition to a default of True.
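
A minimal sketch of what such a flag might look like. FlagFormatter and its
use_braces parameter are hypothetical names invented for illustration, not part
of logging, and the sketch ignores exception text for brevity:

```python
import logging


class FlagFormatter(logging.Formatter):
    """Hypothetical Formatter with an optional use_braces flag."""

    def __init__(self, fmt="%(message)s", datefmt=None, use_braces=False):
        # The base class is given its default layout; the layout string is
        # stored here so the sketch does not rely on base-class validation.
        logging.Formatter.__init__(self, None, datefmt)
        self._layout = fmt
        self._use_braces = use_braces

    def format(self, record):
        # Build the "message" part first, then lay out the whole event.
        record.message = record.getMessage()
        record.asctime = self.formatTime(record, self.datefmt)
        if self._use_braces:
            return self._layout.format(**record.__dict__)
        return self._layout % record.__dict__


# Existing callers are unaffected; new callers opt in with the flag.
old_style = FlagFormatter("%(name)s - %(levelname)s - %(message)s")
new_style = FlagFormatter("{name} - {levelname} - {message}", use_braces=True)
```

Note that the flag only selects how the layout string is interpreted; the
message itself is still produced by LogRecord.getMessage.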

However, AFAICT these approaches only cover the second area where formatting
options are chosen - not the creation of the message from the parameters passed
to the logging call itself. 

Of course one can pass arbitrary objects as messages which contain their own
formatting logic. This has been possible since the very first release but I'm
not sure that it's widely used, as it's usually easier to pass strings. So
instead of passing a string and arguments such as

logger.debug("The %s is %d", "answer", 42)

one can currently pass, for a fictitious class PercentMessage,

logger.debug(PercentMessage("The %s is %d", "answer", 42))

and when the time comes to obtain the formatted message, LogRecord.getMessage
calls str() on the PercentMessage instance, whose __str__ will use %-formatting
to get the actual message.

Similarly, one can write

logger.debug(BraceMessage("The {} is {}", "answer", 42))

where the __str__() method on the BraceMessage will do {} formatting.

I'm not suggesting we actually use the names PercentMessage and
BraceMessage; I've just used them here for clarity.
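
A minimal sketch of what these two fictitious classes could look like, assuming
nothing beyond a constructor that stores the format string and arguments and a
__str__ that applies them (the attribute names are illustrative):

```python
import logging


class PercentMessage(object):
    """Defers %-formatting until the message is converted to a string."""

    def __init__(self, fmt, *args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt % self.args


class BraceMessage(object):
    """Defers {}-formatting until the message is converted to a string."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)


# No arguments are passed to the logging call itself, so getMessage
# only calls str() on the message object.
logger = logging.getLogger("example")
logger.debug(PercentMessage("The %s is %d", "answer", 42))
logger.debug(BraceMessage("The {} is {}", "answer", 42))
```

Because formatting happens in __str__, nothing is formatted if the event is
filtered out before a handler asks for the message.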

Also, although Raymond has pointed out that it seems likely that no one ever
needs *both* types of format string, what about the case where application A
depends on libraries B and C, and they don't all share the same preferences
regarding which format style to use? No one has brought this up yet, but ISTM
it's a real issue. It would certainly appear to preclude any approach that
configures a logging-wide or logger-wide flag to determine how to interpret
the format string.

Another potential issue is where logging events are pickled and sent over
sockets to be finally formatted and output on different machines. What if a
sending machine has a recent version of Python, which supports {} formatting,
but a receiving machine doesn't? It seems that at the very least, it would
require a change to SocketHandler and DatagramHandler to format the "message"
part into the LogRecord before pickling and sending. While making this change
is simple, it represents a potential backwards-incompatible problem for users
who have defined their own handlers for doing something similar.
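
A rough sketch of the "format before pickling" idea, assuming a hypothetical
helper named pickle_preformatted (it is not a logging API; a handler would do
something similar internally). The receiver then needs no knowledge of the
sender's format style:

```python
import logging
import pickle


def pickle_preformatted(record):
    """Pickle a LogRecord's attributes with the message already formatted."""
    d = dict(record.__dict__)
    # Merge the format string and arguments on the sending side.
    d["msg"] = record.getMessage()
    d["args"] = None
    # Traceback objects cannot be pickled, so exception info is dropped.
    d["exc_info"] = None
    return pickle.dumps(d, 1)


def unpickle_record(payload):
    """Rebuild a LogRecord on the receiving side."""
    return logging.makeLogRecord(pickle.loads(payload))
```

With this arrangement, getMessage on the receiving machine just returns the
already-formatted string, whichever style the sender used.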

Apart from thinking through the above issues, the actual formatting only
happens in two locations - LogRecord.getMessage and Formatter.format - so
making the code do either %- or {} formatting would be simple, as long as it
knows which of % and {} to pick.

Does it seem too onerous to expect people to pass an additional "use_format"
keyword argument with every logging call to indicate how to interpret the
message format string? Or does the PercentMessage/BraceMessage type approach
have any mileage? What do y'all think?

Regards,

Vinay Sajip



