String Format Callable Flag (Was: Efficient Debug Logging)

In the debug logging thread, somebody suggested the "%r" format specifier and a custom __repr__. This is a neat solution because Python logging already includes a "delayed" evaluation of sorts: it formats the logging string *after* it determines that the message's log level is greater than or equal to the logger's level. However, wrapping up every expensive debugging computation in a new class might get tedious, so here's a strawman proposal for something lighter weight. Python has two string formatting mini-languages. Both allow formatting flags, for example in "%03d", the "0" (zero) is a flag that means to pad with leading zeroes. I propose to add a string format flag to both mini-languages that explicitly says, "this argument is callable, and its *return value* should be formatted." In current Python: >>> expensive = lambda: 2 >>> logging.debug('%03d %03d', 1, expensive()) In this case, the "expensive" code is executed even though nothing is logged. >>> expensive = lambda: 2 >>> logging.root.setLevel(logging.DEBUG) >>> logging.debug('%03d %03d', 1, expensive()) DEBUG:root:001 002 With a different log level, the expensive code is executed and the message is logged. The suggested change is to add a "C" flag to the formatting language that indicates an argument is callable. When formatting the string, the argument will be called and its result will be used for the formatting operation. The C flag must be handled before any other flags, regardless of the order in which they are specified. The callable's return value is then formatted according to the rest of the specifier. With this change, the above example can now be written as: >>> expensive = lambda: 2 >>> logging.debug('%03d %C03d', 1, expensive) The "expensive" code is not executed at this log level. >>> expensive = lambda: 2 >>> logging.root.setLevel(logging.DEBUG) >>> logging.warn('%03d %C03d', 1, expensive) WARNING:root:001 002 The expensive code is *only* executed when the message will be logged. The callable's return value is formatted with the specifier '%03d', i.e. the same specifier as the callable but without the "C" flag. Pros: 1) Much smaller change to the language. 2) Simplifies a common (?) problem with logging. 3) Applies to all string formatting, not just logging. (But I'm not sure what the other use cases are.) Cons: 1) Doesn't generalize as well as the various "delayed" proposals. (But I'm not sure what other use cases "delayed" has.)

On 2017-02-17 16:37, Mark E. Haase wrote:
A letter usually marks the end of the format, e.g. '%03d', so '%C03d' could be read as a format '%C' (currently undefined) followed by '03d'. I'd suggest using some punctuation character instead. It would also be a good idea for the 'new-style' format strings to use the same character.

On Fri, Feb 17, 2017 at 11:37:04AM -0500, Mark E. Haase wrote:
Python has two string formatting mini-languages.
Four. % string formatting, .format method, f-strings, string.Template strings. But who's counting? :-)
The % string format codes are intentionally limited to more-or-less the similar codes available in C, and won't be given any newer functionality. It's really only f-strings and .format method that this change could be applied to. [...]
I see three problems: (1) It will be a bug magnet. People will accidently write logging.debug('%03d %C03d', 1, expensive()) and then not only will their code still be slow, but they'll have to debug mysterious TypeError: 'int' object is not callable exceptions, but only *sometimes*. Most insideously, these Heisenbugs will only occur when they turn the log level all the way up to debugging, which will crash their program *before* logging the error! (2) It requires the expensive calculation to be wrapped in a lambda: logging.debug('%03d %C03d', 1, lambda: nth_prime(10**8) + 1) which I guess is kind of Python's way of spelling a thunk, but people don't like using lambda for that. (3) It is useless for delaying evaluation of something that isn't going to be converted into a string. -- Steven

On 2/18/2017 2:25 AM, Steven D'Aprano wrote:
Technical correctness is the best kind! But string.Template has no formatting language, and str.format and f-strings are identical as far as what they do with format specifiers (by using the underlying .__format__() machinery). [0]
I agree it would be difficult to add this to %-formatting, which is unfortunate. The most often cited need for this feature is for logging. Maybe someone smarter than me could devise an interface to logging which supports .format() formatting, and we could add the callable flag to .format(). Maybe return a different type of logger from logging.getLogger that supports .format?
It's really only f-strings and .format method that this change could be applied to.
I'd suggest adding an explicit conversion flag to the .format() string parser. You'll recall the existing flags are !a, !s, and !r. We could add a !c which calls its argument (with zero parameters) before formatting it. For example: '{!c:%Y-%m-%d}'.format(datetime.datetime.now) '2017-02-18' This would have the advantage that the format specifier would apply to the result of the callable. If logging were made smart enough (or a new interface added), then you could write: logger.info('{!c:%Y-%m-%d} error at {}', datetime.datetime.now, lineno) Then .now() would never be called unless the logging module needed the value. (I realize .now() isn't a great example for calling a function in the future. Substitute any expensive function.)
I think this is lessened with my proposal to use !c, but it's definitely still an issue. Test your logging!
This doesn't bother me so much.
(3) It is useless for delaying evaluation of something that isn't going to be converted into a string.
True. But that's the most common use case I see presented, and it's much easier to reason about and implement than some of the discussions going on in other threads. Just for completeness, I'd add !c to the f-string parser. Like !s, !r, and !a, it would not be needed [1]. Why write: f'{!c:function}' when you could write: f'{function()}' But, it can't hurt to be consistent. Eric. [0]: there is one slight difference in how str.format and f-strings handle expressions, but their format specifiers are identical https://www.python.org/dev/peps/pep-0498/#differences-between-f-string-and-s... [1]: https://www.python.org/dev/peps/pep-0498/#id45

On 2/19/17, Eric V. Smith <eric@trueblade.com> wrote:
On 2/18/2017 2:25 AM, Steven D'Aprano wrote:
1. I think that error message could be understandable at least as this: '{:g}'.format('a') ValueError: Unknown format code 'g' for object of type 'str' with something like isinstance(arg, callable) 2. Static checker could check bracket presence if there is !c format specifier and fire warning. 3. There could be another problem if return type is callable too...

On 2/19/2017 4:23 AM, Pavol Lisy wrote:
Sure, there would be a reasonable error message. The concern (as with logging anything), is that the error case logging is typically poorly tested.
2. Static checker could check bracket presence if there is !c format specifier and fire warning.
But it's valid to have a function that returns a callable, so these could be false positives. But linters can do whatever they want, I guess.
3. There could be another problem if return type is callable too...
I don't think that's a problem. You'd just print the callable, as you can do today:

On 2017-02-17 16:37, Mark E. Haase wrote:
A letter usually marks the end of the format, e.g. '%03d', so '%C03d' could be read as a format '%C' (currently undefined) followed by '03d'. I'd suggest using some punctuation character instead. It would also be a good idea for the 'new-style' format strings to use the same character.

On Fri, Feb 17, 2017 at 11:37:04AM -0500, Mark E. Haase wrote:
Python has two string formatting mini-languages.
Four. % string formatting, .format method, f-strings, string.Template strings. But who's counting? :-)
The % string format codes are intentionally limited to more-or-less the similar codes available in C, and won't be given any newer functionality. It's really only f-strings and .format method that this change could be applied to. [...]
I see three problems: (1) It will be a bug magnet. People will accidently write logging.debug('%03d %C03d', 1, expensive()) and then not only will their code still be slow, but they'll have to debug mysterious TypeError: 'int' object is not callable exceptions, but only *sometimes*. Most insideously, these Heisenbugs will only occur when they turn the log level all the way up to debugging, which will crash their program *before* logging the error! (2) It requires the expensive calculation to be wrapped in a lambda: logging.debug('%03d %C03d', 1, lambda: nth_prime(10**8) + 1) which I guess is kind of Python's way of spelling a thunk, but people don't like using lambda for that. (3) It is useless for delaying evaluation of something that isn't going to be converted into a string. -- Steven

On 2/18/2017 2:25 AM, Steven D'Aprano wrote:
Technical correctness is the best kind! But string.Template has no formatting language, and str.format and f-strings are identical as far as what they do with format specifiers (by using the underlying .__format__() machinery). [0]
I agree it would be difficult to add this to %-formatting, which is unfortunate. The most often cited need for this feature is for logging. Maybe someone smarter than me could devise an interface to logging which supports .format() formatting, and we could add the callable flag to .format(). Maybe return a different type of logger from logging.getLogger that supports .format?
It's really only f-strings and .format method that this change could be applied to.
I'd suggest adding an explicit conversion flag to the .format() string parser. You'll recall the existing flags are !a, !s, and !r. We could add a !c which calls its argument (with zero parameters) before formatting it. For example: '{!c:%Y-%m-%d}'.format(datetime.datetime.now) '2017-02-18' This would have the advantage that the format specifier would apply to the result of the callable. If logging were made smart enough (or a new interface added), then you could write: logger.info('{!c:%Y-%m-%d} error at {}', datetime.datetime.now, lineno) Then .now() would never be called unless the logging module needed the value. (I realize .now() isn't a great example for calling a function in the future. Substitute any expensive function.)
I think this is lessened with my proposal to use !c, but it's definitely still an issue. Test your logging!
This doesn't bother me so much.
(3) It is useless for delaying evaluation of something that isn't going to be converted into a string.
True. But that's the most common use case I see presented, and it's much easier to reason about and implement than some of the discussions going on in other threads. Just for completeness, I'd add !c to the f-string parser. Like !s, !r, and !a, it would not be needed [1]. Why write: f'{!c:function}' when you could write: f'{function()}' But, it can't hurt to be consistent. Eric. [0]: there is one slight difference in how str.format and f-strings handle expressions, but their format specifiers are identical https://www.python.org/dev/peps/pep-0498/#differences-between-f-string-and-s... [1]: https://www.python.org/dev/peps/pep-0498/#id45

On 2/19/17, Eric V. Smith <eric@trueblade.com> wrote:
On 2/18/2017 2:25 AM, Steven D'Aprano wrote:
1. I think that error message could be understandable at least as this: '{:g}'.format('a') ValueError: Unknown format code 'g' for object of type 'str' with something like isinstance(arg, callable) 2. Static checker could check bracket presence if there is !c format specifier and fire warning. 3. There could be another problem if return type is callable too...

On 2/19/2017 4:23 AM, Pavol Lisy wrote:
Sure, there would be a reasonable error message. The concern (as with logging anything), is that the error case logging is typically poorly tested.
2. Static checker could check bracket presence if there is !c format specifier and fire warning.
But it's valid to have a function that returns a callable, so these could be false positives. But linters can do whatever they want, I guess.
3. There could be another problem if return type is callable too...
I don't think that's a problem. You'd just print the callable, as you can do today:
participants (5)
-
Eric V. Smith
-
Mark E. Haase
-
MRAB
-
Pavol Lisy
-
Steven D'Aprano