transitioning from % to {} formatting

There's a lot of code already out there (in the standard library and other places) that uses %-style formatting, when in Python 3.0 we should be encouraging {}-style formatting. We should really provide some sort of transition plan. Consider an example from the logging docs:

    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")

We'd like to support both this style as well as the following style:

    logging.Formatter("{asctime} - {name} - {levelname} - {message}")

Perhaps we'd eventually deprecate the %-style formatting, but at least in the intervening time, we'd have to support both. I see a few possibilities:

* Add a new class, NewFormatter, which uses the {}-style formatting. This is ugly because it makes the formatting we're trying to encourage look like the alternative implementation instead of the standard one. It also means we have to come up with new names for every API that uses format strings.

* Have Formatter try to guess whether it got %-style formatting or {}-style formatting, and then delegate to the appropriate one. I don't know how accurately we can guess. We also end up still relying on both formatting styles under the hood.

* Have Formatter convert all %-style formatting strings to {}-style formatting strings (automatically). This still involves some guessing, and involves some serious hacking to translate from one to the other (maybe it wouldn't even always be possible?). But at least we'd only be using {}-style formatting under the hood.

Thoughts?

Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

Steven Bethard wrote:
I don't agree that we should do that. I see nothing wrong with using % substitution.
We should really provide some sort of transition plan.
-1.
Now, that's different. IIUC, you are not actually performing the substitution here, but only at a later place. So changing to the new formatting mechanism is an API change. I agree that the new placeholder syntax needs to be supported.
It's also ugly because the class has the word "new" in its name, which no class should ever have. In a few years, it would still be around, but not new anymore.
I don't see the point of having a converter. The tricky part, as you say, is the guessing. Whether the implementation then converts the string or has two alternative formatting algorithms is an implementation detail. I would favor continued use of the actual % substitution code.

I would propose that the format argument gets an argument name, according to the syntax it is written in. For PEP 3101 format, I would call the argument "format" (like the method name of the string type), i.e.

    logging.Formatter(
        format="{asctime} - {name} - {levelname} - {message}")

For the % formatting, I suggest "dicttemplate" (assuming that you *have* to use dictionary %(key)s style currently).

The positional parameter would also mean dicttemplate, and would be deprecated (eventually requiring a keyword-only parameter).

Regards,
Martin
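Martin's keyword-argument proposal can be sketched roughly as follows. This is only an illustration: SketchFormatter and its render() method are hypothetical names rather than anything in the logging module, and the argument names "format" and "dicttemplate" are taken from the message above.

```python
class SketchFormatter:
    """Hypothetical stand-in for logging.Formatter; not the real class."""

    def __init__(self, dicttemplate=None, *, format=None):
        # Exactly one of the two spellings must be supplied.
        if (dicttemplate is None) == (format is None):
            raise TypeError("pass exactly one of dicttemplate= or format=")
        self._dicttemplate = dicttemplate
        self._format = format

    def render(self, values):
        # The argument name, not the string contents, selects the syntax,
        # so no guessing is needed.
        if self._format is not None:
            return self._format.format(**values)
        return self._dicttemplate % values


values = {"name": "smtp", "levelname": "INFO", "message": "ready"}

# The positional parameter keeps its %-style meaning.
old = SketchFormatter("%(name)s - %(levelname)s - %(message)s")
# The new syntax is only reachable through the keyword.
new = SketchFormatter(format="{name} - {levelname} - {message}")

print(old.render(values))
print(new.render(values))
```

Because the caller states the syntax explicitly, both styles can coexist without any inspection of the string.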

On Tue, Sep 29, 2009 at 8:15 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Just to be clear, I don't think logging is the only place these kinds of things happen. Some others I found looking around:

BaseHTTPServer.BaseHTTPRequestHandler.error_message_format
http://docs.python.org/library/basehttpserver.html#BaseHTTPServer.BaseHTTPRe...

BaseHTTPServer.BaseHTTPRequestHandler.log_message
http://docs.python.org/3.1/library/http.server.html#http.server.BaseHTTPRequ...

email.generator.DecodedGenerator
http://docs.python.org/library/email.generator.html#email.generator.DecodedG...

There may be more.
This is a nice solution for the cases where we can be confident that the parameter is currently only used positionally. However, at least in Python 3.1, "fmt" is already documented as a keyword parameter: http://docs.python.org/3.1/library/logging.html#logging.Formatter I guess we could follow the same approach though, and have fmt= be the %-style formatting, and use some other keyword argument for {}-style formatting. We've got a similar problem for the BaseHTTPRequestHandler.error_message_format attribute. I guess we'd want to introduce some other attribute which is the error message format for the {}-style formatting?

Steve

I'm resending a message I sent in June, since it seems the same thread has come up again, and I don't believe anybody actually responded (positively or negatively) to the suggestion back then. http://mail.python.org/pipermail/python-dev/2009-June/090176.html On Jun 21, 2009, at 5:40 PM, Eric Smith wrote:
It'd possibly be helpful if there were builtin objects which forced the format style to be either newstyle or oldstyle, independent of whether % or format was called on it. E.g.

    x = newstyle_formatstr("{} {} {}")
    x % (1,2,3) == x.format(1,2,3) == "1 2 3"

and perhaps, for symmetry:

    y = oldstyle_formatstr("%s %s %s")
    y.format(1,2,3) == y % (1,2,3) == "1 2 3"

This allows the format string "style" decision to be made external to the API actually calling the formatting function. Thus, it need not matter as much whether the logging API uses % or .format() internally -- that only affects the *default* behavior when a bare string is passed in.

This could allow for a controlled switch towards the new format string format, with a long deprecation period for users to migrate:

1) introduce the above feature, and recommend in docs that people only ever use new-style format strings, wrapping the string in newstyle_formatstr() when necessary for passing to an API which uses % internally.

2) A long time later...deprecate str.__mod__; don't deprecate newstyle_formatstr.__mod__.

3) A while after that (maybe), remove str.__mod__ and replace all calls in Python to % (used as a formatting operator) with .format() so that the default is to use newstyle format strings for all APIs from then on.

James Y Knight wrote:
I must have missed this suggestion when it went past the first time. I certainly like this approach - it has the virtue of only having to solve the problem once, and then application developers can use it to adapt any existing use of %-mod formatting to str.format formatting. Something like:

    class formatstr(str):
        def __mod__(self, other):
            if isinstance(other, dict):
                return self.format(**other)
            if isinstance(other, tuple):
                return self.format(*other)
            return self.format(other)

APIs that did their own parsing based on %-formatting codes would still break, as would any that explicitly called "str" on the object (or otherwise stripped the subclass away, such as via "'%s' % fmt"), but most things should pass a string subclass through transparently.

I wouldn't bother with a deprecation plan for 'normal' %-formatting though. I don't think it is going to be practical to actually get rid of that short of creating Python 4.0.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------

On Tue, Sep 29, 2009 at 10:04 PM, James Y Knight <foom@fuhm.net> wrote:
So I understand how this might help a user to move from %-style formatting to {}-style formatting, but it's still not clear to me how to use this to evolve an API. In particular, if the goal is for the API to move from accepting %-style format strings to {}-style format strings, how should that API use newstyle_formatstr or oldstyle_formatstr to make this transition?

Steve

2009/9/30 Steven Bethard <steven.bethard@gmail.com>:
IIUC, the API doesn't change. It's just that the internal code goes as follows:

1. (Current) Use %. str and oldformat objects work as now, newformat objects work (using .format).
2. Convert the code to use .format - oldformat and newformat objects work as before, str objects must be in new format.

The problem is, there's a point at which bare str arguments change behaviour. So unless people wrap their arguments when calling the API, there's still a point when things break (albeit with a simple workaround available).

So maybe we need transition steps - wrap bare str objects in oldformat classes, and warn, then wrap str objects in newformat objects and warn, then stop wrapping. That sounds to me like "normal" usage (bare strings) will be annoyingly painful for a substantial transition period.

I'm assuming that the oldformat and newformat classes are intended to be transitional things, and there's no intention that users should be using the wrapper objects always. (And of course better names than "oldformat" and "newformat" are needed - as Martin pointed out, having "old" and "new" in the names is inappropriate). Otherwise, I'm a strong -1 on the whole idea.

Paul.
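The transition steps Paul describes might look something like the sketch below. Everything here is an assumption made for illustration: oldformat and newformat are empty marker subclasses, and render() stands in for whatever API currently applies % to a caller-supplied string.

```python
import warnings


class oldformat(str):
    """Hypothetical marker: the text uses %-style placeholders."""


class newformat(str):
    """Hypothetical marker: the text uses {}-style placeholders."""


def render(fmt, args, default=oldformat):
    """Format fmt with the tuple args, warning about unmarked strings.

    default is the wrapper applied to bare str arguments; a library would
    switch it from oldformat to newformat at the release where the
    behaviour of bare strings changes.
    """
    if not isinstance(fmt, (oldformat, newformat)):
        warnings.warn(
            "bare format string; wrap it in oldformat() or newformat()",
            DeprecationWarning,
            stacklevel=2,
        )
        fmt = default(fmt)
    if isinstance(fmt, newformat):
        return fmt.format(*args)
    return fmt % args


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Early phase: bare strings still mean %-style, but trigger a warning.
    early = render("%s is %d", ("answer", 42))
    # Later phase: bare strings now mean {}-style, still with a warning.
    later = render("{0} is {1}", ("answer", 42), default=newformat)
    # Wrapped strings behave the same in every phase and never warn.
    wrapped_new = render(newformat("{0} is {1}"), ("answer", 42))
    wrapped_old = render(oldformat("%s is %d"), ("answer", 42), default=newformat)
```

The sketch also shows the pain point Paul mentions: the only callers who see no warnings are the ones who wrap every string.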

Steven Bethard wrote:
The problem is that many (most?) of the problematic APIs (such as logging) will have multiple users in a given application, so getting the granularity of any behavioural switches right is going to be difficult.

Providing a formatstr() type that makes .__mod__() work based on a .format() style string (and a formatmod() type that does the opposite) would allow for extremely fine-grained decision making, since every format string will either be an ordinary str instance or else an instance of the formatting subclass.

(Note that the primary use case for both proposed types is an application developer adapting between two otherwise incompatible third party libraries - the choice of class just changes based on whether the old code being adapted is the code invoking mod on a format string or the code providing a format string that expects to be used with the mod operator).

I don't see any way for delayed formatting of "normal" strings in any existing API to move away from %-formatting except via a long and painful deprecation process (i.e. first warning when bare strings are used for a release or two, then switching entirely to the new formatting method) or by duplicating the API and maintaining the two interfaces in parallel for the foreseeable future.

As Paul noted, the two proposed classes may also be useful to the library developer during such a transition process - they could accept strings in the "wrong" format just by wrapping them appropriately rather than having to maintain the parallel APIs all the way through the software stack.

Probably worth letting these concepts bake for a while longer, but it would definitely be nice to do *something* to help enable this transition in 2.7/3.2.

Cheers,
Nick.
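The formatmod() type mentioned above is only named in the thread, not spelled out. A minimal sketch of what it might do, assuming it is a str subclass whose .format() delegates to the % operator, could be:

```python
class formatmod(str):
    """Hypothetical: a %-style string that also answers to .format()."""

    def format(self, *args, **kwargs):
        # %-formatting takes either a tuple or a mapping, never both.
        if args and kwargs:
            raise TypeError("pass positional or keyword values, not both")
        return self % (kwargs if kwargs else args)


positional = formatmod("%s %s %s").format(1, 2, 3)
named = formatmod("%(name)s: %(msg)s").format(name="smtp", msg="ready")
print(positional)
print(named)
```

This would be the class to reach for when newer code calls .format() on a string supplied by older code that wrote it in %-style.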

On Wed, 30 Sep 2009 03:04:05 pm James Y Knight wrote:
People will want this formatstr object to behave like strings, with concatenation, slicing, etc.:
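One way to see the concern, as a minimal sketch: the formatstr class below is an assumed, cut-down version of the subclass discussed in this thread, and the point is only that ordinary string operations hand back plain str objects.

```python
class formatstr(str):
    """Assumed minimal wrapper: % is routed to str.format."""

    def __mod__(self, other):
        if isinstance(other, tuple):
            return self.format(*other)
        return self.format(other)


fmt = formatstr("{0} is {1}")

# Concatenation and slicing are inherited from str and return plain str.
suffixed = fmt + "!"
prefix = fmt[:3]
print(type(fmt).__name__, type(suffixed).__name__, type(prefix).__name__)

# The plain result no longer understands {} placeholders under %.
failure = None
try:
    suffixed % ("answer", 42)
except TypeError as exc:
    failure = exc
print(failure)
```

So a wrapper that is meant to survive such operations has to override each of them, or callers have to re-wrap the result.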
Instead of having to support one type with %-formatting and {}-formatting (str), now the std lib will have two classes with %-formatting and {}-formatting. How is this an improvement?

Moving along, let's suppose the newstyle_formatstr is introduced. What's the intention then? Do we go through the std lib and replace every call to (say)

    somestring % args

with

    newstyle_formatstr(somestring) % args

instead? That seems terribly pointless to me -- it does nothing about getting rid of % but adds a layer of indirection which slows down the code. Things are no better if the replacement code is:

    newstyle_formatstr(somestring).format(*args)

(or similar). If we can do that, then why not just go all the way and use this as the replacement instead?

    somestring.format(*args)
Now we have three classes that support both % and {} formatting. Great. [...]
And how are people supposed to know what the API uses internally? Personally, I think your chances of getting people to write:

    logging.Formatter(newstyle_formatstr("%(asctime)s - %(name)s - %(level)s - %(msg)s"))

instead of

    logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s")

are slim to none -- especially when the second call still works. You'd be better off putting the call to newstyle_formatstr() inside logging.Formatter, and not even telling the users.

Instead of wrapping strings in a class that makes .__mod__() and .format() behave the same, at some cost on every call presumably, my preferred approach would be a converter function (perhaps taken from 2to3?) which modified strings like "%(asctime)s" to "{asctime}". That cost gets paid *once*, rather than on every call. (Obviously the details will need to be ironed out, and it will depend on the external API. If the external API depends on the caller using % explicitly, then this approach may not work.)
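A very rough sketch of such a converter is given below. It is not taken from 2to3; percent_to_braces is a hypothetical name, and it deliberately handles only plain %(key)s placeholders and the %% escape, refusing anything else rather than guessing.

```python
import re

# %(key)s placeholders, the %% escape, literal braces, and any other % sign.
_TOKEN = re.compile(r"%\((?P<key>[^)]*)\)s|%%|[{}]|%")


def percent_to_braces(template):
    """Translate a %(key)s-only template into str.format syntax."""

    def swap(match):
        text = match.group(0)
        if match.group("key") is not None:
            return "{" + match.group("key") + "}"
        if text == "%%":
            return "%"
        if text in ("{", "}"):
            return text * 2  # braces must be doubled to stay literal
        # Widths, flags, %d and friends are outside this sketch.
        raise ValueError(
            "unsupported directive at offset {}".format(match.start())
        )

    return _TOKEN.sub(swap, template)


converted = percent_to_braces("%(asctime)s - %(name)s - %(msg)s")
print(converted)
```

The conversion cost is paid once when the template is registered; the open question raised in the thread, whether every %-style string can be translated, shows up here as the ValueError branch.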
2) A long time later...deprecate str.__mod__;
How long? I hope that's not until I'm dead and buried. -- Steven D'Aprano

On Sep 30, 2009, at 10:34 AM, Steven D'Aprano wrote:
Indeed, that *would* be terribly pointless! Actually, more than pointless, it would be broken, as you've changed the API from taking oldstyle format strings to newstyle format strings. That is not the suggestion. The intention is to change /nearly nothing/ in the std lib, and yet allow users to use newstyle string substitution with every API. Many Python APIs (e.g. logging) currently take a %-type formatting string. It cannot simply be changed to take a {}-type format string, because of backwards compatibility concerns. Either a new API can be added to every one of those functions/classes, or, a single API can be added to inform those places to use newstyle format strings.
It's documented, (as it already must be, today!).
That's not my proposal. The user could write either:

    logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s")

(as always -- that can't be changed without a long deprecation period), or:

    logging.Formatter(newstyle_formatstr("{asctime} - {name} - {level} - {msg}"))

This despite the fact that logging has not been changed to use {}-style formatting internally. It should continue to call "self._fmt % record.__dict__" for backward compatibility.

That's not to say that this proposal would allow no work to be done to check the stdlib for issues. The Logging module presents one: it checks if the format string contains "%(asctime)" to see if it should bother to calculate the time. That of course would need to be changed. Best would be to stick an instance which lazily generates its string representation into the dict. The other APIs mentioned on this thread (BaseHTTPServer, email.generator) will work immediately without changes, however.

James

James Y Knight wrote:
allow users to use newstyle string substitution with every API.
However it is done, I think someone (like new Python programmers) should be able to program in Python3, and use everything in the stdlib, without ever learning % formatting -- and that I should be able to forget about it ;-). +10 on the goal. Terry Jan Reedy

[Terry Reedy]
If that were possible, it would be nice. But as long as the language supports %-formatting, it is going to be around in one form or another. Any non-casual user will bump into %-formatting in books, in third-party modules, in ASPN recipes, on the newsgroup, and in our own source code. If they maintain any existing software, they will likely encounter it too. It doesn't seem to be a subject that can be ignored.

Also, I think it premature to say that {}-formatting has been proven in battle. AFAICT, there has been very little uptake. I've personally made an effort to use {}-formatting more often but find that I frequently have to look up the syntax and need to experiment with the interactive interpreter to get it right. I haven't found it easy to teach or to get other people to convert. This is especially true if the person has encountered %-formatting in other languages (it is a popular approach).

Raymond

James Y Knight <foom <at> fuhm.net> writes:
Why not allow logging.Formatter to take a callable, which would in turn call the callable with keyword arguments? Therefore, you could write:

    logging.Formatter("{asctime} - {name} - {level} - {msg}".format)

and then:

    logging.critical(name="Python", msg="Buildbots are down")

All this without having to learn about a separate "compatibility wrapper object".

Regards

Antoine.
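As a minimal sketch of the callable idea, assuming a hypothetical CallableFormatter class (logging.Formatter itself is not modified here): a bare string keeps its %-style meaning, while any callable is simply invoked with the fields as keyword arguments.

```python
import string


class CallableFormatter:
    """Hypothetical formatter accepting a %-style string or a callable."""

    def __init__(self, fmt):
        self.fmt = fmt

    def format_fields(self, **fields):
        if callable(self.fmt):
            # e.g. a bound str.format or string.Template.substitute
            return self.fmt(**fields)
        return self.fmt % fields


percent = CallableFormatter("%(name)s - %(msg)s")
braces = CallableFormatter("{name} - {msg}".format)
dollar = CallableFormatter(string.Template("$name - $msg").substitute)

for formatter in (percent, braces, dollar):
    print(formatter.format_fields(name="Python", msg="Buildbots are down"))
```

The bound methods carry the format string with them, so the API never has to know which syntax is in use.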

On Sep 30, 2009, at 1:01 PM, Antoine Pitrou wrote:
This is a very interesting idea. Note that one of the reasons to /at least/ support {}-strings also is that %-strings are simply too error prone in many situations. For example, if I decide to support internationalization of log format strings, and all I can use is %-strings, it's almost guaranteed that I will have bugs because a translator forgot the trailing 's'. This is exactly the motivation that led to PEP 292 $-strings.

In fact, while we're at it, it would be kind of cool if I could use $-strings in log templates. Antoine's idea of accepting a callable might fit that bill nicely.

-Barry

Barry Warsaw <barry <at> python.org> writes:
You're already covered if you use the PercentMessage/BraceMessage approach I mentioned elsewhere in this thread. Suppose:

    #Just typing this in, it's not tested or anything
    class DollarMessage:
        def __init__(self, fmt, *args, **kwargs):
            self.fmt = fmt
            self.args = args
            self.kwargs = kwargs

        def __str__(self):
            return string.Template(self.fmt).substitute(*args, **kwargs)

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Whoops, sorry, pressed the "post" button by accident on my previous post. The above substitute call should of course say

    string.Template(self.fmt).substitute(*self.args, **self.kwargs)

and you can alias DollarMessage (or whatever name you choose) as _ or __, say.

As far as the Formatter formatting goes, it's easy enough to subclass Formatter to format using whatever approach you want.

Regards,
Vinay Sajip
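Putting the corrected class together with a logger gives the sketch below. The logger name and the StringIO handler are assumptions made so the example is self-contained; the class body follows the corrected version described above.

```python
import io
import logging
import string


class DollarMessage:
    """Holds a $-style template and its values; renders only on str()."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return string.Template(self.fmt).substitute(*self.args, **self.kwargs)


# Capture output in memory instead of writing to stderr.
stream = io.StringIO()
logger = logging.getLogger("dollar_sketch")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)
logger.propagate = False

__ = DollarMessage
logger.info(__("The $what is $value", what="answer", value=42))
# DEBUG is below the logger's level, so this message is never rendered
# and the missing value never raises.
logger.debug(__("never rendered: $missing"))

print(stream.getvalue())
```

Since logging only calls str() on the message object when a record is actually formatted, the substitution is deferred until it is needed.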

Antoine Pitrou <solipsis <at> pitrou.net> writes:
This seems perhaps usable for a Formatter instantiation (infrequent) but a problem for the case where you want to convert format_str + args -> message (potentially frequent, and less readable). Another problem is that logging calls already use keyword arguments (extra, exc_info) and so backward compatibility might be compromised. It also feels like passing a callable could encourage patterns of usage which restrict our flexibility for future changes: we want for now to just allow choosing between % and {}, but a callable can do anything. That's more flexible, to be sure, but more specialized formatting requirements are already catered for using e.g. the PercentMessage/BraceMessage approach. Regards, Vinay Sajip

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Why is it a problem? I don't understand. It certainly is less pleasant to write "{foo}".format or "{0} {1}".format than it is to write "{0} {1}" alone, but it's still prettier and easier to remember than the special wrappers people are proposing here.
Then logging can just keep recognizing those special keyword arguments, and forward the others to the formatting function.
It also feels like passing a callable could encourage patterns of usage which restrict our flexibility for future changes:
Which future changes are you thinking about? AFAIK, there hasn't been a single change in logging output formatting in years. Rejecting a present change on the basis that it "restricts our flexibility for future changes" sounds like the worst kind of argument to me :-)
Except that having to wrap format strings with "PercentMessage" or "BraceMessage" is horrible. Python is not Java. Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
Well, it's less readable, as I said in parentheses. It would work, of course. And the special wrappers needn't be too intrusive:

    __ = BraceMessage

    logger.debug(__("Message with {0} {1}", 1, "argument"))
Then logging can just keep recognizing those special keyword arguments, and forward the others to the formatting function.
It just means that you can't pass those values through, and what if some of them are used somewhere in existing code?
It's the Rumsfeldian "We don't know what we don't know" ;-)
Now don't get upset and take it as a rejection, as we're still in the kicking-ideas-around stage ;-) I'm just saying how it feels to me. I agree that logging output formatting hasn't changed in years, and that's because there's been no particular need for it to change (some changes *were* made in the very early days to support a single dict argument). Now that time for change has perhaps come. I'm just trying to think ahead, and can't claim to have got a definitive answer up my sleeve. Passing a callable has upsides and downsides, and ISTM it's always worth focusing on the downsides to make sure they don't come back and bite you later. I don't foresee any specific problem - I'm just uneasy about it.
Except that having to wrap format strings with "PercentMessage" or "BraceMessage" is horrible. Python is not Java.
Amen. I'd say "Yeccchh!" too, if it literally had to be like that. And I also note that there are voices here saying that support for %-formatting shouldn't, or doesn't need to, change, at least until Python 4.0.

So consider the following tentative suggestion, which is off the top of my head and offered as a discussion point:

Suppose that if you want to use %-formatting, everything stays as is. No backward-compatibility headaches.

To support {}-formatting, add an extra class which I've called BraceMessage. Consider this name a "working title", as no doubt a better name will suggest itself, but for now this name makes it clear what we're talking about.

If any module wants to use {} formatting for their logging, they can add the line

    from logging import BraceMessage as __

I've used two underscores, since _ might be being used for gettext, but obviously the importer can use whatever name they want.

and then they can use

    logger.debug(__("The {0} is {1}", "answer", 42))

which I think is more readable than putting in ".format" following the string literal. It's not a *huge* point, perhaps, but "Readability counts".

This has the side benefit that if e.g. Barry wanted to use string.Template for formatting, he's just got to replace the above import with something like

    from logging import DollarMessage as __

Another "working title", please note. And while I've shown these classes being imported from logging, it doesn't make sense to put them there if this idea were to fly in a more general context. Then, perhaps string would be a better home for these classes.

Regards,
Vinay Sajip

Hello,
Ah, I hadn't thought about that. It looks a bit less awful indeed. I'm of the opinion, however, that %-formatting should remain the default and shouldn't need a wrapper.

There's another possibility, which is to build the wrapping directly around the logger. That is, if I want a %-style logger, I do:

    logger = logging.getLogger("smtp")
    logger.debug("incoming email from %s", sender_address)

and if I want a {}-style logger, I do:

    logger = logging.getLogger("smtp", style="{}")
    logger.debug("incoming email from {addr}", addr=sender_address)

(of course, different users of the "smtp" logger can request different formatting styles when calling getLogger().)

We could combine the various proposals to give users flexible APIs. Of course, it generally smells of "there's more than one way to do it".
It's the Rumsfeldian "We don't know what we don't know"
Is this guy in the Python community? :-)
I'm just trying to think ahead, and can't claim to have got a definitive answer up my sleeve.
Sure, we have some time until 2.7/3.2 anyway. Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
There's a LoggerAdapter class already in the system which is used to wrap loggers so that additional contextual information (e.g. network or database connection information) can be added to logs. The LoggerAdapter could fulfill this "wrapping" function.
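A minimal sketch of how a LoggerAdapter might provide that wrapping follows. BraceAdapter and BraceMessage are hypothetical names, and the list of keywords reserved for logging itself is an assumption about which arguments should not be forwarded to str.format.

```python
import io
import logging


class BraceMessage:
    """Defers str.format until logging asks for the message text."""

    def __init__(self, fmt, args, kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)


class BraceAdapter(logging.LoggerAdapter):
    """Hypothetical adapter; not part of the logging module."""

    # Keywords that logging itself consumes; the rest go to format().
    RESERVED = ("exc_info", "extra", "stack_info", "stacklevel")

    def log(self, level, msg, *args, **kwargs):
        if not self.isEnabledFor(level):
            return
        reserved = {k: kwargs.pop(k) for k in self.RESERVED if k in kwargs}
        self.logger.log(level, BraceMessage(msg, args, kwargs), **reserved)


stream = io.StringIO()
base = logging.getLogger("smtp_sketch")
base.addHandler(logging.StreamHandler(stream))
base.setLevel(logging.DEBUG)
base.propagate = False

logger = BraceAdapter(base, {})
logger.debug("incoming email from {addr}", addr="user@example.org")
print(stream.getvalue())
```

Other users of the same underlying logger are unaffected, since the style lives in the adapter rather than in the logger.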
We could combine the various proposals to give users flexible APIs. Of course, it generally smells of "there's more than one way to do it".
Yeah, that bothers me too.
It's the Rumsfeldian "We don't know what we don't know"
Is this guy in the Python community?
Not sure, but I believe he's a piece of work and not a guy to get on the wrong side of ;-) Regards, Vinay Sajip

2009/10/1 Vinay Sajip <vinay_sajip@yahoo.co.uk>:
This seems to me to be almost the same as the previous suggestion of having a string subclass:

    class BraceFormatter(str):
        def __mod__(self, other):
            # Needs more magic here to cope with dict argument
            return self.format(*other)

    __ = BraceFormatter

    logger.debug(__("The {0} is {1}"), "answer", 42)

The only real differences are

1. The positioning of the closing parenthesis
2. The internal implementation of logger.debug needs to preserve string subclasses properly

But the benefit is that the approach allows anyone to use brace formatting in any API that currently accepts % format (assuming string subclasses don't get mangled).

On the one hand, I'd prefer a more general solution. On the other, I'm nervous about that "assuming string subclasses..." proviso.

I've no real answer, just offering the point up for consideration.

Paul.

Paul Moore <p.f.moore <at> gmail.com> writes:
The other difference is that my suggestion supports Barry's desire to use string.Template with no muss, no fuss ;-) Plus, very little additional work is required compared to your items 1 and 2. ISTM BraceMessage would be something like this,

    class BraceMessage:
        def __init__(self, fmt, *args, **kwargs):
            self.fmt = fmt
            self.args = args
            self.kwargs = kwargs

        def __str__(self):
            return self.fmt.format(*self.args, **self.kwargs)

Regards,
Vinay

On Thu, Oct 1, 2009 at 06:29, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
So I created this last night:

    import collections

    class braces_fmt(str):

        def __mod__(self, stuff):
            if isinstance(stuff, tuple):
                return self.__class__(self.format(*stuff))
            elif isinstance(stuff, collections.Mapping):
                return self.__class__(self.format(**stuff))
            else:
                return self.__class__(self.format(stuff))

The biggest issue is that ``"%s" % {'a': 42}`` substitutes the dict instead of throwing an error that str.format() would do with the code above. But what's nice about this is I think I can use this now w/ any library that expects % interpolation and it should basically work.
I don't think Paul's suggestion requires much more work to support string.Template, simply a subclass that implements __mod__
I guess my question is what's the point of the class if you are simply converting it before you pass it in to the logger? To be lazy about the formatting call? Otherwise you could simply call str.format() with your arguments before you pass the string into the logger and not have to wrap anything. -Brett

So I created this last night:
So there's no need to change modules like logging to explicitly provide support for {}-formatting? What's not to like? ;-) Something like this perhaps should have been added in at the same time as str.format went in.
I don't think Paul's suggestion requires much more work to support string.Template, simply a subclass that implements __mod__
True.
That's exactly the reason - to defer the formatting until it's needed. Otherwise you can always format the string yourself, as you say, and pass it as the single argument in the logging call - logging won't know or care if it was passed in as a literal, or was computed by %-, {}-, $- or any other formatting approach.

Regards,
Vinay Sajip

Vinay Sajip wrote:
I believe classes like fmt_braces/fmt_dollar/fmt_percent will be part of a solution, but they aren't a complete solution on their own. (Naming the three major string formatting techniques by the key symbols involved is a really good idea though)

The two major problems with them:

1. It's easy to inadvertently convert them back to normal strings. If a formatting API even calls "str" on the format string then we end up with a problem (and switching to containment instead of inheritance doesn't really help, since all objects implement __str__).

2. They don't help with APIs that expect a percent-formatted string and do more with it than just pass it to str.__mod__ (e.g. inspecting it for particular values such as '%(asctime)s')

Still, it's worth considering adding the three fmt_* classes to the string module to see how far they can get us in adapting the formats for different APIs.

Note that I don't think these concepts are fully baked yet, so we shouldn't do anything in a hurry - and anything that does happen should be via a PEP so we can flush out more issues.

Cheers,
Nick.

On Oct 1, 2009, at 5:54 PM, Nick Coghlan wrote:
Using containment instead of inheritance makes sure none of the *other* operations people do on strings will appear to work, at least (substring, contains, etc). I bet explicitly calling str() on a format string is even more rare than attempting to do those things.
True, but I don't think there's many such cases in the first place, and such places can be fixed to not do that as they're found. Until they are fixed, fmt_braces will loudly fail when used with that API (assuming fmt_braces is not a subclass of str). James

James Y Knight <foom <at> fuhm.net> writes:
Actually, logging calls str() on the object passed as the first argument in a logging call such as logger.debug(), which can either be a format string or an arbitrary object whose __str__() returns the format string. Regards, Vinay Sajip

Nick Coghlan <ncoghlan <at> gmail.com> writes:
Good point as far as the general case is concerned, though it's perhaps not that critical for logging. By which I mean, it's not unreasonable for Formatter.__init__ to grow a "style" keyword parameter which determines whether it uses %-, {}- or $-formatting. Then the formatter can look for '%(asctime)s', '{asctime}' or '$asctime' according to the style.

Just to clarify - LogRecord.getMessage *will* call str() on a message object if it's not a string or Unicode object. For 2.x the logic is

    if type(msg) not in (unicode, str):
        msg = str(msg)

and for 3.x the check is for isinstance(msg, str).
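The style-dependent lookup might be sketched like this. The function name uses_time and the marker table are assumptions for illustration; the markers are simple prefix searches, so they are deliberately crude (for instance, '{asctime' would also match a field named '{asctimes}').

```python
# Substrings whose presence suggests the time field is wanted, per style.
_ASCTIME_MARKERS = {
    "%": ("%(asctime)",),
    "{": ("{asctime",),
    "$": ("$asctime", "${asctime}"),
}


def uses_time(fmt, style="%"):
    """Return True if fmt appears to reference asctime in the given style."""
    return any(marker in fmt for marker in _ASCTIME_MARKERS[style])


print(uses_time("%(asctime)s - %(message)s"))
print(uses_time("{name} - {message}", style="{"))
print(uses_time("${asctime} $message", style="$"))
```

With the style passed in explicitly, the formatter does not need to guess which syntax a given string is written in.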
Yes, we're just "kicking the tires" on the various ideas. There are things still a bit up in the air such as what happens when pickling and sending to an older version of Python, etc. which still need to be resolved for logging, at least. Regards, Vinay Sajip

Vinay Sajip wrote:
It's tangential, but in the str.format case you don't want to check for just '{asctime}', because you might want '{asctime:%Y-%m-%d}', for example.

But there are ways to delay computing the time until you're sure it's actually being used in the format string, without parsing the format string. Now that I think of it, the same technique could be used with %-formatting:

    import datetime

    class DelayedStr:
        def __init__(self, fn):
            self.fn = fn
            self.obj = None
        def __str__(self):
            if self.obj is None:
                self.obj = self.fn()
            return self.obj.__str__()

    def current_time():
        print "calculating time"
        return datetime.datetime.now()

    # will not compute current time
    print '%(msg)s' % {'asctime':DelayedStr(current_time), 'msg':'test'}

    # will compute current time: same dict used as before
    print '%(asctime)s %(msg)s' % {'asctime':DelayedStr(current_time), 'msg':'test'}

Eric.
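For the str.format side of the same technique, a sketch along these lines may help. DelayedValue is a hypothetical name, written for Python 3, and a fixed timestamp stands in for the current time so the result is repeatable.

```python
import datetime


class DelayedValue:
    """Computes fn() only when the value is actually formatted."""

    def __init__(self, fn):
        self.fn = fn
        self.computed = False
        self.value = None

    def _get(self):
        if not self.computed:
            self.value = self.fn()
            self.computed = True
        return self.value

    def __format__(self, spec):
        # Forward the format spec (e.g. "%Y-%m-%d") to the real object.
        return format(self._get(), spec)


calls = []


def fixed_time():
    # Stand-in for datetime.datetime.now(); records each invocation.
    calls.append("called")
    return datetime.datetime(2009, 10, 1, 12, 30)


# asctime is supplied but never referenced, so fixed_time() is not called.
unused = "{msg}".format(asctime=DelayedValue(fixed_time), msg="test")

# asctime is referenced with a format spec, so fixed_time() runs once.
used = "{asctime:%Y-%m-%d} {msg}".format(
    asctime=DelayedValue(fixed_time), msg="test"
)
```

Because str.format ignores unused keyword arguments, the format string never has to be parsed to find out whether the time is wanted.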

2009/10/1 Eric Smith <eric@trueblade.com>:
Still tangential, but it seems to me that this discussion has exposed a couple of areas where the logging interface is less than ideal:

- The introspection of the format string to delay computing certain items (Eric's suggestion may be an improvement here).
- The "call str() on any non-string object to get a format string" API (which precludes string subclasses).

I suspect other APIs will exist with similar issues once the whole question of supporting multiple format syntaxes gets wider publicity...

Paul.

Paul Moore wrote:
Calling str on non-string objects to get a format string does not (prima-facie) preclude string subclasses:
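A small sketch of the distinction, written for Python 3: braces_fmt here is an empty, assumed str subclass, and the two helper functions are hypothetical stand-ins for the isinstance-based and exact-type checks mentioned earlier in the thread.

```python
class braces_fmt(str):
    """Assumed str subclass marking a {}-style format string."""


def text_via_isinstance(msg):
    # Only objects that are not strings at all go through str().
    if not isinstance(msg, str):
        msg = str(msg)
    return msg


def text_via_exact_type(msg):
    # An exact-type check treats subclasses as "not a string".
    if type(msg) is not str:
        msg = str(msg)
    return msg


wrapped = braces_fmt("{0} is {1}")
kept = text_via_isinstance(wrapped)
stripped = text_via_exact_type(wrapped)
print(type(kept).__name__, type(stripped).__name__)
```

With the isinstance check the subclass passes through untouched; with the exact-type check, str() hands back a plain string unless the subclass overrides __str__.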
Michael
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog

Paul Moore <p.f.moore <at> gmail.com> writes:
Yes, but that's an implementation detail and not part of the logging interface. It can be changed without any particular additional impact on user code - when I say "additional" I mean apart from the need to change the format strings to {} format, which they would have to do anyway at some point.
- The "call str() on any non-string object to get a format string" API (which precludes string subclasses).
It doesn't preclude string subclasses: it just calls str() on an arbitrary message object to get the string representation for that object. The return value is used to interpolate into the formatted output, and that's all. So I don't understand what's being precluded and how - please elaborate. Thanks & regards, Vinay Sajip

On Thu, Oct 1, 2009 at 14:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I agree. I view them more as a band-aid over APIs that only accept % formatting but the user of the library wants to use {} formatting.
Well, you can override the methods on str to always return the proper thing, e.g. ``def __str__(self): return self``. Do the same for __add__() and all other methods on strings that return a string themselves. It should be possible to prevent Python code from stripping off the class.
Nope, they don't and people would need to be warned against this.
Having a PEP that lays out how we think people should consider transitioning their code would be good. -Brett

On Thu, Oct 1, 2009 at 11:03 AM, Brett Cannon <brett@python.org> wrote:
I see how this could allow a user to supply a {}-format string to an API that accepts only %-format strings. But I still don't see the transition strategy for the API itself. That is, how does the %-format API use this to eventually switch to {}-format strings? Could someone please lay it out for me, step by step, showing what happens in each version? Steve

On Oct 1, 2009, at 6:19 PM, Steven Bethard wrote:
Here's what I said in my first message, suggesting this change. Copy&pasted below: I wrote:
So do (1) in 3.2. Then do (2) in 3.4, and (3) in 3.6. I skipped two versions each time because of how widely this API is used, and the likely pain that doing the transition quickly would cause. But I guess you *could* do it in one version each step. James

On Thu, Oct 1, 2009 at 15:19, Steven Bethard <steven.bethard@gmail.com> wrote:
First off, a wrapper like this I think is a temporary solution for libraries that do not have any transition strategy, not a replacement for one that is thought out (e.g. using a flag when appropriate). With that said, you could transition by:

1. Nothing changes as hopefully the wrapper works fine (as people are pointing out, though, my approach needs to override __str__() to return 'self', else the str type will just return what it has internally in its buffer).
2. Raise a deprecation warning when ``isinstance(ob, brace_fmt)`` is false. When a class is passed in that is a subclass of brace_fmt, call ob.format() on it.
3. Require the subclass.
4. Remove the requirement and always call ob.format().

-Brett
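Step 2 of this transition might look roughly like the sketch below. The function render is a hypothetical stand-in for any library API that takes a template, and the warning text is invented; only the name brace_fmt comes from the message above.

```python
import warnings


class brace_fmt(str):
    """Marker subclass meaning 'this string uses {}-style placeholders'."""


def render(template, *args):
    """Hypothetical library function at step 2 of the transition."""
    if isinstance(template, brace_fmt):
        return template.format(*args)
    # Plain strings keep their old %-meaning, but now warn.
    warnings.warn(
        "%-style templates are deprecated; wrap the string in brace_fmt",
        DeprecationWarning,
        stacklevel=2,
    )
    return template % args


print(render(brace_fmt("The {0} is {1}"), "answer", 42))
```

At step 3 the warning would become an error, and at step 4 the isinstance check would disappear and every string would be passed to format().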

On Thu, Oct 1, 2009 at 4:35 PM, Brett Cannon <brett@python.org> wrote:
Thanks Brett, that's clear. So you save one version over the proposal of adding a format= flag to the API. On Thu, Oct 1, 2009 at 4:13 PM, James Y Knight <foom@fuhm.net> wrote:
I didn't understand how you wanted to apply your suggestion to an API (instead of str.__mod__) the first time and I still don't understand it. Is what Brett has proposed the same thing? Steve

Vinay Sajip wrote:
It's also difficult for the subclass to prevent this without creating an infinite loop... (I only spent about 10 minutes looking into it the other day, but that's what happened in all of my naive attempts at doing it in pure Python code). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Oct 1, 2009, at 9:11 AM, Paul Moore wrote:
I'd rather make that:

    class BraceFormatter:
        def __init__(self, s):
            self.s = s
        def __mod__(self, other):
            # Needs more magic here to cope with dict argument
            return self.s.format(*other)

    __ = BraceFormatter

That is, *not* a string subclass. Then if someone attempts to mangle it, or use it for anything but %, it fails loudly. James

Raymond Hettinger <python <at> rcn.com> writes:
It looks like the BraceMessage would have to re-instantiate on every invocation.
True, because the arguments to the instantiation are kept around as a BraceMessage instance until the time comes to actually format the message (which might be never). Since typically in performance-sensitive code, the isEnabledFor pattern is used to avoid doing unnecessary work, as in

    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(__("The {0} is {1}", "answer", 42))

the BraceMessage invocation overhead is only incurred when needed, as is the cost of computing the additional arguments. As I understand it {}-formatting is slower than %-formatting anyway, and if this pattern is used only for {}-formatting, then there will be no additional overhead for %-formatting and some additional overhead for {}-formatting. I'm not sure what that instantiation cost will be relative to the overall time for an "average" call - whatever that is ;-) - though. Other approaches to avoid instantiation could be considered: for example, making __ a callable which remembers previous calls and caches instances keyed by the call arguments. But this will incur memory overhead and some processing overhead and I'm not sure if it really buys you enough to warrant doing it. Regards, Vinay Sajip

On Sep 30, 2009, at 1:01 PM, Antoine Pitrou wrote:
It's a nice idea -- but I think it's better for the wrapper (whatever form it takes) to support __mod__ so that logging.Formatter (and everything else) doesn't need to be modified to be able to know about how to use both callables and "%"ables. Is it possible for a C function like str.format to have other methods defined on its function type? James

Martin v. Löwis wrote:
It's a maintenance burden. There are several outstanding bugs with it, admittedly not of any great significance. I've been putting time into fixing at least one of them. When Mark and I did short-float-repr, at least half of my time was consumed with %-formatting, mostly because of how it does memory management. On the plus side, %-formatting is (and always will be) faster than str.format(). Its very limitations make it possible for it to be fast. I'd note that PEP 3101 calls str.format() a replacement for %-formatting, not an alternate mechanism to achieve the same end.
Having a converter and guessing are 2 distinct issues. I believe a converter from %-formatting specification strings to str.format() strings is possible. I point this out not because I think a converter solves this problem, but because in the past there's been a debate on whether a converter is even possible. Eric.

Well - that's the cost of keeping it in the language. It's not a problem with using it while it *is* in the language. So if a decision was made to eventually remove % formatting, it would be reasonable to start migrating code to PEP 3101. However, no such decision has been made (and hopefully won't be throughout 3.x), so as the mechanism *is* available, there is no need to start changing existing code (except for the actual issue Steven discusses, namely libraries that expect strings in % template form).
I'd note that PEP 3101 calls str.format() a replacement for %-formatting, not an alternate mechanism to achieve the same end.
I think this is a mis-wording; the intent of the PEP apparently is to propose this mechanism as an option, not as an actual replacement. This becomes clear when reading the "Backwards Compatibility" section:

    # Backwards compatibility can be maintained by leaving the existing
    # mechanisms in place. The new system does not collide with any of
    # the method names of the existing string formatting techniques, so
    # both systems can co-exist until it comes time to deprecate the
    # older system.

Regards, Martin

On Wed, Sep 30, 2009 at 9:48 AM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
The problem is, PEP 3101 and our interpretation of it evolved. The original proposal for {}-formatting was certainly put forward with the aim to completely *replace* %-formatting, and care was taken in the design to cover all use cases, avoid known problems, etc. Then we started looking seriously at conversion from Python 2 to Python 3 and we discovered that converting %-formatting to {}-formatting was a huge can of worms, and decided it wasn't worth to try and do *at the time* given the Python 3 schedule. We considered some kind of gentle deprecation warning, but decided that even that would be too noisy. So now we have two competing mechanisms. In the long run, say Python 4, I think we don't need both, and we should get rid of one. My preference is still getting rid of %-formatting, due to the problems with it that prompted the design of {}-formatting (no need to reiterate the list here). So how do we get there? My proposal would be to let this be a gradual take-over of a new, superior species in the same niche as an older species. (Say, modern man over Neanderthal man.) Thus, as new code is written (especially example code, which will be copied widely), we should start using {}-formatting, and when new APIs are designed that tie in to some kind of formatting, they should use {}-formatting. Adding support for {}-formatting, in addition to %-formatting, to existing APIs like the logging package also strikes me as a good idea, as long as backwards compatibility can be preserved. (I have no strong ideas on how to do this right now.) If we do this right, by the time Python 4 comes around, {}-formatting will have won the race, and there won't be a question about removing %-formatting at the time. I wouldn't be surprised if by then static analysis techniques will have improved so that we *can* consider automatic conversion by then. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Eric Smith wrote:
I agree with Martin. Both approaches have their ups and downs, but forcing users to move from %-formatting to .format()-formatting will just frustrate them: having to convert several thousand such (working) uses in their code with absolutely no benefit simply doesn't look like a good way to spend your time. In addition to the code changes, such a move would also render existing translations of the %-formatted string templates useless.
Why not allow both and use .format() for those cases where %-formatting doesn't work too well?
I think that's a wording we should change. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 01 2009)

On 1 Oct 2009, at 10:37, M.-A. Lemburg wrote:
I agree you cannot force the move to {} format. There are programs that expose the %(name)s in user interfaces for customisation.
In addition to the code changes, such a move would also render existing translations of the %-formatted string templates useless.
Speaking of translation support, has xgettext been updated to support {}? It is a life saver to have xgettext report that "This %s and %s" is not translatable. Barry

On Sep 29, 2009, at 11:15 PM, Martin v. Löwis wrote:
Although I hate the name 'dicttemplate', this seems like the best solution to me. Maybe it's good that 'dicttemplate' is so ugly though so that people will naturally prefer 'format' :). But I like this because there's not really any magic, it's explicit, and the decision is made by the coder at the call site. The implementation does not need to guess at all. If this is adopted, it should become a common idiom across Python so that once you've learned how to transition between the format strings, you pretty much know how to do it for any supporting API. So we should adopt it across all of the standard library. -Barry

On Wed, Sep 30, 2009 at 5:21 AM, Barry Warsaw <barry@python.org> wrote:
Could you comment on what you think we should do when the parameter is not positional? As I mentioned upthread, in the case of logging.Formatter, it's already documented as taking the keyword parameter "fmt", so we'd have to use the name "fmt" for % formatting. Steve

On Sep 30, 2009, at 11:22 AM, Steven Bethard wrote:
I'm okay with fmt==%-formatting and format=={}-formatting, but I'd also be okay with transitioning 'fmt' to 'dicttemplate' or whatever. I think the important thing is to be explicit in the method signature which one you want (secondary would be trying to standardize this across the stdlib). -Barry

On Wed, Sep 30, 2009 at 8:31 AM, Barry Warsaw <barry@python.org> wrote:
Thanks for the clarification. I generally like this approach, though it's not so convenient for argparse which already takes format strings like this::

    parser = ArgumentParser(usage='%(prog)s [--foo]')
    parser.add_argument(
        '--foo', type=int, default=42,
        help='A foo of type %(type)s, defaulting to %(42)s')

That is, existing keyword arguments that already have good names (and are pretty much always used as keyword arguments) take format strings. I'm not sure that changing the name of usage= or help= here is really an option. I guess in this case I'm stuck with something like Benjamin's suggestion of adding an additional flag to control which type of formatting, and the corresponding 4 versions of cleanup. Ew. Steve

On Sep 30, 2009, at 11:39 AM, Steven Bethard wrote:
Ah right.
I missed Benjamin's suggestion, but in this case I would say add a flag to ArgumentParser. I'm either going to want {} formatting all or nothing. E.g.

    import argparse
    parser = argparse.ArgumentParser(usage='{prog} [--foo]', format=argparse.BRACES)
    parser.add_argument(
        '--foo', type=int, default=42,
        help='A foo of type {type}, defaulting to {42}')

(although that last looks weird ;). -Barry

On Wed, Sep 30, 2009 at 8:50 AM, Barry Warsaw <barry@python.org> wrote:
Yep, sorry, typo, that should have been %(default)s, not %(42)s.
Yeah, that's basically Benjamin's suggestion, with the transition path being:

(1) Introduce format= keyword argument, defaulting to PERCENTS
(2) Deprecate format=PERCENTS
(3) Error on format=PERCENTS (Benjamin suggested just changing the default here, but that would give a release where errors would pass silently)
(4) Deprecate format= keyword argument.
(5) Remove format= keyword argument.

It's a little sad that it takes 5 versions to do this, but I guess if a user is on top of things, at version (1) they add format=BRACES to all their code, and then remove those at version (4). So even though there are 5 versions, there are only two code changes required. At least in the case of argparse, this can be a constructor argument as you suggest, and we only have to introduce this flag in one place. Steve
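Stages (1) and (2) of that path can be sketched with a toy class. Parser, PERCENTS, BRACES and format_usage below are hypothetical stand-ins written for this illustration; argparse itself has no such flag.

```python
import warnings

PERCENTS = "percents"
BRACES = "braces"


class Parser:
    """Hypothetical stand-in for an API that stores format strings."""

    def __init__(self, usage, prog="prog", format=PERCENTS):
        if format not in (PERCENTS, BRACES):
            raise ValueError("unknown format style: %r" % (format,))
        if format == PERCENTS:
            # Stage (2): the old style still works, but warns.
            warnings.warn(
                "format=PERCENTS is deprecated; pass format=BRACES",
                DeprecationWarning,
                stacklevel=2,
            )
        self.usage = usage
        self.prog = prog
        self.format = format

    def format_usage(self):
        values = {"prog": self.prog}
        if self.format == BRACES:
            return self.usage.format(**values)
        return self.usage % values


print(Parser("{prog} [--foo]", format=BRACES).format_usage())
```

At stage (3) the warning would be replaced by an exception, and at stages (4) and (5) the keyword itself would be deprecated and then removed, leaving only the str.format branch.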

Unless there is a firm decision to switch to kill %-formatting across the board, I don't think anything should be done at all. Creating Py3.x was all about removing cruft and clutter. I don't think it would be improved by adding two ways to do it for everything in the standard library. That is a lot of additional code, API expansion, and new testing, fatter docs, and extra maintenance, but giving us no new functionality. Anytime we start hearing about newstyle/oldstyle combinations, I think a flag should go up. Anytime there is a proposal to make sweeping additions that do not add new capabilities, a flag should go up. I understand the desire to have all formatting support both ways, but I don't think it is worth the costs. People *never* need both ways though they may have differing preferences about which *one* to use. my-two-cents, Raymond

Unfortunately, as Steven pointed out, the parameter is *already* documented with the name "fmt". So one option would be to call it "fmt" and "format"; the other option would be to not only deprecate the positional passing, but also the passing under the name fmt=. As for calling it "dicttemplate" - I'm sure people can and will propose alternative spellings :-) Regards, Martin

Steven Bethard <steven.bethard <at> gmail.com> writes:
In logging at least, there are two different places where the formatting issue crops up. The first is creating the "message" part of the logging event, which is made up of a format string and arguments. The second is the one Steven's mentioned: formatting the message along with other event data such as time of occurrence, level, logger name etc. into the final text which is output. Support for both % and {} forms in logging would need to be considered in these two places.

I sort of liked Martin's proposal about using different keyword arguments, but apart from the ugliness of "dicttemplate" and the fact that "fmt" is already used in Formatter.__init__ as a keyword argument, it's possible that two different keyword arguments "fmt" and "format" both referring to format strings might be confusing to some users. Benjamin's suggestion of providing a flag to Formatter seems slightly better, as it doesn't change what existing positional or keyword parameters do, and just adds an additional, optional parameter which can start off with a default of False and transition to a default of True.

However, AFAICT these approaches only cover the second area where formatting options are chosen - not the creation of the message from the parameters passed to the logging call itself. Of course one can pass arbitrary objects as messages which contain their own formatting logic. This has been possible since the very first release but I'm not sure that it's widely used, as it's usually easier to pass strings. So instead of passing a string and arguments such as

    logger.debug("The %s is %d", "answer", 42)

one can currently pass, for a fictitious class PercentMessage,

    logger.debug(PercentMessage("The %s is %d", "answer", 42))

and when the time comes to obtain the formatted message, LogRecord.getMessage calls str() on the PercentMessage instance, whose __str__ will use %-formatting to get the actual message.
Of course, one can also do for example

    logger.debug(BraceMessage("The {} is {}", "answer", 42))

where the __str__() method on the BraceMessage will do {} formatting. Of course, I'm not suggesting we actually use the names PercentMessage and BraceMessage, I've just used them there for clarity.

Also, although Raymond has pointed out that it seems likely that no one ever needs *both* types of format string, what about the case where application A depends on libraries B and C, and they don't all share the same preferences regarding which format style to use? ISTM no-one's brought this up yet, but it seems to me like a real issue. It would certainly appear to preclude any approach that configured a logging-wide or logger-wide flag to determine how to interpret the format string.

Another potential issue is where logging events are pickled and sent over sockets to be finally formatted and output on different machines. What if a sending machine has a recent version of Python, which supports {} formatting, but a receiving machine doesn't? It seems that at the very least, it would require a change to SocketHandler and DatagramHandler to format the "message" part into the LogRecord before pickling and sending. While making this change is simple, it represents a potential backwards-incompatible problem for users who have defined their own handlers for doing something similar.

Apart from thinking through the above issues, the actual formatting only happens in two locations - LogRecord.getMessage and Formatter.format - so making the code do either %- or {} formatting would be simple, as long as it knows which of % and {} to pick. Does it seem too onerous to expect people to pass an additional "use_format" keyword argument with every logging call to indicate how to interpret the message format string? Or does the PercentMessage/BraceMessage type approach have any mileage? What do y'all think? Regards, Vinay Sajip

On Wed, Sep 30, 2009 at 16:03, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
I personally prefer the keyword argument approach to act as a flag, but that's me. As for the PercentMessage/BraceMessage, I would make sure that you just simply take the string format and simply apply the arguments later to cut down on the amount of parentheses butting up against each other: ``logger.debug(BraceMessage("The {} is {}"), "answer", 42)``. It's still an acceptable solution that provides a clear transition: simply provide the two classes, deprecate PercentMessage or bare string usage, require BraceMessage, remove requirement. This wrapper approach also provides a way for libraries that have not shifted over to still work with PEP 3101 strings by letting the user wrap the string to be interpolated themselves and then to pass it in to the libraries. It's just unfortunate that any transition would have this cost of wrapping all strings for a while. I suspect most people will simply import the wrapping class and give it some short name like people do with gettext. -Brett

Brett Cannon <brett <at> python.org> writes:
The problem with that is that BraceMessage.__str__() wouldn't know what arguments to use to produce the message.
Yes, logger.debug(__("The {} is {}", "answer", 42)) isn't ideal but perhaps liveable with. And hopefully with a decent editor, the paren-butting annoyance will be minimized. Regards, Vinay Sajip

Antoine Pitrou wrote:
As someone who likes .format() and who already uses such bound methods to print, such as in

    emsg = "...".format
    ...
    if c: print(emsg(arg, barg))

I find this **MUCH** preferable to the ugly and seemingly unnecessary wrapper class idea being bandied about. This would be scarcely worse than passing the string itself. Terry Jan Reedy
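An API that accepts either a %-style string or a callable such as a bound str.format method could be sketched as below. The function emit is a hypothetical example, not an existing library call.

```python
def emit(template, *args):
    """Hypothetical API: accepts a %-style string or any callable."""
    if callable(template):
        return template(*args)  # e.g. a bound "...".format method
    return template % args      # legacy %-style template


emsg = "The {0} is {1}".format

print(emit(emsg, "answer", 42))
print(emit("The %s is %d", "answer", 42))
```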

On Thu, Oct 1, 2009 at 10:49 PM, Terry Reedy <tjreedy@udel.edu> wrote:
But it's not much of a transition plan. Or are you suggesting:

(1) Make API accept callables
(2) Issue warnings for regular strings
(3) Throw exceptions for regular strings
(4) Allow regular strings again, but assume {}-style formatting

? Steve

Steven Bethard <steven.bethard <at> gmail.com> writes:
But it's not much of a transition plan. Or are you suggesting:
The question is why we want a transition plan that will bother everyone with no tangible benefits for the user. Regards Antoine.

Has anyone considered the idea of having the string % operator behave intelligently according to the contents of the format string? If it contains one or more valid %-formats, use old-style formatting; if it contains one or more valid {}-formats, use new-style formatting. Ambiguous cases could arise, of course, but hopefully they will be fairly rare, and raising an exception would point out the problem and allow it to be fixed. -- Greg

On Fri, Oct 2, 2009 at 6:29 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Hm... The % operator already does too much guessing: if the string contains exactly one %-format, the argument may be either a size-1 tuple or a non-tuple, otherwise it has to be a size-N tuple, except if the %-formats use the %(name)X form, then the argument must always be a dict. It doesn't sound to me as if adding more guesswork is going to improve its reliability. --Guido van Rossum
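The cases described here can be tried directly. The snippet below is a small illustration of the existing behaviour of the % operator, with str.format shown last for contrast; the variable names are arbitrary.

```python
pair = (1, 2)

print("%s" % "spam")          # one placeholder, non-tuple argument
print("%s" % ("spam",))       # one placeholder, size-1 tuple: same result
print("%s" % (pair,))         # a tuple *value* must itself be wrapped in a tuple
print("%(x)s" % {"x": pair})  # the %(name)s form takes a mapping instead

try:
    "%s" % pair               # read as two arguments for one placeholder
except TypeError as exc:
    print("TypeError:", exc)

print("{0}".format(pair))     # str.format takes arguments explicitly
```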

On Fri, Oct 2, 2009 at 2:34 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I think Guido expressed my feelings pretty well: On Wed, Sep 30, 2009 at 10:37 AM, Guido van Rossum <guido@python.org> wrote:
I agree with this 100% but I can't see it working unless we have some sort of transition plan. Just saying "ok, switch your format strings from % to {}" didn't work in Python 3.0 for various good reasons, and I can't imagine it will work in Python 4.0 unless we have a transition plan. Steve

[Steven Bethard]
Do the users get any say in this? I imagine that some people are heavily invested in %-formatting. Because there has been limited uptake on {}-formatting (afaict), we still have limited experience with knowing that it is actually better, less error-prone, easier to learn/remember, etc. Outside a handful of people on this list, I have yet to see anyone adopt it as the preferred syntax. Raymond

Raymond Hettinger <python <at> rcn.com> writes:
It is known to be quite a bit slower. The following timings are on the py3k branch:

- with positional arguments:

    $ ./python -m timeit -s "s='%s %s'; t = ('hello', 'world')" "s % t"
    1000000 loops, best of 3: 0.313 usec per loop
    $ ./python -m timeit -s "f='{} {}'.format; t = ('hello', 'world')" "f(*t)"
    1000000 loops, best of 3: 0.572 usec per loop

- with named arguments:

    $ ./python -m timeit -s "s='%(a)s %(b)s'; d = dict(a='hello', b='world')" "s % d"
    1000000 loops, best of 3: 0.387 usec per loop
    $ ./python -m timeit -s "f='{a} {b}'.format; d = dict(a='hello', b='world')" "f(**d)"
    1000000 loops, best of 3: 0.581 usec per loop

Regards Antoine.
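The same comparison can be repeated from inside Python with the timeit module. This is only a rough sketch: the loop count is an arbitrary choice, each case pays for an extra lambda call, and the absolute numbers will vary by interpreter version and machine, so no figures are claimed here.

```python
import timeit

t = ("hello", "world")
d = dict(a="hello", b="world")

cases = {
    "% positional": lambda: "%s %s" % t,
    "{} positional": lambda: "{} {}".format(*t),
    "% named": lambda: "%(a)s %(b)s" % d,
    "{} named": lambda: "{a} {b}".format(**d),
}

number = 100000
results = {}
for label, fn in cases.items():
    # Best of 3 runs, reported as microseconds per call.
    best = min(timeit.repeat(fn, number=number, repeat=3))
    results[label] = best / number * 1e6
    print("%-14s %.3f usec per loop" % (label, results[label]))
```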

"Raymond Hettinger" <python@rcn.com> writes:
I'm a user! :-) I hate calling methods on string literals, I think it looks very odd to have code like this: "Displaying {0} of {1} revisions".format(x, y) Will we be able to write this as "Displaying {0} of {1} revisions" % (x, y) too?
I've skimmed over the PEP, and the new {}-syntax seems to have some nice features. But I've not seen it used anywhere yet. -- Martin Geisler

<delurk> Rami Chowdhury posted this to a mailing list; I've been using it (perhaps unintentionally promoting it) as part of non-English, non-ASCII font outreach:
As a user, my assumption was {} was going forward, rain or shine, and everyone should be on board by Python 3.2. (I thought once the Talin PEP got approved, that was it). I wrote Steven Bethard privately about this. Sorry for the intrusion. </delurk>

What about using string prefix 'f'?

    f"{foo} and {bar}" % something == "{foo} and {bar}".format(something)

    s = f"{foo}"
    t = "%(bar)s"
    s + t  # raises Exception

Transition plan:

n: Just add F prefix. And adding "format_string" in future.
n+1: deprecate __mod__() without 'F'.
n+2: libraries use .format() and deprecate __mod__() with 'F'
n+3: remove __mod__()

-- Naoki INADA <songofacandy@gmail.com>

Carl Trachte wrote:
I've skimmed over the PEP, and the new {}-syntax seems to have some nice features. But I've not seen it used anywhere yet.
I am using it with 3.1 in an unreleased book I am still writing, and will in any code I publish.
Autonumbering, added in 3.1, makes '{}' as easy to write for simple cases as '%s'. That was one complaint about the original 3.0 version. Another was and still is the lack of conversion, which is being worked on.
tjr
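As a small illustration of the autonumbering mentioned above (assuming Python 3.1 or later), the explicit and automatic spellings give the same result, and the two cannot be mixed within one format string:

```python
explicit = "{0} of {1}".format(3, 10)  # explicit indices, the only form in 3.0
auto = "{} of {}".format(3, 10)        # autonumbered fields, 3.1 and later

print(explicit)
print(auto)

try:
    "{} of {0}".format(3, 10)          # mixing the two styles is rejected
except ValueError as exc:
    print("ValueError:", exc)
```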

On Fri, Oct 2, 2009 at 12:43 PM, Martin Geisler <mg@lazybytes.net> wrote:
I doubt it. One of the major complaints about the %-style formatting was that the use of % produced (somewhat) unexpected errors because of how operator precedence works::
Steve

Steven Bethard wrote:
The other major problem with the use of the mod operator is the bugs encountered with "fmt % obj" when obj happened to be a tuple or a dict. So no, the switch to a method rather than an operator was deliberate. Cheers, Nick.
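The operator-precedence complaint can be illustrated as follows. This is one example of the kind of surprise being discussed, chosen for this sketch, and not necessarily the example from the original message: % binds as tightly as * and more tightly than +, so arithmetic written after the format string is applied to the formatted result.

```python
x, y = 2, 3

print("%s" % x * y)         # parsed as ("%s" % x) * y, i.e. string repetition
print("%s" % (x * y))       # parentheses are needed to format the product
print("{0}".format(x * y))  # a method call has no such ambiguity

try:
    "%s" % x + y            # parsed as ("%s" % x) + y
except TypeError as exc:
    print("TypeError:", exc)
```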

On Fri, Oct 2, 2009 at 11:56 AM, Raymond Hettinger <python@rcn.com> wrote:
Sure, I guess this is a possibility too, and it could make the transition process I have to work through for argparse much easier. ;-) To be clear, are you suggesting that APIs that currently support only %-formatting shouldn't bother supporting {}-formatting at all? Or are you suggesting that they should support both, but support for %-formatting should never go away? Steve

Raymond Hettinger wrote:
A self-fulfilling prophecy if ever I heard one... uptake is limited because there's a large legacy code base that doesn't use it and many APIs don't support it, so we shouldn't bother trying to increase the number of APIs that *do* support it? I'm starting to think that a converter between the two format mini-languages may be the way to go though. fmt_braces is meant to provide a superset of the capabilities of fmt_percent, so a forward converter shouldn't be too hard. A reverse converter may have to punt with ValueError when it finds things that cannot be expressed in the fmt_percent mini language though. Cheers, Nick.

Nick Coghlan <ncoghlan <at> gmail.com> writes:
I've done a first cut of a forward (% -> {}) converter: http://gist.github.com/200936 but I'm not sure there's a case for a converter in the reverse direction, if we're encouraging movement in one particular direction. Regards, Vinay Sajip
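To give a feel for what a forward converter involves, here is a deliberately small sketch. It is not the code from the gist above; the function name percent_to_brace and the supported subset are assumptions made for this illustration. It handles %%, optional (name) keys, width and precision for s, r, d, i, x, X, o, e, E, f, g and G, and raises ValueError for everything else, including all flags, '*' widths, and integer precisions. One known difference it does not paper over: %d accepts a float and truncates it, while {:d} raises.

```python
import re

_SPEC = re.compile(
    r"%(?:\((?P<name>[^)]*)\))?"     # optional (name)
    r"(?P<flags>[#0\- +]*)"          # flags, which this sketch rejects
    r"(?P<width>\d+|\*)?"
    r"(?:\.(?P<precision>\d+|\*))?"
    r"[hlL]?"                        # length modifier, ignored by Python
    r"(?P<type>.)",
    re.DOTALL,
)
_INT_TYPES = {"d": "d", "i": "d", "x": "x", "X": "X", "o": "o"}
_FLOAT_TYPES = frozenset("eEfgG")


def _convert(m):
    if m.group(0) == "%%":
        return "%"
    kind = m.group("type")
    name, width, precision = m.group("name", "width", "precision")
    if m.group("flags") or "*" in (width, precision):
        raise ValueError("unsupported flags or '*' in %r" % (m.group(0),))
    if name is not None and not name.isidentifier():
        raise ValueError("cannot express key %r as a {} field" % (name,))
    field = name or ""  # an empty field name means autonumbering
    width = width or ""
    prec = "" if precision is None else "." + precision
    if kind in ("s", "r"):
        # %-formatting right-aligns strings; str.format left-aligns them.
        spec = (">" + width if width else "") + prec
        return "{%s!%s%s}" % (field, kind, ":" + spec if spec else "")
    if kind in _INT_TYPES and not prec:
        return "{%s:%s%s}" % (field, width, _INT_TYPES[kind])
    if kind in _FLOAT_TYPES:
        return "{%s:%s%s%s}" % (field, width, prec, kind)
    raise ValueError("unsupported conversion %r" % (m.group(0),))


def percent_to_brace(template):
    """Translate a %-style template into an equivalent {}-style one."""
    out, pos = [], 0
    while True:
        i = template.find("%", pos)
        literal = template[pos:] if i < 0 else template[pos:i]
        # Literal braces must be doubled in a {}-style template.
        out.append(literal.replace("{", "{{").replace("}", "}}"))
        if i < 0:
            return "".join(out)
        m = _SPEC.match(template, i)
        if m is None:
            raise ValueError("incomplete format at index %d" % i)
        out.append(_convert(m))
        pos = m.end()


print(percent_to_brace("%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
print(percent_to_brace("The %s is %d"))
```

Rejecting flags outright sidesteps the harder cases, which is also where a real converter would need the most care.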

Vinay Sajip wrote:
It would allow an application to still use brace formatting throughout even if one particular library only accepted percent formatting. Probably not worth the effort at this point though, as if we can get a reliable forward converter happening then it may become possible for APIs to correctly guess which kind of format string they have been passed. Cheers, Nick.

On Oct 2, 2009, at 2:56 PM, Raymond Hettinger wrote:
Well, I actually think it was a pretty bad idea to introduce {} formatting, because %-formatting is well-known in many other languages, and $-formatting is used by basically all the rest. So the introduction of {}-formatting has always seemed silly to me, and I wish it had not happened. HOWEVER, much worse than having a new, different, and strange formatting convention is having *multiple* formatting conventions arbitrarily used in different places within the language, with no rhyme or reason. So, given that brace-formatting was added, and that it's been declared the way forward, I'd *greatly* prefer it taking over everywhere in python, instead of having to use a mixture. James

That doesn't mean we have to have a transition plan *now*. Creating one after Python 3.5 is released (i.e. in 2015 or so) might be sufficient. To create a transition plan, you first need *consensus* that you actually do want to transition. I don't think such consensus is available, and might not be available for a few more years. Pushing the issue probably delays obtaining consensus. Regards, Martin

Martin v. Löwis wrote:
Agreed, but that doesn't rule out discussions of what can be done to make such a transition easier. And just as 2to3 makes the overall Python transition practical, a percent to brace format translator should make an eventual formatting transition feasible. Cheers, Nick.

Steven Bethard wrote:
It is a 'plan' to transition from not being able to use the new formatting, which I prefer, throughout the stdlib, to being able to do so. I believe most, even if not all, find that acceptable. Certainly, I think you should be able to implement the above for argparse before submitting it. And I would hope that 3.2, in a year, is generally .format usable. This is the first step in a possible long-term replacement, but there is currently no consensus to do any more than this. So I think it premature to do any more. I would agree, for instance, that an auto-translation tool is needed. Terry Jan Reedy

2009/10/3 Brett Cannon <brett@python.org>:
I've already started a converter. It's here: https://code.launchpad.net/~gutworth/+junk/mod2format -- Regards, Benjamin

Brett Cannon <brett <at> python.org> writes:
I've done a first cut of a converter from %-format to {}-format strings. I'm not sure where you want to put it in the sandbox, I've created a gist on GitHub: http://gist.github.com/200936 Not thoroughly tested, but runs in interactive mode so you can try things out. All feedback appreciated! Regards, Vinay

Raymond Hettinger <python <at> rcn.com> writes:
We should get one written. ISTM, every %-formatting string is directly translatable to an equivalent {}-formatting string.
I've made a start, but I'm not sure how best to handle the '#' and ' ' conversion flags. Regards, Vinay Sajip

Raymond Hettinger <python <at> rcn.com> writes:
We should get one written. ISTM, every %-formatting string is directly translatable to an equivalent {}-formatting string.
I'm not sure you can always get equivalent output from the formatting, though. For example:
Someone please tell me if there's a better {}-format string which I've missed! Regards, Vinay Sajip

MRAB <python <at> mrabarnett.plus.com> writes:
"{0:#08x}".format(0x1234) '0x001234'
Good call, but here's another case:
"%#o" % 0x1234 '011064'
I don't see how to automatically convert the "%#o" spec, though of course there are ways of fudging it. The obvious conversion doesn't give the same value:
"{0:#o}".format(0x1234) '0o11064'
I couldn't see a backward-compatibility mode for str.format generating just a leading "0" for octal alternative format (the C style) as opposed to "0o". Regards, Vinay Sajip
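
One way of fudging the octal case by hand, as a rough sketch. It assumes Python 3, where the "0o" prefix is what "#o" produces, so the C-style single leading "0" has to be built manually. The helper name c_octal is made up for illustration, and it ignores field widths entirely.

```python
def c_octal(n):
    """Hypothetical helper: render n with a single leading "0", C style."""
    digits = format(abs(n), "o")
    sign = "-" if n < 0 else ""
    # C prints a plain "0" for zero rather than "00".
    prefix = "" if n == 0 else "0"
    return sign + prefix + digits


python_style = "{0:#o}".format(0x1234)  # Python-style prefix, "0o..."
c_style = c_octal(0x1234)               # C-style prefix, "0..."
```

Because the prefix is glued on outside the format spec, this only covers the case with no field width; padding would have to be applied to the finished string.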

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Apart from the sheer unreadability of the {}-style format string, the result looks rather unexpected from a human being's point of view. (in those situations, I would output the 0x manually anyway, such as:
Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
Well of course, but I asked the question in the context of providing an *automatic* converter from %-format strings to {}-format. At the moment, it doesn't seem like a 100%-faithful automatic conversion is feasible. Regards, Vinay Sajip

Antoine Pitrou wrote:
"#" formatting was added to int.__format__ in order to support this case:
format(10, '#6x') '   0xa'
Without '#', there's no way to specify a field width but still have the '0x' up against the digits (at least not without generating an intermediate result and figuring out the width manually). The fact that it works in combination with '0' or '>' (not sure which one makes it unreadable to you) wasn't really the point of the feature. Eric.
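
A small sketch of the point about '#' and field widths, using only the built-in format(). The manual variant shows the intermediate-result approach that '#' makes unnecessary.

```python
# With '#': the prefix is part of the field, padded as one unit.
with_prefix = format(10, "#6x")

# Without '#': only the digits are padded, and there is no prefix at all.
without_prefix = format(10, "6x")

# The manual route: build "0xa" first, then pad the resulting string.
manual = format("0x" + format(10, "x"), ">6")
```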

Antoine Pitrou wrote:
The percent format string is pretty unreadable too - you're just more used to it, so it doesn't look as weird :) Vinay's problem above is due to using the wrong alignment flag: ">", which says to right align everything, instead of "=", which says to left align the sign character and the numeric prefix with the fill character inserted in the middle. In this particular case he could also use the zero-padding shortcut which leaves out the alignment flag altogether (and implies a "0=" alignment format). That is (using 2.6/3.1):
Adding in the sign bit gives the following translations:
Note that ">" alignment is actually now *broken* on trunk and py3k, since ">" and "=" are now behaving exactly the same instead of the former right aligning the entire number including the sign bit and prefix:
(bug assigned to Eric: http://bugs.python.org/issue7081) Note that, since percent formatting doesn't allow specification of the fill characters or the field alignment, translations should probably rely on the simple field width specifier, optionally selecting zero padding by preceding it with a zero. It should never be necessary to use the full alignment spec for translated formats. The other thing to keep in mind is that brace formatting is fussier about the order of things - items *must* appear in the order they are listed in PEP 3101 (i.e. if wanting a zero padded field with leading sign and numeric prefix, you must write "+#0"). Percent format, on the other hand, allows the "#", "+" and "0" to be placed in any order you like (although they must appear before the field width definition, precision specifier and type code). As far as I can see, that leaves the prefixing of octal numbers ("0o" vs "0") as the only true incompatibility between percent formatting and brace formatting, and even for those the incompatibility is limited to cases where a field width is specified without leading zeroes or a sign character is specified. In other cases, the translation can just stick a leading literal "0" in front of the field in the brace formatting string. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
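
The translation rules above can be checked with a short sketch, assuming a current Python 3 where ">" and "=" behave differently again. The list of spec pairs is only an illustration, not an exhaustive mapping.

```python
value = 0x1234

# (percent spec, brace spec) pairs that should render identically.
pairs = [
    ("%#08x", "{0:#08x}"),
    ("%+#010x", "{0:+#010x}"),
    ("%#+010x", "{0:+#010x}"),  # percent flags in a different order
    ("%8x", "{0:8x}"),
]
results = [(p % value, b.format(value)) for p, b in pairs]

# Explicit alignment: ">" pads in front of the prefix, "=" pads after it.
right_aligned = format(value, "0>#8x")
internal = format(value, "0=#8x")

# Brace formatting rejects flags in the wrong order.
try:
    "{0:#+x}".format(value)
    order_is_strict = False
except ValueError:
    order_is_strict = True
```

Only the simple width and zero-padding forms are needed for translated formats; the explicit alignment lines are there to show why ">" was the wrong choice earlier in the thread.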

Nick Coghlan <ncoghlan <at> gmail.com> writes:
[snip]
Helpful analysis there, Nick, thanks. Bonzer ;-) There's also the corner case of things like %#.0f which, when asked to format 3e100, will print 3.e+100 whereas the translated format {0:.0f}, will print 3e+100 for the same value. BTW I sent Eric a private mail re. the "0o" versus "0" issue, to see if it was worth raising an enhancement request on the bug tracker using "O" to generate compatible rendering for octals. Regards, Vinay Sajip
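
A sketch of the alternate-form float corner case with a small value. It assumes a current Python 3, where str.format accepts "#" for floats; that was not yet available when this thread was written.

```python
x = 3.0

percent_alt = "%#.0f" % x           # alternate form keeps the decimal point
brace_plain = "{0:.0f}".format(x)   # naive translation drops it
brace_alt = "{0:#.0f}".format(x)    # "#" restores it in current Python 3
```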

Vinay Sajip wrote:
I didn't get your message, could you resend? I was thinking the same thing, but it seems like a transition step. I'd rather not keep such backward compatibility hacks (if you will) around for the long haul. How about a flag (maybe '*') at the start of the format specification which says "operate in backward compatibility mode"? We could document it as being only useful for the % to {} translator, and promise to remove it at some point in the future. Either actually deprecate it or just promise to deprecate it in the future. Eric.

Eric Smith wrote at Thu, 08 Oct 2009 10:24:33 -0400:
That doesn't seem very useful to me. IIUC, the point of the translator is to allow porting of the millions of existing %-formatting operations to the new-style .format. If the result of that is deprecated or removed a few years from now, all maintainers of long existing code have exactly the same problem. IMHO, either the translation is done once and gives identical output or it isn't worth doing at all. -- Christian Tanzer http://www.c-tanzer.at/

Christian Tanzer wrote:
I was thinking of it as a transition step until all application code switched to {} formatting. In which case the application has to deal with it.
IMHO, either the translation is done once and gives identical output or it isn't worth doing at all.
I disagree. I doubt even 0.001% of all format strings involve octal formatting. Is it really worth not providing a transition path if it can't cover this case? Eric.

Benjamin Peterson wrote:
That works so long as the original format string doesn't specify either a space padded field width or else a sign character. For those the extra zero needs to be inserted after the leading characters but before the number, so the formatting engine really has to handle it. I'm actually thinking that having the ability to specify a single 0 as the leading character for octal output is a legitimate feature. There are plenty of other tools out there that use a single leading zero to denote octal numbers (e.g. think of a Python script that generated C code), so having Python be able to produce such numbers makes a lot of sense. Vinay's suggestion of using 'O' instead of 'o' to denote C-style octal formatting instead of Python-style sounds reasonable to me (similar in degree to the upper vs lower case distinction for 'x' and 'X' hex formatting). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan wrote:
Mark points out in http://bugs.python.org/issue7094 that we'd also need to add alternate float formatting for any automated translation facility to work flawlessly. There might be other float issues involving trailing decimals with no zeros that work differently, too. Eric.

Eric Smith wrote at Thu, 08 Oct 2009 10:55:21 -0400:
You lost me here. All that talk of deprecating %-formatting makes me really nervous. %-formatting is pervasive in all existing Python code. Without an automatic translator that is 100% accurate, porting all that code to {}-formatting is not possible. Heck, it's not even possible to grep for all instances of %-formatting. How do you suppose that maintainers could ever do the transition from %- to {}-formatting manually?
If %-formatting is first deprecated then removed from Python and there is no automatic transition path that effectively means that existing code using %-formatting is forced to stay at whatever Python version was the last one supporting %-formatting. I surely hope nobody is seriously considering such a scenario. Perl 6 seems harmless in comparison. -- Christian Tanzer http://www.c-tanzer.at/

Christian Tanzer wrote:
That is vastly overstating it. Making 'with' and 'as' keywords and removing string exceptions (which have already happened) will affect far more programs than a minor incompatibility in transitioning string formatting. Michael

Michael Foord wrote at Thu, 08 Oct 2009 16:56:35 +0100:
`with` and `as` are trivial to fix and certainly not pervasive in existing code. String exceptions have been deprecated for years. -- Christian Tanzer http://www.c-tanzer.at/

On Thu, Oct 8, 2009 at 8:08 AM, Christian Tanzer <tanzer@swing.co.at> wrote:
This is pretty much the situation with integer division (you can only recognize it by running the code), and yet we figured a way to change that in 3.x. Or take classic classes vs. new-style classes. They cannot be translated 100% automatically either. The solution is to support the old and new style in parallel for a really long time -- we did this with int division (read PEP 238), we did it with classes, and we can do it again with formatting. Unless I missed something, we're not planning to remove %-formatting until Python 4.0 comes along, which we won't even start until a long time after everyone has switched to some version of 3.x. So the same approach will apply: support both forms, nudge people to start using the new form, wait, nudge some more, etc. So, yes, we will continue to make noise about this. And yes you should opportunistically migrate your code to {}-formatting, like when you're rewriting some code anyway. One of the nice things about {}-formatting is that in most cases (things like the logging API excluded) you can change it one format string at a time. And no, the sky isn't falling. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

[Christian Tanzer]
How do you suppose that maintainers could ever do the transition from %- to {}-formatting manually?
[Guido van Rossum]
This is pretty much the situation with integer division (you can only recognize it by running the code),
Do you think there may be some possible parallel to the -3 option to flag cases of %-formatting? If so, that could be helpful.
I already have some code that mixes the styles (using {} for new stuff). Raymond

On Thu, Oct 8, 2009 at 10:14 AM, Raymond Hettinger <python@rcn.com> wrote:
Do you think there may be some possible parallel to the -3 option to flag cases of %-formatting? If so, that could be helpful.
Absolutely. This should be simple, since there are just one or two places to put the warning. We might also automatically turn it on when Python 2.7 is run with -3. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

I didn't get your message, could you resend?
Resent, it may have been stopped by your spam filters since it came from my vinay-underscore-sajip-at-red-hyphen-dove-dot-com address. The subject was "Python str.format() and octal formatting compatibility".
I don't much mind exactly which mechanism we use to distinguish between 0o and 0 prefixes, as long as it's one most people are happy with :-) Regards, Vinay Sajip

On approximately 10/8/2009 7:24 AM, came the following characters from the keyboard of Eric Smith:
Seems like the ability for Python {} formatting to be able to match not only old Python % formatting output, but also output created by C's sprintf, and other numeric formatting systems, makes this particular feature useful in more scenarios than a "backward compatibility hack". If you want to replace a C program that produces parsed output in a given format, and that given format includes leading-0-octal numbers, then it would be good to have the capability in Python .format, even though Python itself uses 0o prefix. Similar arguments may apply anywhere else that sprintf produces something that .format cannot currently produce. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

On approximately 9/30/2009 4:03 PM, came the following characters from the keyboard of Vinay Sajip:
It seems to me that most of the discussion in this thread is concerned with the first issue... and yet I see the second as the harder issue, and it has gotten less press. Abstracting this away from logger, I think the problem has three cases:

1) Both the format message and all the parameters are supplied in a single API call. This is really a foolish API, because

    def API( fmt, p1, p2, p3 ):
        str = fmt % (p1, p2, p3)

could have just as easily been documented originally as

    def API( str ):

where the user is welcome to supply a string such as

    API( fmt % (p1, p2, p3 ))

and if done this way, the conversion to .format is obvious... and all under the user's control.

2) The format message and the parameters are supplied to separate APIs, because the format message is common to many invocations of the other APIs that supply parameters, and is cached by the API. This is sufficient to break the foolishness of #1, but is really just a subset of #3, so any solutions to #3 apply here.

3) The format message and the parameters for it may be supplied by the same or separate APIs, but one or both are incomplete, and are augmented by the API. In other words, one or both of the following cases:

3a) The user supplied format message may include references to named parameters that are documented by the API, and supplied by the API, rather than by the user.

3b) The user supplied format string may be embedded into a larger format string by the API, which contains references to other values that the user must also supply.

In either case of 3a or 3b, the user has insufficient information to perform the whole format operation and pass the result to the API. In both cases, the API that accepts the format string must be informed whether it is a % or {} string, somehow. This could be supplied to the API that accepts the string, or to some other related API that sets a format mode. Internally, the code would have to be able to manipulate both types of formats.
The above three paragraphs are unclear to me. I think they might be referring to case 2 or 3, though.
It seems that the above is only referring to case 1? And doesn't help with case 2 or 3?
Agreed here... a single global state would not make modular upgrades to a complex program easy... the state would be best included with particular instance objects, especially when such instance objects exist already. The format type parameter could be provided to the instance, instead of globally.
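
A minimal sketch of carrying the format type on the instance rather than in global state. The class name TemplateHolder and its style parameter are hypothetical, chosen only to illustrate case 3a, where the API rather than the caller supplies the named values.

```python
class TemplateHolder:
    """Hypothetical API object that is told which style its template uses."""

    def __init__(self, template, style="%"):
        if style not in ("%", "{"):
            raise ValueError("style must be '%' or '{'")
        self.template = template
        self.style = style

    def render(self, **api_values):
        # The API fills in the values; the caller only chose the template.
        if self.style == "%":
            return self.template % api_values
        return self.template.format(**api_values)


old = TemplateHolder("%(name)s: %(msg)s")
new = TemplateHolder("{name}: {msg}", style="{")
```

Two instances with different styles can coexist in one program, which is the modular-upgrade property a single global switch would not give.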
These last 3 paragraphs seem to be very related to logger, specifically. The first of the 3 does point out a concern for systems that interoperate across networks: if the format strings and parameters are exposed separately across networks, whatever types are sent must be usable at the receiver, or at least appropriate version control must be required so that incompatible systems can be detected and reported. On approximately 9/30/2009 5:47 PM, came the following characters from the keyboard of Antoine Pitrou:
This "callable" technique seems to only support case 1 and 2, but not 3, unless I misunderstand it. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

Glenn Linderman wrote:
The lazy APIs actually make a lot of sense, particularly when there is a chance that the function being called may be able to avoid the formatting call altogether. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Steven Bethard wrote:
I don't agree that we should do that. I see nothing wrong with using % substitution.
We should really provide some sort of transition plan.
-1.
Now, that's different. IIUC, you are not actually performing the substitution here, but only at a later place. So changing to the new formatting mechanism is an API change. I agree that the new placeholder syntax needs to be supported.
It's also ugly because the class has the word "new" in its name, which no class should ever have. In a few years, it would still be around, but not new anymore.
I don't see the point of having a converter. The tricky part, as you say, is the guessing. Whether the implementation then converts the string or has two alternative formatting algorithms is an implementation detail. I would favor continued use of the actual % substitution code. I would propose that the format argument gets an argument name, according to the syntax it is written in. For PEP 3101 format, I would call the argument "format" (like the method name of the string type), i.e. logging.Formatter( format="{asctime} - {name} - {levelname} - {message}") For the % formatting, I suggest "dicttemplate" (assuming that you *have* to use dictionary %(key)s style currently). The positional parameter would also mean dicttemplate, and would be deprecated (eventually requiring a keyword-only parameter). Regards, Martin
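
The keyword-naming proposal could look roughly like the sketch below. SketchFormatter is a made-up stand-in, not logging.Formatter, and the render method is an assumption added so that the example can be run.

```python
class SketchFormatter:
    """Hypothetical formatter: the argument name selects the syntax."""

    def __init__(self, dicttemplate=None, *, format=None):
        # Exactly one of the two spellings must be given.
        if (dicttemplate is None) == (format is None):
            raise TypeError("pass exactly one of dicttemplate or format")
        self._brace = format is not None
        self._fmt = format if self._brace else dicttemplate

    def render(self, values):
        if self._brace:
            return self._fmt.format(**values)
        return self._fmt % values


record = {"name": "demo", "message": "hello"}
positional = SketchFormatter("%(name)s - %(message)s")       # means dicttemplate
by_keyword = SketchFormatter(format="{name} - {message}")    # PEP 3101 syntax
```

No guessing is involved here: the caller states the syntax by choosing the argument name, and the positional form keeps its old meaning.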

On Tue, Sep 29, 2009 at 8:15 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Just to be clear, I don't think logging is the only place these kind of things happen. Some others I found looking around: BaseHTTPServer.BaseHTTPRequestHandler.error_message_format http://docs.python.org/library/basehttpserver.html#BaseHTTPServer.BaseHTTPRe... BaseHTTPServer.BaseHTTPRequestHandler.log_message http://docs.python.org/3.1/library/http.server.html#http.server.BaseHTTPRequ... email.generator.DecodedGenerator http://docs.python.org/library/email.generator.html#email.generator.DecodedG... There may be more.
This is a nice solution for the cases where we can be confident that the parameter is currently only used positionally. However, at least in Python 3.1, "fmt" is already documented as a keyword parameter: http://docs.python.org/3.1/library/logging.html#logging.Formatter I guess we could follow the same approach though, and have fmt= be the %-style formatting, and use some other keyword argument for {}-style formatting. We've got a similar problem for the BaseHTTPRequestHandler.error_message_format attribute. I guess we'd want to introduce some other attribute which is the error message format for the {}-style formatting? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

I'm resending a message I sent in June, since it seems the same thread has come up again, and I don't believe anybody actually responded (positively or negatively) to the suggestion back then. http://mail.python.org/pipermail/python-dev/2009-June/090176.html On Jun 21, 2009, at 5:40 PM, Eric Smith wrote:
It'd possibly be helpful if there were builtin objects which forced the format style to be either newstyle or oldstyle, independent of whether % or format was called on it. E.g.

    x = newstyle_formatstr("{} {} {}")
    x % (1,2,3) == x.format(1,2,3) == "1 2 3"

and perhaps, for symmetry:

    y = oldstyle_formatstr("%s %s %s")
    y.format(1,2,3) == y % (1,2,3) == "1 2 3"

This allows the format string "style" decision to be made external to the API actually calling the formatting function. Thus, it need not matter as much whether the logging API uses % or .format() internally -- that only affects the *default* behavior when a bare string is passed in. This could allow for a controlled switch towards the new format string format, with a long deprecation period for users to migrate:

1) introduce the above feature, and recommend in docs that people only ever use new-style format strings, wrapping the string in newstyle_formatstr() when necessary for passing to an API which uses % internally.

2) A long time later...deprecate str.__mod__; don't deprecate newstyle_formatstr.__mod__.

3) A while after that (maybe), remove str.__mod__ and replace all calls in Python to % (used as a formatting operator) with .format() so that the default is to use newstyle format strings for all APIs from then on.

James Y Knight wrote:
I must have missed this suggestion when it went past the first time. I certainly like this approach - it has the virtue of only having to solve the problem once, and then application developers can use it to adapt any existing use of %-mod formatting to str.format formatting. Something like:

    class formatstr(str):
        def __mod__(self, other):
            if isinstance(other, dict):
                return self.format(**other)
            if isinstance(other, tuple):
                return self.format(*other)
            return self.format(other)

APIs that did their own parsing based on %-formatting codes would still break, as would any that explicitly called "str" on the object (or otherwise stripped the subclass away, such as via "'%s' % fmt"), but most things should pass a string subclass through transparently. I wouldn't bother with a deprecation plan for 'normal' %-formatting though. I don't think it is going to be practical to actually get rid of that short of creating Python 4.0. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
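
A runnable sketch of how such a subclass would be used against code that only knows the % operator. The legacy_api function is hypothetical and stands in for any library that applies % internally.

```python
class formatstr(str):
    """str subclass whose % operator applies str.format instead."""

    def __mod__(self, other):
        if isinstance(other, dict):
            return self.format(**other)
        if isinstance(other, tuple):
            return self.format(*other)
        return self.format(other)


def legacy_api(fmt, *args):
    """Hypothetical library function that only knows %-formatting."""
    return fmt % args


# The library still uses %, but the caller writes a brace-style template.
via_wrapper = legacy_api(formatstr("{0} + {1} = {2}"), 1, 2, 3)
via_dict = formatstr("{name} is {state}") % {"name": "buildbot", "state": "down"}

# The caveat from the message: "'%s' % fmt" returns a plain str,
# so the subclass (and its behaviour) is lost.
stripped = "%s" % formatstr("{0}")
```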

On Tue, Sep 29, 2009 at 10:04 PM, James Y Knight <foom@fuhm.net> wrote:
So I understand how this might help a user to move from %-style formatting to {}-style formatting, but it's still not clear to me how to use this to evolve an API. In particular, if the goal is for the API to move from accepting %-style format strings to {}-style format strings, how should that API use newstyle_formatstr or oldstyle_formatstr to make this transition? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

2009/9/30 Steven Bethard <steven.bethard@gmail.com>:
IIUC, the API doesn't change. It's just that the internal code goes as follows: 1. (Current) Use %. str and oldformat objects work as now, newformat objects work (using .format). 2. Convert the code to use .format - oldformat and newformat objects work as before, str objects must be in new format. The problem is, there's a point at which bare str arguments change behaviour. So unless people wrap their arguments when calling the API, there's still a point when things break (albeit with a simple workaround available). So maybe we need transition steps - wrap bare str objects in oldformat classes, and warn, then wrap str objects in newformat objects and warn, then stop wrapping. That sounds to me like "normal" usage (bare strings) will be annoyingly painful for a substantial transition period. I'm assuming that the oldformat and newformat classes are intended to be transitional things, and there's no intention that users should be using the wrapper objects always. (And of course better names than "oldformat" and "newformat" are needed - as Martin pointed out, having "old" and "new" in the names is inappropriate). Otherwise, I'm a strong -1 on the whole idea. Paul.

Steven Bethard wrote:
The problem is that many (most?) of the problematic APIs (such as logging) will have multiple users in a given application, so getting the granularity of any behavioural switches right is going to be difficult. Providing a formatstr() type that makes .__mod__() work based on a .format() style string (and a formatmod() type that does the opposite) would allow for extremely fine-grained decision making, since every format string will either be an ordinary str instance or else an instance of the formatting subclass. (Note that the primary use case for both proposed types is an application developer adapting between two otherwise incompatible third party libraries - the choice of class just changes based on whether the old code being adapted is the code invoking mod on a format string or the code providing a format string that expects to be used with the mod operator). I don't see any way for delayed formatting of "normal" strings in any existing API to move away from %-formatting except via a long and painful deprecation process (i.e. first warning when bare strings are used for a release or two, then switching entirely to the new formatting method) or by duplicating the API and maintaining the two interfaces in parallel for the foreseeable future. As Paul noted, the two proposed classes may also be useful to the library developer during such a transition process - they could accept strings in the "wrong" format just by wrapping them appropriately rather than having to maintain the parallel APIs all the way through the software stack. Probably worth letting these concepts bake for a while longer, but it would definitely be nice to do *something* to help enable this transition in 2.7/3.2. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
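
The opposite adapter, formatmod(), might be sketched as below. The exact behaviour (keyword values become the mapping, positional values become the tuple) is an assumption, as is the brace_api function standing in for a library that calls .format() internally.

```python
class formatmod(str):
    """Hypothetical str subclass: %-style text that also answers .format()."""

    def format(self, *args, **kwargs):
        if args and kwargs:
            raise TypeError("a %-style template takes a tuple or a mapping, not both")
        # Delegate to the ordinary % operator of str.
        return str.__mod__(self, kwargs if kwargs else args)


def brace_api(fmt, **fields):
    """Hypothetical library function that calls .format() internally."""
    return fmt.format(**fields)


adapted = brace_api(formatmod("%(name)s is %(state)s"), name="buildbot", state="down")
positional = formatmod("%s %s").format(1, 2)
```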

On Wed, 30 Sep 2009 03:04:05 pm James Y Knight wrote:
People will want this formatstr object to behave like strings, with concatenation, slicing, etc.:
Instead of having to support one type with %-formatting and {}-formatting (str), now the std lib will have two classes with %-formatting and {}-formatting. How is this an improvement? Moving along, let's suppose the newstyle_formatstr is introduced. What's the intention then? Do we go through the std lib and replace every call to (say) somestring % args with newstyle_formatstr(somestring) % args instead? That seems terribly pointless to me -- it does nothing about getting rid of % but adds a layer of indirection which slows down the code. Things are no better if the replacement code is: newstyle_formatstr(somestring).format(*args) (or similar). If we can do that, then why not just go all the way and use this as the replacement instead? somestring.format(*args)
Now we have three classes that support both % and {} formatting. Great. [...]
And how are people supposed to know what the API uses internally? Personally, I think your chances of getting people to write: logging.Formatter(newstyle_formatstr("%(asctime)s - %(name)s - %(level)s - %(msg)s")) instead of logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s") are slim to none -- especially when the second call still works. You'd be better off putting the call to newstyle_formatstr() inside logging.Formatter, and not even telling the users. Instead of wrapping strings in a class that makes .__mod__() and .format() behave the same, at some cost on every call presumably, my preferred approach would be a converter function (perhaps taken from 2to3?) which modified strings like "%(asctime)s" to "{asctime}". That cost gets paid *once*, rather than on every call. (Obviously the details will need to be ironed out, and it will depend on the external API. If the external API depends on the caller using % explicitly, then this approach may not work.)
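
A deliberately tiny sketch of such a converter. It is not the converter discussed elsewhere in this thread: it handles only plain "%(name)s" fields and "%%", and refuses anything else rather than guess. The function name percent_to_brace is made up.

```python
import re

# %(name)s fields, escaped percent signs, stray percent signs, literal braces.
_TOKEN = re.compile(r"%\((?P<name>[^)]+)\)s|%%|%|[{}]")


def percent_to_brace(fmt):
    """Convert a simple %(name)s template to {name} form, paid for once."""

    def repl(match):
        name = match.group("name")
        if name is not None:
            return "{" + name + "}"
        token = match.group(0)
        if token == "%%":
            return "%"        # an escaped percent becomes a literal one
        if token in ("{", "}"):
            return token * 2  # literal braces must be doubled for str.format
        raise ValueError("unsupported conversion at offset %d" % match.start())

    return _TOKEN.sub(repl, fmt)


old = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
new = percent_to_brace(old)
```

Flags, widths, precisions and non-string type codes are exactly where the hard cases discussed in this thread live, so this sketch rejects them outright.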
2) A long time later...deprecate str.__mod__;
How long? I hope that's not until I'm dead and buried. -- Steven D'Aprano

On Sep 30, 2009, at 10:34 AM, Steven D'Aprano wrote:
Indeed, that *would* be terribly pointless! Actually, more than pointless, it would be broken, as you've changed the API from taking oldstyle format strings to newstyle format strings. That is not the suggestion. The intention is to change /nearly nothing/ in the std lib, and yet allow users to use newstyle string substitution with every API. Many Python APIs (e.g. logging) currently take a %-type formatting string. It cannot simply be changed to take a {}-type format string, because of backwards compatibility concerns. Either a new API can be added to every one of those functions/classes, or, a single API can be added to inform those places to use newstyle format strings.
It's documented, (as it already must be, today!).
That's not my proposal. The user could write either: logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s") (as always -- that can't be changed without a long deprecation period), or: logging.Formatter(newstyle_formatstr("{asctime} - {name} - {level} - {msg}")) This despite the fact that logging has not been changed to use {}-style formatting internally. It should continue to call "self._fmt % record.__dict__" for backward compatibility. That's not to say that this proposal would allow no work to be done to check the stdlib for issues. The logging module presents one: it checks if the format string contains "%(asctime)" to see if it should bother to calculate the time. That of course would need to be changed. Best would be to stick an instance which lazily generates its string representation into the dict. The other APIs mentioned on this thread (BaseHTTPServer, email.generator) will work immediately without changes, however. James
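
The lazy-instance idea could be sketched like this. LazyValue is a hypothetical name, and the calls counter exists only so that the example can show when the value is actually computed.

```python
import time


class LazyValue:
    """Hypothetical wrapper: compute the value only if it is formatted."""

    def __init__(self, compute):
        self._compute = compute
        self.calls = 0  # how many times the value was actually computed

    def __str__(self):
        self.calls += 1
        return str(self._compute())


lazy_time = LazyValue(lambda: time.strftime("%Y-%m-%d %H:%M:%S"))
record = {"asctime": lazy_time, "name": "demo", "msg": "hello"}

# A template that never mentions asctime never triggers the computation.
without_time = "%(name)s - %(msg)s" % record
calls_before = lazy_time.calls

# Either style triggers it on demand, with no scan of the template text.
with_time = "{asctime} - {name}".format(**record)
calls_after = lazy_time.calls
```

This works for both styles because "%s" and an empty brace format spec each fall back to str() on the object; a non-empty format spec would need a __format__ method as well.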

James Y Knight wrote:
allow users to use newstyle string substitution with every API.
However it is done, I think someone (like new Python programmers) should be able to program in Python3, and use everything in the stdlib, without ever learning % formatting -- and that I should be able to forget about it ;-). +10 on the goal. Terry Jan Reedy

[Terry Reedy]
If that were possible, it would be nice. But as long as the language supports %-formatting, it is going to be around in one form or another. Any non-casual user will bump into %-formatting in books, in third-party modules, in ASPN recipes, on the newsgroup, and in our own source code. If they maintain any existing software, they will likely encounter it too. It doesn't seem to be a subject that can be ignored. Also, I think it premature to say that {}-formatting has been proven in battle. AFAICT, there has been very little uptake. I've personally made an effort to use {}-formatting more often but find that I frequently have to look up the syntax and need to experiment with the interactive interpreter to get it right. I haven't found it easy to teach or to get other people to convert. This is especially true if the person has encountered %-formatting in other languages (it is a popular approach). Raymond

James Y Knight <foom <at> fuhm.net> writes:
Why not allow logging.Formatter to take a callable, which would in turn call the callable with keyword arguments? Therefore, you could write: logging.Formatter("{asctime} - {name} - {level} - {msg}".format) and then: logging.critical(name="Python", msg="Buildbots are down") All this without having to learn about a separate "compatibility wrapper object". Regards Antoine.

On Sep 30, 2009, at 1:01 PM, Antoine Pitrou wrote:
This is a very interesting idea. Note that one of the reasons to /at least/ support {}-strings also is that %-strings are simply too error prone in many situations. For example, if I decide to support internationalization of log format strings, and all I can use is %-strings, it's almost guaranteed that I will have bugs because a translator forgot the trailing 's'. This is exactly the motivation that led to PEP 292 $-strings. In fact, while we're at it, it would be kind of cool if I could use $-strings in log templates. Antoine's idea of accepting a callable might fit that bill nicely. -Barry

Barry Warsaw <barry <at> python.org> writes:
You're already covered if you use the PercentMessage/BraceMessage approach I mentioned elsewhere in this thread. Suppose:
    # Just typing this in, it's not tested or anything
    class DollarMessage:
        def __init__(self, fmt, *args, **kwargs):
            self.fmt = fmt
            self.args = args
            self.kwargs = kwargs
        def __str__(self):
            return string.Template(self.fmt).substitute(*args, **kwargs)

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Whoops, sorry, pressed the "post" button by accident on my previous post. The above substitute call should of course say string.Template(self.fmt).substitute(*self.args, **self.kwargs) and you can alias DollarMessage (or whatever name you choose) as _ or __, say. As far as the Formatter formatting goes, it's easy enough to subclass Formatter to format using whatever approach you want. Regards, Vinay Sajip
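Putting the class and the correction together gives something like the following sketch. The only additions are the import and the sample template; the placeholder names `what` and `value` are made up for illustration.

```python
import string


class DollarMessage:
    """Message object that formats lazily with string.Template ($-strings)."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args      # at most one positional mapping, as substitute() expects
        self.kwargs = kwargs

    def __str__(self):
        return string.Template(self.fmt).substitute(*self.args, **self.kwargs)


__ = DollarMessage  # the alias suggested in the message above

from_keywords = str(__("The $what is $value", what="answer", value=42))
from_mapping = str(__("The $what is $value", {"what": "answer", "value": 42}))
```

Because string.Template.substitute accepts either a mapping or keyword arguments, both call styles produce the same text.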

Antoine Pitrou <solipsis <at> pitrou.net> writes:
This seems perhaps usable for a Formatter instantiation (infrequent) but a problem for the case where you want to convert format_str + args -> message (potentially frequent, and less readable). Another problem is that logging calls already use keyword arguments (extra, exc_info) and so backward compatibility might be compromised. It also feels like passing a callable could encourage patterns of usage which restrict our flexibility for future changes: we want for now to just allow choosing between % and {}, but a callable can do anything. That's more flexible, to be sure, but more specialized formatting requirements are already catered for using e.g. the PercentMessage/BraceMessage approach. Regards, Vinay Sajip

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Why is it a problem? I don't understand. It certainly is less pleasant to write "{foo}".format or "{0} {1}".format than it is to write "{0} {1}" alone, but it's still prettier and easier to remember than the special wrappers people are proposing here.
Then logging can just keep recognizing those special keyword arguments, and forward the others to the formatting function.
It also feels like passing a callable could encourage patterns of usage which restrict our flexibility for future changes:
Which future changes are you thinking about? AFAIK, there hasn't been a single change in logging output formatting in years. Rejecting a present change on the basis that it "restricts our flexibility for future changes" sounds like the worst kind of argument to me :-)
Except that having to wrap format strings with "PercentMessage" or "BraceMessage" is horrible. Python is not Java. Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
Well, it's less readable, as I said in parentheses. It would work, of course. And the special wrappers needn't be too intrusive:
    __ = BraceMessage
    logger.debug(__("Message with {0} {1}", 1, "argument"))
Then logging can just keep recognizing those special keyword arguments, and forward the others to the formatting function.
It just means that you can't pass those values through, and what if some of them are used somewhere in existing code?
It's the Rumsfeldian "We don't know what we don't know" ;-)
Now don't get upset and take it as a rejection, as we're still in the kicking-ideas-around stage ;-) I'm just saying how it feels to me. I agree that logging output formatting hasn't changed in years, and that's because there's been no particular need for it to change (some changes *were* made in the very early days to support a single dict argument). Now that time for change has perhaps come. I'm just trying to think ahead, and can't claim to have got a definitive answer up my sleeve. Passing a callable has upsides and downsides, and ISTM it's always worth focusing on the downsides to make sure they don't come back and bite you later. I don't foresee any specific problem - I'm just uneasy about it.
Except that having to wrap format strings with "PercentMessage" or "BraceMessage" is horrible. Python is not Java.
Amen. I'd say "Yeccchh!" too, if it literally had to be like that. And I also note that there are voices here saying that support for %-formatting shouldn't, or doesn't need to, change, at least until Python 4.0. So consider the following tentative suggestion, which is off the top of my head and offered as a discussion point: Suppose that if you want to use %-formatting, everything stays as is. No backward-compatibility headaches. To support {}-formatting, add an extra class which I've called BraceMessage. Consider this name a "working title", as no doubt a better name will suggest itself, but for now this name makes it clear what we're talking about. If any module wants to use {} formatting for their logging, they can add the line
    from logging import BraceMessage as __
I've used two underscores, since _ might be being used for gettext, but obviously the importer can use whatever name they want. Then they can use
    logger.debug(__("The {0} is {1}", "answer", 42))
which I think is more readable than putting in ".format" following the string literal. It's not a *huge* point, perhaps, but "Readability counts". This has the side benefit that if e.g. Barry wanted to use string.Template for formatting, he's just got to replace the above import with something like
    from logging import DollarMessage as __
Another "working title", please note. And while I've shown these classes being imported from logging, it doesn't make sense to put them there if this idea were to fly in a more general context. Then, perhaps string would be a better home for these classes. Regards, Vinay Sajip

Hello,
Ah, I hadn't thought about that. It looks a bit less awful indeed. I'm of the opinion, however, that %-formatting should remain the default and shouldn't need a wrapper. There's another possibility, which is to build the wrapping directly around the logger. That is, if I want a %-style logger, I do:
    logger = logging.getLogger("smtp")
    logger.debug("incoming email from %s", sender_address)
and if I want a {}-style logger, I do:
    logger = logging.getLogger("smtp", style="{}")
    logger.debug("incoming email from {addr}", addr=sender_address)
(of course, different users of the "smtp" logger can request different formatting styles when calling getLogger().) We could combine the various proposals to give users flexible APIs. Of course, it generally smells of "there's more than one way to do it".
It's the Rumsfeldian "We don't know what we don't know"
Is this guy in the Python community? :-)
I'm just trying to think ahead, and can't claim to have got a definitive answer up my sleeve.
Sure, we have some time until 2.7/3.2 anyway. Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
There's a LoggerAdapter class already in the system which is used to wrap loggers so that additional contextual information (e.g. network or database connection information) can be added to logs. The LoggerAdapter could fulfill this "wrapping" function.
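One way the LoggerAdapter route could look is sketched below. BraceStyleAdapter, _BraceMessage and ListHandler are hypothetical names invented for this illustration; only logging.LoggerAdapter, logging.Handler and their documented methods are real. The sketch handles positional arguments only, since keyword arguments such as addr= would collide with the ones logging already reserves (extra, exc_info).

```python
import logging


class _BraceMessage:
    """Defers str.format until the message text is actually needed."""

    def __init__(self, fmt, args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt.format(*self.args)


class BraceStyleAdapter(logging.LoggerAdapter):
    """Hypothetical adapter: wraps a logger so calls take {}-style templates."""

    def __init__(self, logger, extra=None):
        super().__init__(logger, extra or {})

    def log(self, level, msg, *args, **kwargs):
        if self.isEnabledFor(level):
            msg, kwargs = self.process(msg, kwargs)
            # The arguments travel inside the message object, not as log args.
            self.logger.log(level, _BraceMessage(msg, args), **kwargs)


class ListHandler(logging.Handler):
    """Collects formatted messages so the example has something to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


base = logging.getLogger("smtp.example")
base.setLevel(logging.DEBUG)
base.propagate = False
handler = ListHandler()
base.addHandler(handler)

logger = BraceStyleAdapter(base)
logger.debug("incoming email from {0}", "alice@example.org")
```

Other users of the same underlying logger are unaffected: they keep calling it directly with %-style templates.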
We could combine the various proposals to give users flexible APIs. Of course, it generally smells of "there's more than one way to do it".
Yeah, that bothers me too.
It's the Rumsfeldian "We don't know what we don't know"
Is this guy in the Python community?
Not sure, but I believe he's a piece of work and not a guy to get on the wrong side of ;-) Regards, Vinay Sajip

2009/10/1 Vinay Sajip <vinay_sajip@yahoo.co.uk>:
This seems to me to be almost the same as the previous suggestion of having a string subclass:
    class BraceFormatter(str):
        def __mod__(self, other):
            # Needs more magic here to cope with dict argument
            return self.format(*other)
    __ = BraceFormatter
    logger.debug(__("The {0} is {1}"), "answer", 42)
The only real differences are
1. The positioning of the closing parenthesis
2. The internal implementation of logger.debug needs to preserve string subclasses properly
But the benefit is that the approach allows anyone to use brace formatting in any API that currently accepts % format (assuming string subclasses don't get mangled). On the one hand, I'd prefer a more general solution. On the other, I'm nervous about that "assuming string subclasses..." proviso. I've no real answer, just offering the point up for consideration. Paul.

Paul Moore <p.f.moore <at> gmail.com> writes:
The other difference is that my suggestion supports Barry's desire to use string.Template with no muss, no fuss ;-) Plus, very little additional work is required compared to your items 1 and 2. ISTM BraceMessage would be something like this:
    class BraceMessage:
        def __init__(self, fmt, *args, **kwargs):
            self.fmt = fmt
            self.args = args
            self.kwargs = kwargs
        def __str__(self):
            return self.fmt.format(*self.args, **self.kwargs)
Regards, Vinay

On Thu, Oct 1, 2009 at 06:29, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
So I created this last night:
    import collections
    class braces_fmt(str):
        def __mod__(self, stuff):
            if isinstance(stuff, tuple):
                return self.__class__(self.format(*stuff))
            elif isinstance(stuff, collections.Mapping):
                return self.__class__(self.format(**stuff))
            else:
                return self.__class__(self.format(stuff))
The biggest issue is that ``"%s" % {'a': 42}`` substitutes the dict instead of throwing an error that str.format() would do with the code above. But what's nice about this is I think I can use this now w/ any library that expects % interpolation and it should basically work.
I don't think Paul's suggestion requires much more work to support string.Template, simply a subclass that implements __mod__
I guess my question is what's the point of the class if you are simply converting it before you pass it in to the logger? To be lazy about the formatting call? Otherwise you could simply call str.format() with your arguments before you pass the string into the logger and not have to wrap anything. -Brett

So I created this last night:
So there's no need to change modules like logging to explicitly provide support for {}-formatting? What's not to like? ;-) Something like this perhaps should have been added in at the same time as str.format went in.
I don't think Paul's suggestion requires much more work to support string.Template, simply a subclass that implements __mod__
True.
That's exactly the reason - to defer the formatting until it's needed. Otherwise you can always format the string yourself, as you say, and pass it as the single argument in the logging call - logging won't know or care if it was passed in as a literal, or was computed by %-, {}-, $- or any other formatting approach. Regards, Vinay Sajip
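The deferral can be observed directly. In this sketch the format_calls counter and the ListHandler class are additions made purely for the demonstration; BraceMessage itself follows the shape given earlier in the thread.

```python
import logging


class BraceMessage:
    format_calls = 0  # demonstration only: counts how often formatting ran

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        BraceMessage.format_calls += 1
        return self.fmt.format(*self.args, **self.kwargs)


class ListHandler(logging.Handler):
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("lazy.example")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Below the logger's threshold: the object is built but never formatted.
logger.debug(BraceMessage("The {0} is {1}", "answer", 42))
calls_after_debug = BraceMessage.format_calls

# At or above the threshold: formatting happens when the handler asks for text.
logger.warning(BraceMessage("The {0} is {1}", "answer", 42))
```

Calling "...".format(...) eagerly in the logging call would instead pay the formatting cost for the suppressed debug message as well.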

Vinay Sajip wrote:
I believe classes like fmt_braces/fmt_dollar/fmt_percent will be part of a solution, but they aren't a complete solution on their own. (Naming the three major string formatting techniques by the key symbols involved is a really good idea though) The two major problems with them:
1. It's easy to inadvertently convert them back to normal strings. If a formatting API even calls "str" on the format string then we end up with a problem (and switching to containment instead of inheritance doesn't really help, since all objects implement __str__).
2. They don't help with APIs that expect a percent-formatted string and do more with it than just pass it to str.__mod__ (e.g. inspecting it for particular values such as '%(asctime)s')
Still, it's worth considering adding the three fmt_* classes to the string module to see how far they can get us in adapting the formats for different APIs. Note that I don't think these concepts are fully baked yet, so we shouldn't do anything in a hurry - and anything that does happen should be via a PEP so we can flush out more issues. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Oct 1, 2009, at 5:54 PM, Nick Coghlan wrote:
Using containment instead of inheritance makes sure none of the *other* operations people do on strings will appear to work, at least (substring, contains, etc). I bet explicitly calling str() on a format string is even more rare than attempting to do those things.
True, but I don't think there's many such cases in the first place, and such places can be fixed to not do that as they're found. Until they are fixed, fmt_braces will loudly fail when used with that API (assuming fmt_braces is not a subclass of str). James

James Y Knight <foom <at> fuhm.net> writes:
Actually, logging calls str() on the object passed as the first argument in a logging call such as logger.debug(), which can either be a format string or an arbitrary object whose __str__() returns the format string. Regards, Vinay Sajip

Nick Coghlan <ncoghlan <at> gmail.com> writes:
Good point as far as the general case is concerned, though it's perhaps not that critical for logging. By which I mean, it's not unreasonable for Formatter.__init__ to grow a "style" keyword parameter which determines whether it uses %-, {}- or $-formatting. Then the formatter can look for '%(asctime)s', '{asctime}' or '$asctime' according to the style. Just to clarify - LogRecord.getMessage *will* call str() on a message object if it's not a string or Unicode object. For 2.x the logic is if type(msg) not in (unicode, str): msg = str(msg) and for 3.x the check is for isinstance(msg, str).
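The effect of that str() call on a string subclass can be reproduced with a small experiment. This is a sketch against current CPython 3 behaviour, where LogRecord.getMessage applies str() to the message before using %; braces_fmt here is a cut-down version of the class posted earlier, handling tuples only.

```python
import logging


class braces_fmt(str):
    def __mod__(self, args):
        # Route the % operator to str.format (tuple arguments only).
        return self.format(*args)


template = braces_fmt("The {0} is {1}")
direct = template % ("answer", 42)  # goes through braces_fmt.__mod__

record = logging.LogRecord(
    "example", logging.INFO, "example.py", 1, template, ("answer", 42), None
)

# str() on a str subclass that does not override __str__ yields a plain str,
# so the custom __mod__ is no longer in play inside getMessage().
stripped_type = type(str(record.msg))

try:
    record.getMessage()
    outcome = "formatted"
except TypeError:
    # Plain %-formatting finds no % directives to consume the arguments.
    outcome = "TypeError"
```

This is the "inadvertent conversion back to a normal string" problem in miniature: the subclass works with a bare % but not once an API normalises its input with str().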
Yes, we're just "kicking the tires" on the various ideas. There are things still a bit up in the air such as what happens when pickling and sending to an older version of Python, etc. which still need to be resolved for logging, at least. Regards, Vinay Sajip

Vinay Sajip wrote:
It's tangential, but in the str.format case you don't want to check for just '{asctime}', because you might want '{asctime:%Y-%m-%d}', for example. But there are ways to delay computing the time until you're sure it's actually being used in the format string, without parsing the format string. Now that I think of it, the same technique could be used with %-formatting:
    import datetime
    class DelayedStr:
        def __init__(self, fn):
            self.fn = fn
            self.obj = None
        def __str__(self):
            if self.obj is None:
                self.obj = self.fn()
            return self.obj.__str__()
    def current_time():
        print "calculating time"
        return datetime.datetime.now()
    # will not compute current time
    print '%(msg)s' % {'asctime':DelayedStr(current_time), 'msg':'test'}
    # will compute current time: same dict used as before
    print '%(asctime)s %(msg)s' % {'asctime':DelayedStr(current_time), 'msg':'test'}
Eric.
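For the str.format side of the same trick, a Python 3 sketch might add __format__ so that a spec such as '{asctime:%Y-%m-%d}' is passed through to the real value. DelayedValue, its calls counter and fixed_time are names invented for this illustration; the fixed timestamp only keeps the result deterministic.

```python
import datetime


class DelayedValue:
    """Computes fn() on first use, whether via str(), % or str.format."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0       # how many times fn was actually invoked
        self._value = None

    def _get(self):
        if self.calls == 0:
            self._value = self.fn()
            self.calls += 1
        return self._value

    def __str__(self):
        return str(self._get())

    def __format__(self, spec):
        # Forward the format spec (e.g. "%Y-%m-%d") to the underlying object.
        return format(self._get(), spec)


def fixed_time():
    return datetime.datetime(2009, 10, 1, 12, 0, 0)


unused = DelayedValue(fixed_time)
no_time = "{msg}".format(asctime=unused, msg="test")

used = DelayedValue(fixed_time)
with_date = "{asctime:%Y-%m-%d} {msg}".format(asctime=used, msg="test")
percent = "%(asctime)s %(msg)s" % {"asctime": used, "msg": "test"}
```

The template is never parsed: whether the time gets computed depends only on whether the formatting machinery asks the object for its text.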

2009/10/1 Eric Smith <eric@trueblade.com>:
Still tangential, but it seems to me that this discussion has exposed a couple of areas where the logging interface is less than ideal: - The introspection of the format string to delay computing certain items (Eric's suggestion may be an improvement here). - The "call str() on any non-string object to get a format string" API (which precludes string subclasses). I suspect other APIs will exist with similar issues once the whole question of supporting multiple format syntaxes gets wider publicity... Paul.

Paul Moore wrote:
Calling str on non-string objects to get a format string does not (prima-facie) preclude string subclasses:
Michael
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog

Paul Moore <p.f.moore <at> gmail.com> writes:
Yes, but that's an implementation detail and not part of the logging interface. It can be changed without any particular additional impact on user code - when I say "additional" I mean apart from the need to change the format strings to {} format, which they would have to do anyway at some point.
- The "call str() on any non-string object to get a format string" API (which precludes string subclasses).
It doesn't preclude string subclasses: it just calls str() on an arbitrary message object to get the string representation for that object. The return value is used to interpolate into the formatted output, and that's all. So I don't understand what's being precluded and how - please elaborate. Thanks & regards, Vinay Sajip

On Thu, Oct 1, 2009 at 14:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I agree. I view them more as a band-aid over APIs that only accept % formatting but the user of the library wants to use {} formatting.
Well, you can override the methods on str to always return the proper thing, e.g. ``def __str__(self): return self``. Do the same for __add__() and all other methods on strings that return a string themselves. It should be possible to prevent Python code from stripping off the class.
Nope, they don't and people would need to be warned against this.
Having a PEP that lays out how we think people should consider transitioning their code would be good. -Brett

On Thu, Oct 1, 2009 at 11:03 AM, Brett Cannon <brett@python.org> wrote:
I see how this could allow a user to supply a {}-format string to an API that accepts only %-format strings. But I still don't see the transition strategy for the API itself. That is, how does the %-format API use this to eventually switch to {}-format strings? Could someone please lay it out for me, step by step, showing what happens in each version? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

On Oct 1, 2009, at 6:19 PM, Steven Bethard wrote:
Here's what I said in my first message, suggesting this change. Copy&pasted below: I wrote:
So do (1) in 3.2. Then do (2) in 3.4, and (3) in 3.6. I skipped two versions each time because of how widely this API is used, and the likely pain that doing the transition quickly would cause. But I guess you *could* do it in one version each step. James

On Thu, Oct 1, 2009 at 15:19, Steven Bethard <steven.bethard@gmail.com> wrote:
First off, a wrapper like this I think is a temporary solution for libraries that do not have any transition strategy, not a replacement for one that is thought out (e.g. using a flag when appropriate). With that said, you could transition by:
1. Nothing changes as hopefully the wrapper works fine (as people are pointing out, though, my approach needs to override __str__() to return 'self', else the str type will just return what it has internally in its buffer).
2. Raise a deprecation warning when ``isinstance(ob, brace_fmt)`` is false. When a class is passed in that is a subclass of brace_fmt, call ob.format() on it.
3. Require the subclass.
4. Remove the requirement and always call ob.format().
-Brett
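Step 2 of the list above could be sketched as follows for some API that receives a template. The function name render and the warning text are assumptions made for illustration; brace_fmt is reduced here to a bare marker subclass.

```python
import warnings


class brace_fmt(str):
    """Marker type: the template uses {}-style placeholders."""


def render(template, values):
    """Hypothetical API at step 2: accepts both styles, warns on the old one."""
    if isinstance(template, brace_fmt):
        return template.format(**values)
    warnings.warn(
        "%-style templates are deprecated; wrap the template in brace_fmt",
        DeprecationWarning,
        stacklevel=2,
    )
    return template % values


new_style = render(brace_fmt("{prog} [--foo]"), {"prog": "demo"})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = render("%(prog)s [--foo]", {"prog": "demo"})
```

Step 3 would turn the warning into an error, and step 4 would drop the isinstance check and call format() on any string.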

On Thu, Oct 1, 2009 at 4:35 PM, Brett Cannon <brett@python.org> wrote:
Thanks Brett, that's clear. So you save one version over the proposal of adding a format= flag to the API. On Thu, Oct 1, 2009 at 4:13 PM, James Y Knight <foom@fuhm.net> wrote:
I didn't understand how you wanted to apply your suggestion to an API (instead of str.__mod__) the first time and I still don't understand it. Is what Brett has proposed the same thing? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

Vinay Sajip wrote:
It's also difficult for the subclass to prevent this without creating an infinite loop... (I only spent about 10 minutes looking into it the other day, but that's what happened in all of my naive attempts at doing it in pure Python code). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Oct 1, 2009, at 9:11 AM, Paul Moore wrote:
I'd rather make that:
    class BraceFormatter:
        def __init__(self, s):
            self.s = s
        def __mod__(self, other):
            # Needs more magic here to cope with dict argument
            return self.s.format(*other)
    __ = BraceFormatter
That is, *not* a string subclass. Then if someone attempts to mangle it, or use it for anything but %, it fails loudly. James

Raymond Hettinger <python <at> rcn.com> writes:
It looks like the BraceMessage would have to re-instantiate on every invocation.
True, because the arguments to the instantiation are kept around as a BraceMessage instance until the time comes to actually format the message (which might be never). Typically, in performance-sensitive code, the isEnabledFor pattern is used to avoid doing unnecessary work, as in
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(__("The {0} is {1}", "answer", 42))
so the BraceMessage invocation overhead is only incurred when needed, as is the cost of computing the additional arguments. As I understand it {}-formatting is slower than %-formatting anyway, and if this pattern is used only for {}-formatting, then there will be no additional overhead for %-formatting and some additional overhead for {}-formatting. I'm not sure what that instantiation cost will be relative to the overall time for an "average" call - whatever that is ;-) - though. Other approaches to avoid instantiation could be considered: for example, making __ a callable which remembers previous calls and caches instances keyed by the call arguments. But this will incur memory overhead and some processing overhead and I'm not sure if it really buys you enough to warrant doing it. Regards, Vinay Sajip

On Sep 30, 2009, at 1:01 PM, Antoine Pitrou wrote:
It's a nice idea -- but I think it's better for the wrapper (whatever form it takes) to support __mod__ so that logging.Formatter (and everything else) doesn't need to be modified to be able to know about how to use both callables and "%"ables. Is it possible for a C function like str.format to have other methods defined on its function type? James

Martin v. Löwis wrote:
It's a maintenance burden. There are several outstanding bugs with it, admittedly not of any great significance. I've been putting time into fixing at least one of them. When Mark and I did short-float-repr, at least half of my time was consumed with %-formatting, mostly because of how it does memory management. On the plus side, %-formatting is (and always will be) faster than str.format(). Its very limitations make it possible for it to be fast. I'd note that PEP 3101 calls str.format() a replacement for %-formatting, not an alternate mechanism to achieve the same end.
Having a converter and guessing are 2 distinct issues. I believe a converter from %-formatting specification strings to str.format() strings is possible. I point this out not because I think a converter solves this problem, but because in the past there's been a debate on whether a converter is even possible. Eric.
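To give a feel for what such a converter involves, here is a deliberately narrow sketch. It is not the converter discussed in the thread: percent_to_braces is a hypothetical helper that understands only %s, %r, %d (with or without a (name) key) and %%, and raises ValueError on everything else (flags, widths, precisions, other conversion types). Even within that subset the semantics differ slightly, e.g. %d accepts a float while {:d} does not.

```python
import re

# Only %s, %r, %d (optionally with a (name) key) and %% are recognised.
_SPEC = re.compile(r"%(?:\((?P<name>[A-Za-z_]\w*)\))?(?P<conv>[srd%])")
_SUFFIX = {"s": "!s", "r": "!r", "d": ":d"}


def _literal(text):
    """Escape literal text for str.format; reject leftover % directives."""
    if "%" in text:
        raise ValueError("unsupported % directive in {!r}".format(text))
    return text.replace("{", "{{").replace("}", "}}")


def percent_to_braces(template):
    parts = []
    last = 0
    auto_index = 0  # numbering for positional directives such as a bare %s
    for match in _SPEC.finditer(template):
        parts.append(_literal(template[last:match.start()]))
        name, conv = match.group("name"), match.group("conv")
        if conv == "%":
            if name is not None:
                raise ValueError("a key before %% is not supported")
            parts.append("%")  # %% collapses to a literal percent sign
        else:
            if name is None:
                name = str(auto_index)
                auto_index += 1
            parts.append("{" + name + _SUFFIX[conv] + "}")
        last = match.end()
    parts.append(_literal(template[last:]))
    return "".join(parts)


converted = percent_to_braces(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
```

The explicit !s and !r conversions are there because a bare {name} calls format() on the value, which is not always the same as str().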

Well - that's the cost of keeping it in the language. It's not a problem with using it while it *is* in the language. So if a decision was made to eventually remove % formatting, it would be reasonable to start migrating code to PEP 3101. However, no such decision has been made (and hopefully won't be throughout 3.x), so as the mechanism *is* available, there is no need to start changing existing code (except for the actual issue Steven discusses, namely libraries that expect strings in % template form).
I'd note that PEP 3101 calls str.format() a replacement for %-formatting, not an alternate mechanism to achieve the same end.
I think this is a mis-wording; the intent of the PEP apparently is to propose this mechanism as an option, not as an actual replacement. This becomes clear when reading the "Backwards Compatibility" section:
# Backwards compatibility can be maintained by leaving the existing
# mechanisms in place. The new system does not collide with any of
# the method names of the existing string formatting techniques, so
# both systems can co-exist until it comes time to deprecate the
# older system.
Regards, Martin

On Wed, Sep 30, 2009 at 9:48 AM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
The problem is, PEP 3101 and our interpretation of it evolved. The original proposal for {}-formatting was certainly put forward with the aim to completely *replace* %-formatting, and care was taken in the design to cover all use cases, avoid known problems, etc. Then we started looking seriously at conversion from Python 2 to Python 3 and we discovered that converting %-formatting to {}-formatting was a huge can of worms, and decided it wasn't worth trying to do *at the time* given the Python 3 schedule. We considered some kind of gentle deprecation warning, but decided that even that would be too noisy. So now we have two competing mechanisms. In the long run, say Python 4, I think we don't need both, and we should get rid of one. My preference is still getting rid of %-formatting, due to the problems with it that prompted the design of {}-formatting (no need to reiterate the list here). So how do we get there? My proposal would be to let this be a gradual take-over of a new, superior species in the same niche as an older species. (Say, modern man over Neanderthal man.) Thus, as new code is written (especially example code, which will be copied widely), we should start using {}-formatting, and when new APIs are designed that tie in to some kind of formatting, they should use {}-formatting. Adding support for {}-formatting, in addition to %-formatting, to existing APIs like the logging package also strikes me as a good idea, as long as backwards compatibility can be preserved. (I have no strong ideas on how to do this right now.) If we do this right, by the time Python 4 comes around, {}-formatting will have won the race, and there won't be a question about removing %-formatting at the time. I wouldn't be surprised if by then static analysis techniques will have improved so that we *can* consider automatic conversion. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Eric Smith wrote:
I agree with Martin. Both approaches have their ups and downs, but forcing users to move from %-formatting to .format()-formatting will just frustrate them: having to convert several thousand such (working) uses in their code with absolutely no benefit simply doesn't look like a good way to spend your time. In addition to the code changes, such a move would also render existing translations of the %-formatted string templates useless.
Why not allow both and use .format() for those cases where %-formatting doesn't work too well ?
I think that's a wording we should change. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 01 2009)

On 1 Oct 2009, at 10:37, M.-A. Lemburg wrote:
I agree you cannot force the move to {} format. There are programs that expose the %(name)s in user interfaces for customisation.
In addition to the code changes, such a move would also render existing translations of the %-formatted string templates useless.
Speaking of translation support, has xgettext been updated to support {}? It is a life saver to have xgettext report that "This %s and %s" is not translatable. Barry

On Sep 29, 2009, at 11:15 PM, Martin v. Löwis wrote:
Although I hate the name 'dicttemplate', this seems like the best solution to me. Maybe it's good that 'dicttemplate' is so ugly though so that people will naturally prefer 'format' :). But I like this because there's not really any magic, it's explicit, and the decision is made by the coder at the call site. The implementation does not need to guess at all. If this is adopted, it should become a common idiom across Python so that once you've learned how to transition between the format strings, you pretty much know how to do it for any supporting API. So we should adopt it across all of the standard library. -Barry

On Wed, Sep 30, 2009 at 5:21 AM, Barry Warsaw <barry@python.org> wrote:
Could you comment on what you think we should do when the parameter is not positional? As I mentioned upthread, in the case of logging.Formatter, it's already documented as taking the keyword parameter "fmt", so we'd have to use the name "fmt" for % formatting. Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

On Sep 30, 2009, at 11:22 AM, Steven Bethard wrote:
I'm okay with fmt==%-formatting and format=={}-formatting, but I'd also be okay with transitioning 'fmt' to 'dicttemplate' or whatever. I think the important thing is to be explicit in the method signature which one you want (secondary would be trying to standardize this across the stdlib). -Barry

On Wed, Sep 30, 2009 at 8:31 AM, Barry Warsaw <barry@python.org> wrote:
Thanks for the clarification. I generally like this approach, though it's not so convenient for argparse which already takes format strings like this::
    parser = ArgumentParser(usage='%(prog)s [--foo]')
    parser.add_argument(
        '--foo', type=int, default=42,
        help='A foo of type %(type)s, defaulting to %(42)s')
That is, existing keyword arguments that already have good names (and are pretty much always used as keyword arguments) take format strings. I'm not sure that changing the name of usage= or help= here is really an option. I guess in this case I'm stuck with something like Benjamin's suggestion of adding an additional flag to control which type of formatting, and the corresponding 4 versions of cleanup. Ew. Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

On Sep 30, 2009, at 11:39 AM, Steven Bethard wrote:
Ah right.
I missed Benjamin's suggestion, but in this case I would say add a flag to ArgumentParser. I'm going to want {} formatting either everywhere or not at all. E.g.
    import argparse
    parser = argparse.ArgumentParser(usage='{prog} [--foo]', format=argparse.BRACES)
    parser.add_argument(
        '--foo', type=int, default=42,
        help='A foo of type {type}, defaulting to {42}')
(although that last looks weird ;). -Barry

On Wed, Sep 30, 2009 at 8:50 AM, Barry Warsaw <barry@python.org> wrote:
Yep, sorry, typo, that should have been %(default)s, not %(42)s.
Yeah, that's basically Benjamin's suggestion, with the transition path being:
(1) Introduce format= keyword argument, defaulting to PERCENTS
(2) Deprecate format=PERCENTS
(3) Error on format=PERCENTS (Benjamin suggested just changing the default here, but that would give a release where errors would pass silently)
(4) Deprecate format= keyword argument.
(5) Remove format= keyword argument.
It's a little sad that it takes 5 versions to do this, but I guess if a user is on top of things, at version (1) they add format=BRACES to all their code, and then remove those at version (4). So even though there are 5 versions, there are only two code changes required. At least in the case of argparse, this can be a constructor argument as you suggest, and we only have to introduce this flag in one place. Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus
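Stage (1) of that path might look like the sketch below for a generic class that stores a template. UsageText is a made-up stand-in, not argparse; PERCENTS and BRACES are module-level constants in the spirit of the argparse.BRACES spelling used earlier, and their values are arbitrary.

```python
PERCENTS = "percents"
BRACES = "braces"


class UsageText:
    """Hypothetical API at stage (1): format= exists and defaults to PERCENTS."""

    def __init__(self, template, format=PERCENTS):
        if format not in (PERCENTS, BRACES):
            raise ValueError("format must be PERCENTS or BRACES")
        self.template = template
        self.format = format

    def render(self, **values):
        if self.format == BRACES:
            return self.template.format(**values)
        return self.template % values


# Existing callers keep working untouched...
old = UsageText("%(prog)s [--foo]").render(prog="demo")
# ...while callers who opt in early add format=BRACES once.
new = UsageText("{prog} [--foo]", format=BRACES).render(prog="demo")
```

Stages (2) and (3) would add a DeprecationWarning and then an error on the PERCENTS branch; stages (4) and (5) would retire the keyword itself once only BRACES remains.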

Unless there is a firm decision to switch to kill %-formatting across the board, I don't think anything should be done at all. Creating Py3.x was all about removing cruft and clutter. I don't think it would be improved by adding two ways to do it for everything in the standard library. That is a lot of additional code, API expansion, and new testing, fatter docs, and extra maintenance, but giving us no new functionality. Anytime we start hearing about newstyle/oldstyle combinations, I think a flag should go up. Anytime there is a proposal to make sweeping additions that do not add new capabilities, a flag should go up. I understand the desire to have all formatting support both ways, but I don't think it is worth the costs. People *never* need both ways though they may have differing preferences about which *one* to use. my-two-cents, Raymond

Unfortunately, as Steven pointed out, the parameter is *already* documented with the name "fmt". So one option would be to call it "fmt" and "format"; the other option would be to not only deprecate the positional passing, but also the passing under the name fmt=. As for calling it "dicttemplate" - I'm sure people can and will propose alternative spellings :-) Regards, Martin

Steven Bethard <steven.bethard <at> gmail.com> writes:
In logging at least, there are two different places where the formatting issue crops up. The first is creating the "message" part of the logging event, which is made up of a format string and arguments. The second is the one Steven's mentioned: formatting the message along with other event data such as time of occurrence, level, logger name etc. into the final text which is output. Support for both % and {} forms in logging would need to be considered in these two places. I sort of liked Martin's proposal about using different keyword arguments, but apart from the ugliness of "dicttemplate" and the fact that "fmt" is already used in Formatter.__init__ as a keyword argument, it's possible that two different keyword arguments "fmt" and "format" both referring to format strings might be confusing to some users. Benjamin's suggestion of providing a flag to Formatter seems slightly better, as it doesn't change what existing positional or keyword parameters do, and just adds an additional, optional parameter which can start off with a default of False and transition to a default of True. However, AFAICT these approaches only cover the second area where formatting options are chosen - not the creation of the message from the parameters passed to the logging call itself. Of course one can pass arbitrary objects as messages which contain their own formatting logic. This has been possible since the very first release but I'm not sure that it's widely used, as it's usually easier to pass strings. So instead of passing a string and arguments such as logger.debug("The %s is %d", "answer", 42) one can currently pass, for a fictitious class PercentMessage, logger.debug(PercentMessage("The %s is %d", "answer", 42)) and when the time comes to obtain the formatted message, LogRecord.getMessage calls str() on the PercentMessage instance, whose __str__ will use %-formatting to get the actual message.
Of course, one can also do for example logger.debug(BraceMessage("The {} is {}", "answer", 42)) where the __str__() method on the BraceMessage will do {} formatting. Of course, I'm not suggesting we actually use the names PercentMessage and BraceMessage, I've just used them there for clarity. Also, although Raymond has pointed out that it seems likely that no one ever needs *both* types of format string, what about the case where application A depends on libraries B and C, and they don't all share the same preferences regarding which format style to use? ISTM no-one's brought this up yet, but it seems to me like a real issue. It would certainly appear to preclude any approach that configured a logging-wide or logger-wide flag to determine how to interpret the format string. Another potential issue is where logging events are pickled and sent over sockets to be finally formatted and output on different machines. What if a sending machine has a recent version of Python, which supports {} formatting, but a receiving machine doesn't? It seems that at the very least, it would require a change to SocketHandler and DatagramHandler to format the "message" part into the LogRecord before pickling and sending. While making this change is simple, it represents a potential backwards-incompatible problem for users who have defined their own handlers for doing something similar. Apart from thinking through the above issues, the actual formatting only happens in two locations - LogRecord.getMessage and Formatter.format - so making the code do either %- or {} formatting would be simple, as long as it knows which of % and {} to pick. Does it seem too onerous to expect people to pass an additional "use_format" keyword argument with every logging call to indicate how to interpret the message format string? Or does the PercentMessage/BraceMessage type approach have any mileage? What do y'all think? Regards, Vinay Sajip
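The wrapper idea described above can be sketched as follows. The class names are the illustrative ones from the message (the author says they are not a naming proposal), and the bodies are an assumption about how such classes might be written:

```python
import logging


class PercentMessage:
    """Defers %-formatting until the message is actually needed."""

    def __init__(self, fmt, *args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt % self.args


class BraceMessage:
    """Defers {}-formatting until the message is actually needed."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)


# LogRecord.getMessage() calls str() on the message object, so a record
# built with no separate args formats itself through __str__.
record = logging.LogRecord(
    "demo", logging.DEBUG, __file__, 1,
    BraceMessage("The {} is {}", "answer", 42), (), None,
)
formatted = record.getMessage()
```

Because each message object carries its own style, libraries B and C in the scenario above could each pick a different wrapper without any logging-wide flag.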

On Wed, Sep 30, 2009 at 16:03, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
I personally prefer the keyword argument approach to act as a flag, but that's me. As for the PercentMessage/BraceMessage, I would make sure that you just simply take the string format and simply apply the arguments later to cut down on the amount of parentheses butting up against each other: ``logger.debug(BraceMessage("The {} is {}"), "answer", 42)``. It's still an acceptable solution that provides a clear transition: simply provide the two classes, deprecate PercentMessage or bare string usage, require BraceMessage, remove requirement. This wrapper approach also provides a way for libraries that have not shifted over to still work with PEP 3101 strings by letting the user wrap the string to be interpolated themselves and then to pass it in to the libraries. It's just unfortunate that any transition would have this cost of wrapping all strings for a while. I suspect most people will simply import the wrapping class and give it some short name like people do with gettext. -Brett

Brett Cannon <brett <at> python.org> writes:
The problem with that is that BraceMessage.__str__() wouldn't know what arguments to use to produce the message.
Yes, logger.debug(__("The {} is {}", "answer", 42)) isn't ideal but perhaps liveable with. And hopefully with a decent editor, the paren-butting annoyance will be minimized. Regards, Vinay Sajip

Antoine Pitrou wrote:
As someone who likes .format() and who already uses such bound methods to print, such as in emsg = "...".format ... if c: print(emsg(arg, barg)) I find this **MUCH** preferable to the ugly and seemingly unnecessary wrapper class idea being bandied about. This would be scarcely worse than passing the string itself. Terry Jan Reedy
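A small sketch of an API that accepts such a bound method. The function `report` is hypothetical and only illustrates the "accept a callable" idea; it is not taken from any stdlib module:

```python
def report(template, *args):
    """Hypothetical API: accept a %-style string or any callable formatter."""
    if callable(template):
        # e.g. a bound str.format method, as in the message above
        return template(*args)
    return template % args


emsg = "The {} is {}".format           # bound method, no wrapper class
new_style = report(emsg, "answer", 42)
old_style = report("The %s is %d", "answer", 42)
```

The caller decides the style simply by choosing what to pass, and plain strings keep their existing %-meaning.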

On Thu, Oct 1, 2009 at 10:49 PM, Terry Reedy <tjreedy@udel.edu> wrote:
But it's not much of a transition plan. Or are you suggesting: (1) Make API accept callables (2) Issue warnings for regular strings (3) Throw exceptions for regular strings (4) Allow regular strings again, but assume {}-style formatting ? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

Steven Bethard <steven.bethard <at> gmail.com> writes:
But it's not much of a transition plan. Or are you suggesting:
The question is why we want a transition plan that will bother everyone with no tangible benefits for the user. Regards Antoine.

Has anyone considered the idea of having the string % operator behave intelligently according to the contents of the format string? If it contains one or more valid %-formats, use old-style formatting; if it contains one or more valid {}-formats, use new-style formatting. Ambiguous cases could arise, of course, but hopefully they will be fairly rare, and raising an exception would point out the problem and allow it to be fixed. -- Greg
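One way such guessing might be sketched. The function name `guess_style` and the regular expression are assumptions made for illustration; the pattern covers only common %-specifiers:

```python
import re
from string import Formatter

# Rough pattern for a %-specifier: optional (key), flags, width, precision, type.
_PERCENT_SPEC = re.compile(
    r"%(?:\([^)]*\))?[#0\- +]*(?:\*|\d+)?(?:\.(?:\*|\d+))?[diouxXeEfFgGcrsa]"
)


def guess_style(fmt):
    """Return '%', '{' or None; raise ValueError if both styles appear."""
    # Drop escaped percent signs before looking for real specifiers.
    has_percent = bool(_PERCENT_SPEC.search(fmt.replace("%%", "")))
    try:
        has_brace = any(
            field is not None for _, field, _, _ in Formatter().parse(fmt)
        )
    except ValueError:
        # Unbalanced braces: not a valid {}-format string.
        has_brace = False
    if has_percent and has_brace:
        raise ValueError("ambiguous format string: %r" % (fmt,))
    if has_percent:
        return "%"
    if has_brace:
        return "{"
    return None
```

Strings that legitimately contain both kinds of markers, such as a {}-template whose literal text includes "%s", land in the ambiguous branch, which is the case the message expects to be rare.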

On Fri, Oct 2, 2009 at 6:29 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Hm... The % operator already does too much guessing: if the string contains exactly one %-format, the argument may be either a size-1 tuple or a non-tuple, otherwise it has to be a size-N tuple, except if the %-formats use the %(name)X form, then the argument must always be a dict. It doesn't sound to me as if adding more guesswork is going to improve its reliability. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
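The argument-shape guessing described above can be seen in a few lines. The helper `try_percent` is a hypothetical convenience for this illustration:

```python
def try_percent(fmt, arg):
    """Return the formatted string, or the exception type name if % fails."""
    try:
        return fmt % arg
    except (TypeError, KeyError, ValueError) as exc:
        return type(exc).__name__


one = "value: %s"
results = {
    "scalar": try_percent(one, 42),            # non-tuple is accepted
    "one_tuple": try_percent(one, (42,)),      # size-1 tuple is unpacked
    "two_tuple": try_percent(one, (1, 2)),     # size-2 tuple does not fit
    "named": try_percent("%(a)s", {"a": 1}),   # mapping for %(name)X forms
    # str.format takes the tuple as a single argument instead.
    "brace_tuple": "value: {}".format((1, 2)),
}
```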

On Fri, Oct 2, 2009 at 2:34 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I think Guido expressed my feelings pretty well: On Wed, Sep 30, 2009 at 10:37 AM, Guido van Rossum <guido@python.org> wrote:
I agree with this 100% but I can't see it working unless we have some sort of transition plan. Just saying "ok, switch your format strings from % to {}" didn't work in Python 3.0 for various good reasons, and I can't imagine it will work in Python 4.0 unless we have a transition plan. Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

[Steven Bethard]
Do the users get any say in this? I imagine that some people are heavily invested in %-formatting. Because there has been limited uptake on {}-formatting (afaict), we still have limited experience with knowing that it is actually better, less error-prone, easier to learn/remember, etc. Outside a handful of people on this list, I have yet to see anyone adopt it as the preferred syntax. Raymond

Raymond Hettinger <python <at> rcn.com> writes:
It is known to be quite a bit slower. The following timings are on the py3k branch:
- with positional arguments:
$ ./python -m timeit -s "s='%s %s'; t = ('hello', 'world')" "s % t"
1000000 loops, best of 3: 0.313 usec per loop
$ ./python -m timeit -s "f='{} {}'.format; t = ('hello', 'world')" "f(*t)"
1000000 loops, best of 3: 0.572 usec per loop
- with named arguments:
$ ./python -m timeit -s "s='%(a)s %(b)s'; d = dict(a='hello', b='world')" "s % d"
1000000 loops, best of 3: 0.387 usec per loop
$ ./python -m timeit -s "f='{a} {b}'.format; d = dict(a='hello', b='world')" "f(**d)"
1000000 loops, best of 3: 0.581 usec per loop
Regards Antoine.
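The same comparison can be rerun from a script with the `timeit` module. This is only a sketch: the function name `compare` is an assumption, the lambdas add call overhead of their own, and the numbers will differ by interpreter version and machine, so no results are claimed here:

```python
import timeit


def compare(number=100000):
    """Time %-formatting against str.format for positional arguments."""
    t = ("hello", "world")
    s = "%s %s"
    f = "{} {}".format
    return {
        "percent": timeit.timeit(lambda: s % t, number=number),
        "braces": timeit.timeit(lambda: f(*t), number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%-8s %.4f s" % (name, seconds))
```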

"Raymond Hettinger" <python@rcn.com> writes:
I'm a user! :-) I hate calling methods on string literals, I think it looks very odd to have code like this: "Displaying {0} of {1} revisions".format(x, y) Will we be able to write this as "Displaying {0} of {1} revisions" % (x, y) too?
I've skimmed over the PEP, and the new {}-syntax seems to have some nice features. But I've not seen it used anywhere yet. -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

<delurk> Rami Chowdhury posted this to a mailing list; I've been using it (perhaps unintentionally promoting it) as part of non-English, non-ASCII font outreach:
As a user, my assumption was {} was going forward, rain or shine, and everyone should be on board by Python 3.2. (I thought once the Talin PEP got approved, that was it). I wrote Steven Bethard privately about this. Sorry for the intrusion. </delurk>

What about using string prefix 'f'?
f"{foo} and {bar}" % something == "{foo} and {bar}".format(something)
s = f"{foo}"
t = "%(bar)s"
s + t # raises Exception
Transition plan:
n: Just add F prefix. And adding "format_string" in future.
n+1: deprecate __mod__() without 'F'.
n+2: libraries use .format() and deprecate __mod__() with 'F'
n+3: remove __mod__()
-- Naoki INADA <songofacandy@gmail.com>

Carl Trachte wrote:
I've skimmed over the PEP, and the new {}-syntax seems to have some nice features. But I've not seen it used anywhere yet.
I am using it with 3.1 in an unreleased book I am still writing, and will in any code I publish.
Autonumbering, added in 3.1, makes '{}' as easy to write for simple cases as '%s'. That was one complaint about the original 3.0 version. Another was and still is the lack of conversion, which is being worked on. (I thought once the Talin
tjr

On Fri, Oct 2, 2009 at 12:43 PM, Martin Geisler <mg@lazybytes.net> wrote:
I doubt it. One of the major complaints about the %-style formatting was that the use of % produced (somewhat) unexpected errors because of how operator precedence works::
Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus
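As an illustration of the precedence issue (this is not the example from the original message): `%` has the same precedence as `*` and binds tighter than `+`, so the format is applied before the arithmetic.

```python
count = 2

# Parsed as ("%d" % count) * 3, i.e. string repetition.
repeated = "%d" % count * 3
# Parentheses are needed to format the product instead.
grouped = "%d" % (count * 3)

# Parsed as ("total: %d" % count) + 1, which adds an int to a str.
try:
    "total: %d" % count + 1
except TypeError:
    addition_failed = True
else:
    addition_failed = False

# With a method call the arguments are delimited by the parentheses.
with_method = "total: {}".format(count + 1)
```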

Steven Bethard wrote:
The other major problem with the use of the mod operator is the bugs encountered with "fmt % obj" when obj happened to be a tuple or a dict. So no, the switch to a method rather than an operator was deliberate. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Fri, Oct 2, 2009 at 11:56 AM, Raymond Hettinger <python@rcn.com> wrote:
Sure, I guess this is a possibility too, and it could make the transition process I have to work through for argparse much easier. ;-) To be clear, are you suggesting that APIs that currently support only %-formatting shouldn't bother supporting {}-formatting at all? Or are you suggesting that they should support both, but support for %-formatting should never go away? Steve -- Where did you get that preposterous hypothesis? Did Steve tell you that? --- The Hiphopopotamus

Raymond Hettinger wrote:
A self-fulfilling prophecy if ever I heard one... uptake is limited because there's a large legacy code base that doesn't use it and many APIs don't support it, so we shouldn't bother trying to increase the number of APIs that *do* support it? I'm starting to think that a converter between the two format mini-languages may be the way to go though. fmt_braces is meant to provide a superset of the capabilities of fmt_percent, so a forward converter shouldn't be too hard. A reverse converter may have to punt with ValueError when it finds things that cannot be expressed in the fmt_percent mini language though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan <ncoghlan <at> gmail.com> writes:
I've done a first cut of a forward (% -> {}) converter: http://gist.github.com/200936 but I'm not sure there's a case for a converter in the reverse direction, if we're encouraging movement in one particular direction. Regards, Vinay Sajip
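For readers without access to the gist, here is an independent and deliberately limited sketch of a forward converter. It is not the code from the gist; the function name `percent_to_brace` is an assumption, `*` widths are not handled, and integer precision is rejected rather than translated:

```python
import itertools
import re

_SPEC = re.compile(
    r"%(?:\((?P<key>[^)]*)\))?"       # optional mapping key
    r"(?P<flags>[#0\- +]*)"           # conversion flags
    r"(?P<width>\d+)?"                # minimum field width
    r"(?:\.(?P<prec>\d+))?"           # precision
    r"(?P<conv>[diouxXeEfFgGrs%])"    # conversion type
)


def percent_to_brace(fmt):
    """Translate a %-format string into a {}-format string (partial)."""
    positions = itertools.count()

    def convert(match):
        conv = match.group("conv")
        if conv == "%":
            return "%"                # "%%" is a literal percent sign
        key = match.group("key")
        name = key if key is not None else str(next(positions))
        flags = match.group("flags")
        width = match.group("width") or ""
        prec = match.group("prec")
        left = "-" in flags
        if conv in "sr":
            # %-formatting right-aligns strings; {} left-aligns by default.
            align = "<" if left else (">" if width else "")
            spec = align + width + ("." + prec if prec else "")
            bang = "!r" if conv == "r" else ("!s" if spec else "")
        else:
            if prec and conv in "diouxX":
                raise ValueError("no {} equivalent for " + match.group(0))
            sign = "+" if "+" in flags else (" " if " " in flags else "")
            spec = (
                ("<" if left else "") + sign
                + ("#" if "#" in flags else "")
                + ("0" if "0" in flags and not left else "")
                + width + ("." + prec if prec else "")
                + ("d" if conv in "iu" else conv)
            )
            bang = ""
        return "{" + name + bang + (":" + spec if spec else "") + "}"

    # Literal braces in the source must be doubled for str.format.
    escaped = fmt.replace("{", "{{").replace("}", "}}")
    return _SPEC.sub(convert, escaped)
```

Even within its supported subset this is not guaranteed to give identical output in every case, which is the concern raised later in the thread about alternate-form octal and float formatting.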

Vinay Sajip wrote:
It would allow an application to still use brace formatting throughout even if one particular library only accepted percent formatting. Probably not worth the effort at this point though, as if we can get a reliable forward converter happening then it may become possible for APIs to correctly guess which kind of format string they have been passed. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Oct 2, 2009, at 2:56 PM, Raymond Hettinger wrote:
Well, I actually think it was a pretty bad idea to introduce {} formatting, because %-formatting is well-known in many other languages, and $-formatting is used by basically all the rest. So the introduction of {}-formatting has always seemed silly to me, and I wish it had not happened. HOWEVER, much worse than having a new, different, and strange formatting convention is having *multiple* formatting conventions arbitrarily used in different places within the language, with no rhyme or reason. So, given that brace-formatting was added, and that it's been declared the way forward, I'd *greatly* prefer it taking over everywhere in python, instead of having to use a mixture. James

That doesn't mean we have to have a transition plan *now*. Creating one after Python 3.5 is released (i.e. in 2015 or so) might be sufficient. To create a transition plan, you first need *consensus* that you actually do want to transition. I don't think such consensus is available, and might not be available for a few more years. Pushing the issue probably delays obtaining consensus. Regards, Martin

Martin v. Löwis wrote:
Agreed, but that doesn't rule out discussions of what can be done to make such a transition easier. And just as 2to3 makes the overall Python transition practical, a percent to brace format translator should make an eventual formatting transition feasible. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Steven Bethard wrote:
It is a 'plan' to transition from not being able to use the new formatting, which I prefer, throughout the stdlib, to being able to do so. I believe most, even if not all, find that acceptable. Certainly, I think you should be able to implement the above for argparse before submitting it. And I would hope that 3.2, in a year, is generally .format usable. This is the first step in a possible long-term replacement, but there is currently no consensus to do any more than this. So I think it premature to do any more. I would agree, for instance, that an auto-translation tool is needed. Terry Jan Reedy

2009/10/3 Brett Cannon <brett@python.org>:
I've already started a converter. It's here: https://code.launchpad.net/~gutworth/+junk/mod2format -- Regards, Benjamin

Brett Cannon <brett <at> python.org> writes:
I've done a first cut of a converter from %-format to {}-format strings. I'm not sure where you want to put it in the sandbox, I've created a gist on GitHub: http://gist.github.com/200936 Not thoroughly tested, but runs in interactive mode so you can try things out. All feedback appreciated! Regards, Vinay

Raymond Hettinger <python <at> rcn.com> writes:
We should get one written. ISTM, every %-formatting string is directly translatable to an equivalent {}-formatting string.
I've made a start, but I'm not sure how best to handle the '#' and ' ' conversion flags. Regards, Vinay Sajip

Raymond Hettinger <python <at> rcn.com> writes:
We should get one written. ISTM, every %-formatting string is directly translatable to an equivalent {}-formatting string.
I'm not sure you can always get equivalent output from the formatting, though. For example:
Someone please tell me if there's a better {}-format string which I've missed! Regards, Vinay Sajip

MRAB <python <at> mrabarnett.plus.com> writes:
"{0:#08x}".format(0x1234) '0x001234'
Good call, but here's another case:
"%#o" % 0x1234 '011064'
I don't see how to automatically convert the "%#o" spec, though of course there are ways of fudging it. The obvious conversion doesn't give the same value:
"{0:#o}".format(0x1234) '0o11064'
I couldn't see a backward-compatibility mode for str.format generating just a leading "0" for octal alternative format (the C style) as opposed to "0o". Regards, Vinay Sajip

Vinay Sajip <vinay_sajip <at> yahoo.co.uk> writes:
Apart from the sheer unreadability of the {}-style format string, the result looks rather unexpected from a human being's point of view. (in those situations, I would output the 0x manually anyway, such as:
Regards Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
Well of course, but I asked the question in the context of providing an *automatic* converter from %-format strings to {}-format. At the moment, it doesn't seem like a 100%-faithful automatic conversion is feasible. Regards, Vinay Sajip

Antoine Pitrou wrote:
"#" formatting was added to int.__format__ in order to support this case:
format(10, '#6x') '   0xa'
Without '#', there's no way to specify a field width but still have the '0x' up against the digits (at least not without generating an intermediate result and figuring out the width manually). The fact that it works in combination with '0' or '>' (not sure which one makes it unreadable to you) wasn't really the point of the feature. Eric.

Antoine Pitrou wrote:
The percent format string is pretty unreadable too - you're just more used to it, so it doesn't look as weird :) Vinay's problem above is due to using the wrong alignment flag: ">", which says to right align everything, instead of "=", which says to left align the sign character and the numeric prefix with the fill character inserted in the middle. In this particular case he could also use the zero-padding shortcut which leaves out the alignment flag altogether (and implies a "0=" alignment format). That is (using 2.6/3.1):
Adding in the sign bit gives the following translations:
Note that ">" alignment is actually now *broken* on trunk and py3k, since ">" and "=" are now behaving exactly the same instead of the former right aligning the entire number including the sign bit and prefix:
(bug assigned to Eric: http://bugs.python.org/issue7081) Note that, since percent formatting doesn't allow specification of the fill characters or the field alignment, translations should probably rely on the simple field width specifier, optionally selecting zero padding by preceding it with a zero. It should never be necessary to use the full alignment spec for translated formats. The other thing to keep in mind is that brace formatting is fussier about the order of things - items *must* appear in the order they are listed in PEP 3101 (i.e. if wanting a zero padded field with leading sign and numeric prefix, you must write "+#0"). Percent format, on the other hand, allows the "#", "+" and "0" to be placed in any order you like (although they must appear before the field width definition, precision specifier and type code). As far as I can see, that leaves the prefixing of octal numbers ("0o" vs "0") as the only true incompatibility between percent formatting and brace formatting, and even for those the incompatibility is limited to cases where a field width is specified without leading zeroes or a sign character is specified. In other cases, the translation can just stick a leading literal "0" in front of the field in the brace formatting string. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
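The alignment and prefix rules discussed above can be checked with a few format specs. This sketch reflects the behaviour of released Python 3 versions, where ">" right-aligns the whole number including its prefix; the variable names are illustrative only:

```python
value = 0x1234

# Zero-padding shortcut: the "0" before the width implies "0=" alignment.
zero_padded = "{0:#08x}".format(value)
# The same thing written with an explicit fill and "=" alignment.
explicit = "{0:0=#8x}".format(value)
# ">" pads to the left of the prefix instead of between prefix and digits.
right_aligned = "{0:>#8x}".format(value)
# Flags must appear in the PEP 3101 order: sign, "#", then "0".
signed = "{0:+#010x}".format(value)

# "#" gives Python's "0o" prefix; a literal "0" in front of the field
# gives the C-style single leading zero when no width or sign is involved.
python_octal = "{0:#o}".format(value)
c_octal = "0{0:o}".format(value)
```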

Nick Coghlan <ncoghlan <at> gmail.com> writes:
[snip]
Helpful analysis there, Nick, thanks. Bonzer ;-) There's also the corner case of things like %#.0f which, when asked to format 3e100, will print 3.e+100 whereas the translated format {0:.0f}, will print 3e+100 for the same value. BTW I sent Eric a private mail re. the "0o" versus "0" issue, to see if it was worth raising an enhancement request on the bug tracker using "O" to generate compatible rendering for octals. Regards, Vinay Sajip

Vinay Sajip wrote:
I didn't get your message, could you resend? I was thinking the same thing, but it seems like a transition step. I'd rather not keep such backward compatibility hacks (if you will) around for the long haul. How about a flag (maybe '*') at the start of the format specification which says "operate in backward compatibility mode"? We could document it as being only useful for the % to {} translator, and promise to remove it at some point in the future. Either actually deprecate it or just promise to deprecate it in the future. Eric.

Eric Smith wrote at Thu, 08 Oct 2009 10:24:33 -0400:
That doesn't seem very useful to me. IIUC, the point of the translator is to allow porting of the millions of existing %-formatting operations to the new-style .format. If the result of that is deprecated or removed a few years from now, all maintainers of long existing code have exactly the same problem. IMHO, either the translation is done once and gives identical output or it isn't worth doing at all. -- Christian Tanzer http://www.c-tanzer.at/

Christian Tanzer wrote:
I was thinking of it as a transition step until all application code switched to {} formatting. In which case the application has to deal with it.
IMHO, either the translation is done once and gives identical output or it isn't worth doing at all.
I disagree. I doubt even 0.001% of all format strings involve octal formatting. Is it really worth not providing a transition path if it can't cover this case? Eric.

Benjamin Peterson wrote:
That works so long as the original format string doesn't specify either a space padded field width or else a sign character. For those the extra zero needs to be inserted after the leading characters but before the number, so the formatting engine really has to handle it. I'm actually thinking that having the ability to specify a single 0 as the leading character for octal output is a legitimate feature. There are plenty of other tools out there that use a single leading zero to denote octal numbers (e.g. think of a Python script that generated C code), so having Python be able to produce such numbers makes a lot of sense. Vinay's suggestion of using 'O' instead of 'o' to denote C-style octal formatting instead of Python-style sounds reasonable to me (similar in degree to the upper vs lower case distinction for 'x' and 'X' hex formatting). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan wrote:
Mark points out in http://bugs.python.org/issue7094 that we'd also need to add alternate float formatting for any automated translation facility to work flawlessly. There might be other float issues involving trailing decimals with no zeros that work differently, too. Eric.

Eric Smith wrote at Thu, 08 Oct 2009 10:55:21 -0400:
You lost me here. All that talk of deprecating %-formatting makes me really nervous. %-formatting is pervasive in all existing Python code. Without an automatic translator that is 100% accurate, porting all that code to {}-formatting is not possible. Heck, it's not even possible to grep for all instances of %-formatting. How do you suppose that maintainers could ever do the transition from %- to {}-formatting manually?
If %-formatting is first deprecated then removed from Python and there is no automatic transition path that effectively means that existing code using %-formatting is forced to stay at whatever Python version was the last one supporting %-formatting. I surely hope nobody is seriously considering such a scenario. Perl 6 seems harmless in comparison. -- Christian Tanzer http://www.c-tanzer.at/

Christian Tanzer wrote:
That is vastly overstating it. Making 'with' and 'as' keywords and removing string exceptions (which have already happened) will affect far more programs than a minor incompatibility in transitioning string formatting. Michael

Michael Foord wrote at Thu, 08 Oct 2009 16:56:35 +0100:
`with` and `as` are trivial to fix and certainly not pervasive in existing code. String exceptions have been deprecated for years. -- Christian Tanzer http://www.c-tanzer.at/

On Thu, Oct 8, 2009 at 8:08 AM, Christian Tanzer <tanzer@swing.co.at> wrote:
This is pretty much the situation with integer division (you can only recognize it by running the code), and yet we figured a way to change that in 3.x. Or take classic classes vs. new-style classes. They cannot be translated 100% automatically either. The solution is to support the old and new style in parallel for a really long time -- we did this with int division (read PEP 238), we did it with classes, and we can do it again with formatting. Unless I missed something, we're not planning to remove %-formatting until Python 4.0 comes along, which we won't even start until a long time after everyone has switched to some version of 3.x. So the same approach will apply: support both forms, nudge people to start using the new form, wait, nudge some more, etc. So, yes, we will continue to make noise about this. And yes you should opportunistically migrate your code to {}-formatting, like when you're rewriting some code anyway. One of the nice things about {}-formatting is that in most cases (things like the logging API excluded) you can change it one format string at a time. And no, the sky isn't falling. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

[Christian Tanzer]
How do you suppose that maintainers could ever do the transition from %- to {}-formatting manually?
[Guido van Rossum]
This is pretty much the situation with integer division (you can only recognize it by running the code),
Do you think there may be some possible parallel to the -3 option to flag cases of %-formatting? If so, that could be helpful.
I already have some code that mixes the styles (using {} for new stuff). Raymond

On Thu, Oct 8, 2009 at 10:14 AM, Raymond Hettinger <python@rcn.com> wrote:
Do you think there may be some possible parallel to the -3 option to flag cases of %-formatting? If so, that could be helpful.
Absolutely. This should be simple, since there's just one or two places where to place the warning. We might also automatically turn it on when Python 2.7 is run with -3. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

I didn't get your message, could you resend?
Resent, it may have been stopped by your spam filters since it came from my vinay-underscore-sajip-at-red-hyphen-dove-dot-com address. The subject was "Python str.format() and octal formatting compatibility".
I don't much mind exactly which mechanism we use to distinguish between 0o and 0 prefixes, as long as it's one most people are happy with :-) Regards, Vinay Sajip

On approximately 10/8/2009 7:24 AM, came the following characters from the keyboard of Eric Smith:
Seems like the ability for Python {} formatting to be able to match not only old Python % formatting output, but also output created by C's sprintf, and other numeric formatting systems, make this particular feature useful in more scenarios than a "backward compatibility hack". If you want to replace a C program that produces parsed output in a given format, and that given format includes leading-0-octal numbers, then it would be good to have the capability in Python .format, even though Python itself uses 0o prefix. Similar arguments may apply anywhere else that sprintf produces something that .format cannot currently produce. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

On approximately 9/30/2009 4:03 PM, came the following characters from the keyboard of Vinay Sajip:
It seems to me that most of the discussion in this thread is concerned with the first issue... and yet I see the second as the harder issue, and it has gotten less press. Abstracting this away from logger, I think the problem has three cases:
1) Both the format message and all the parameters are supplied in a single API call. This is really a foolish API, because def API( fmt, p1, p2, p3 ): str = fmt % (p1, p2, p3) could have just as easily been documented originally as def API( str ): where the user is welcome to supply a string such as API( fmt % (p1, p2, p3 )) and if done this way, the conversion to .format is obvious... and all under the user's control.
2) The format message and the parameters are supplied to separate APIs, because the format message is common to many invocations of the other APIs that supply parameters, and is cached by the API. This is sufficient to break the foolishness of #1, but is really just a subset of #3, so any solutions to #3 apply here.
3) The format message and the parameters for it may be supplied by the same or separate APIs, but one or both are incomplete, and are augmented by the API. In other words, one or both of the following cases:
3a) The user supplied format message may include references to named parameters that are documented by the API, and supplied by the API, rather than by the user.
3b) The user supplied format string may be embedded into a larger format string by the API, which contains references to other values that the user must also supply.
In either case of 3a or 3b, the user has insufficient information to perform the whole format operation and pass the result to the API. In both cases, the API that accepts the format string must be informed whether it is a % or {} string, somehow. This could be supplied to the API that accepts the string, or to some other related API that sets a format mode. Internally, the code would have to be able to manipulate both types of formats.
The above three paragraphs are unclear to me. I think they might be referring to case 2 or 3, though.
It seems that the above is only referring to case 1? And doesn't help with case 2 or 3?
Agreed here... a single global state would not make modular upgrades to a complex program easy... the state would be best included with particular instance objects, especially when such instance objects exist already. The format type parameter could be provided to the instance, instead of globally.
These last 3 paragraphs seem to be very related to logger, specifically. The first of the 3 does point out a concern for systems that interoperate across networks: if the format strings and parameters are exposed separately across networks, whatever types are sent must be usable at the receiver, or at least appropriate version control must be required so that incompatible systems can be detected and reported. On approximately 9/30/2009 5:47 PM, came the following characters from the keyboard of Antoine Pitrou:
This "callable" technique seems to only support case 1 and 2, but not 3, unless I misunderstand it. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
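Case 3b from the list above can be made concrete with a short sketch. The names `PREFIX`, `embed` and `render` are hypothetical; the point is only that the API has to know the style before it can wrap the user's string in a larger template:

```python
# The API's own wrapper text, written once per supported style.
PREFIX = {
    "%": "[%(level)s] ",
    "{": "[{level}] ",
}


def embed(user_fmt, style):
    """Embed the user's format string into the API's larger template."""
    return PREFIX[style] + user_fmt


def render(user_fmt, style, **values):
    """Format the combined template with API- and user-supplied values."""
    full = embed(user_fmt, style)
    if style == "%":
        return full % values
    return full.format(**values)


old = render("%(user)s logged in", "%", level="INFO", user="ann")
new = render("{user} logged in", "{", level="INFO", user="ann")
```

Passing `style` per call, or storing it on the instance that owns the format string, avoids the single global flag that the earlier paragraph argues against.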

Glenn Linderman wrote:
The lazy APIs actually make a lot of sense, particularly when there is a chance that the function being called may be able to avoid the formatting call altogether. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
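The benefit of the lazy style can be observed with the standard `logging` module, which only formats a message if the record is actually emitted. The class `Expensive` and the logger name are assumptions made for this sketch:

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive value"


logger = logging.getLogger("lazy_demo")
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.WARNING)
logger.propagate = False

# Lazy: DEBUG is disabled, so the argument is never formatted.
logger.debug("value: %s", Expensive())
calls_when_disabled = Expensive.calls

# Eager: the caller formats up front, whether or not anyone reads it.
eager = "value: %s" % Expensive()
calls_after_eager = Expensive.calls
```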
participants (26)
- "Martin v. Löwis"
- Antoine Pitrou
- Barry Scott
- Barry Warsaw
- Benjamin Peterson
- Brett Cannon
- Carl Trachte
- Christian Tanzer
- Eric Smith
- Glenn Linderman
- Glenn Linderman
- Greg Ewing
- Guido van Rossum
- INADA Naoki
- James Y Knight
- M.-A. Lemburg
- Martin Geisler
- Michael Foord
- MRAB
- Nick Coghlan
- Paul Moore
- Raymond Hettinger
- Steven Bethard
- Steven D'Aprano
- Terry Reedy
- Vinay Sajip