new string-formatting preferred? (was "What is this syntax ?")

Tue Jun 21 07:33:13 EDT 2011

On 06/20/2011 09:17 PM, Terry Reedy wrote:
> On 6/20/2011 8:46 PM, Tim Chase wrote:
>> On 06/20/2011 05:19 PM, Ben Finney wrote:
>>> “This method of string formatting is the new standard in
>>> Python 3.0, and should be preferred to the % formatting
>>> described in String Formatting Operations in new code.”
>>>
>>> <URL:http://docs.python.org/library/stdtypes.html#str.format>
>>
>> Is there a good link to a thread-archive on when/why/how .format(...)
>> became "preferred to the % formatting"?
>
> That is a controversial statement.

I'm not sure whether your "controversial" refers to

- the documentation at that link,
- Ben's quote of the documentation at that link,
- my quotation of Ben's quote of the documentation,
- or my request for a "thread-archive on the when/why/how"

I _suspect_ you mean the first one :)

>> I haven't seen any great wins of the new formatting over
>> the  classic style.
>
> It does not abuse the '%' operator,

Weighed against the inertia of existing 
code/documentation/tutorials, I consider this a toss-up.  If 
.format() had been the preferred way since day#1, I'd grouse 
about adding/overloading '%', but going the other direction, 
there's such a large corpus of stuff using '%', the addition of 
.format() feels a bit schizophrenic.

> it does not make a special case of tuples (a source of bugs),

Having been stung occasionally by this, I can see the benefit here 
over writing the less-blatant

   "whatever %s" % (tupleish,)
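
For anyone who hasn't hit the tuple gotcha, here is a minimal sketch of it. The value bound to "tupleish" is made up for illustration; the behaviour shown is standard for both the % operator and str.format().

```python
# Hypothetical value standing in for "tupleish" above.
tupleish = (1, 2)

# With %, a tuple on the right-hand side is unpacked as the argument
# list, so a single %s given a two-item tuple raises TypeError.
try:
    "whatever %s" % tupleish
    percent_error = None
except TypeError as exc:
    percent_error = str(exc)

# Wrapping the value in a one-element tuple formats the tuple itself.
wrapped = "whatever %s" % (tupleish,)

# str.format() treats a tuple like any other single argument.
formatted = "whatever {0}".format(tupleish)

print(percent_error)
print(wrapped)
print(formatted)
```

The bug tends to hide until the day a variable that is usually a string happens to hold a tuple.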

> and it is more flexible, especially
> indicating objects to be printed. Here is a simple example from my code
> that would be a bit more difficult with %.
>
> multi_warn = '''\
> Warning: testing multiple {0}s against an iterator will only test
> the first {0} unless the iterator is reiterable; most are not.'''.format
> ...
> print(multi_warn('function'))
> ...
> print(multi_warn('iterator'))

Does the gotcha of a non-restarting iterator trump pulling each 
field you want and passing it explicitly?  In pre-.format(), I'd 
just use dictionary formatting:

   "we have %(food)s & eggs and %(food)s, bacon & eggs" % {
     "food": "spam", # or my_iterator.next()?
     }
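
As a rough side-by-side, the same repeated field can be written in either style. This is only a sketch: the "food" value is a placeholder, and the three templates are meant to be equivalent, not to show one as better.

```python
food = "spam"  # placeholder value for illustration

# Classic style: name the field once in a mapping, reuse it in the template.
classic = "we have %(food)s & eggs and %(food)s, bacon & eggs" % {"food": food}

# str.format() equivalents: reuse by keyword or by position.
by_name = "we have {food} & eggs and {food}, bacon & eggs".format(food=food)
by_index = "we have {0} & eggs and {0}, bacon & eggs".format(food)

print(classic)
print(classic == by_name == by_index)
```

In all three cases the value is fetched once and repeated by the template, so a non-restarting iterator would only be advanced once.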

> class chunk():
> 	def __init__(self, a, b):
> 		self.a,self.b = a,b
> c=chunk(1, (3,'hi'))
> print('{0.__class__.__name__} object has attributes int a<{0.a}> '
>       'and tuple b with members<{0.b[0]}>  and<{0.b[1]}>'.format(c))

This was one of the new features I saw, and I'm not sure how I 
feel about my strings knowing about my object structure.  It 
feels a bit like a violation of the old "separation of content 
and presentation".  Letting string-formatting reach deeply into 
objects makes it harder to swap out different object 
implementations purely by analyzing the code.  It also can put 
onus on translators to know about your object model if your 
format-strings come from an i18n source.  I also see possible 
mechanisms for malicious injection if the format-string comes 
from an untrusted source (unlikely, but I've seen enough bad code 
in production to make it at least imaginable).
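
To make the injection worry concrete, here is a small sketch. The Account class and its attributes are invented for the example; the point is only that whoever writes the format string decides which attributes get read.

```python
class Account:  # hypothetical class, not from the thread
    def __init__(self, owner, secret):
        self.owner = owner
        self.secret = secret

acct = Account("alice", "hunter2")

# The template the programmer expected, e.g. from a translation file.
trusted = "Hello, {0.owner}!"

# A hostile template can reach any attribute reachable from the argument.
untrusted = "Hello, {0.secret}!"

greeting = trusted.format(acct)
leaked = untrusted.format(acct)

# One way to narrow the exposure: hand the template plain values
# instead of whole objects.
narrowed = "Hello, {owner}!".format(owner=acct.owner)

print(greeting)
print(leaked)
print(narrowed)
```

Passing plain values does not remove the problem entirely, since a template can still look at attributes of those values, but it keeps the rest of the object out of reach.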

The other new feature I saw was the use of __format__() which may 
have good use-cases, but I don't yet have a good example of when 
I'd want per-stringification formatting compared to just doing my 
desired formatting in __str__() instead.
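
One possible use-case, sketched under made-up assumptions: the Temperature class and its "c"/"f" spec codes below are invented, and the idea is that __str__() fixes a single presentation while __format__() lets each call site choose one.

```python
class Temperature:  # hypothetical class for illustration
    def __init__(self, celsius):
        self.celsius = celsius

    def __str__(self):
        # One fixed presentation, chosen by the class author.
        return "%.1fC" % self.celsius

    def __format__(self, spec):
        # The caller picks the presentation per use via the format spec.
        # The codes "c" and "f" are made up for this sketch.
        if spec == "f":
            return "%.1fF" % (self.celsius * 9 / 5 + 32)
        if spec in ("", "c"):
            return str(self)
        raise ValueError("unknown format spec: %r" % spec)

t = Temperature(21.5)
report = "{0} is {0:f}".format(t)
print(report)
```

The standard library's datetime types work along these lines, accepting strftime-style codes after the colon, as in "{0:%Y-%m-%d}".format(some_date).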

So even with your examples of differences, I don't get an 
overwhelming feeling of "wow, that *is* a much better way!" like 
I did with some of the other new features such as "with" or 
changing "print" to a function.

Anyways, as you mention, I suspect blessing .format() as 
"preferred" in the documentation was a bit contentious...with 
enough code still running in 2.4 and 2.5 environments, it will be 
a long time until I even have to think about it.  I just wanted 
to watch a replay of the decision-makers bashing it out :)

-tkc