New Python 3.0 string formatting - really necessary?

Sun Dec 21 08:52:27 EST 2008

On Sun, 21 Dec 2008 12:45:32 +0000, Duncan Booth wrote:

> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> wrote:
> 
>> Errors should never pass silently, unless explicitly silenced. You have
>> implicitly silenced the TypeError you get from not having enough
>> arguments for the first format operation. That means that you will
>> introduce ambiguity and bugs.
>> 
>> "%i %i %i %i" % 5 % 3 %7
>> 
>> Here I have four slots and only three numbers. Which output did I
>> expect?
>> 
>> '%i 5 3 7'
>> '5 %i 3 7'
>> '5 3 %i 7'
>> '5 3 7 %i'
>> 
>> Or more likely, the three numbers is a mistake, there is supposed to be
>> a fourth number there somewhere, only now instead of the error being
>> caught immediately, it won't be discovered until much later.
>> 
> You seem to have made an unwarranted assumption, namely that a binary
> operator has to compile to a function with two operands. There is no
> particular reason why this has to always be the case: for example, I
> believe that C# when given several strings to add together optimises
> this into a single call to a concatenation method.

[...]

> Python *could* do something similar if the appropriate opcodes/methods
> supported more than two arguments:
> 
> a+b+c+d might execute a.__add__(b,c,d) allowing more efficient string
> concatenations or matrix operations, and a%b%c%d might execute as
> a.__mod__(b,c,d).

That's only plausible if the operations are associative. Addition is 
associative, but string interpolation is not:

>>> "%%%s" % ("%s" % "b")
'%b'
>>> ("%%%s" % "%s") % "b"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Since string interpolation isn't associative, your hypothetical __mod__ 
method might take multiple arguments, but it would have to deal with them 
two at a time, unlike concatenation where the compiler could do them all 
at once. So whether __mod__ takes two arguments or many is irrelevant: 
its implementation must rely on some other function which takes two 
arguments, and each of those two-argument steps must succeed or fail on 
its own.
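
To make that concrete, here is a rough sketch of what such a multi-argument 
method would boil down to. The names chained_mod and outcome are made up 
for illustration; nothing like them exists in Python, and this is only my 
guess at how the hypothetical __mod__ would have to work internally:

```python
from functools import reduce
import operator

def chained_mod(fmt, *args):
    # Hypothetical helper, not a real Python API: folds the ordinary
    # two-argument % operator over the arguments from left to right.
    return reduce(operator.mod, args, fmt)

def outcome(fmt, *args):
    # Return the formatted string, or the TypeError message if any
    # intermediate two-argument step fails.
    try:
        return chained_mod(fmt, *args)
    except TypeError as exc:
        return "TypeError: " + str(exc)

print(outcome("%%%s", "%s", "b"))   # second step has no slot left for "b"
print(outcome("%s.%s", 1, 2))       # first step has two slots but one value
print(outcome("%s%%s", 1, 2))       # each step has exactly one live slot
```

Only the last call gets through, because every pairwise step happens to 
have exactly one slot to fill. The other two die on whichever two-argument 
step comes up short or has an argument left over.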

Either that, or we change the design of % interpolation, and allow it to 
silently ignore errors. I assumed that is what Aaron wanted.

> In that alternate universe your example:
> 
>     "%i %i %i %i" % 5 % 3 %7
> 
> simply throws "TypeError: not enough arguments for format string"

That has a disturbing consequence.

Consider that for most (all?) operations, we can use temporary values:

x = 1 + 2 + 3 + 4
=> x == 10

gives the same value for x as:

temp = 1 + 2 + 3
x = temp + 4

I would expect that the same should happen for % interpolation:

# using Aaron's hypothetical syntax
s = "%s.%s.%s.%s" % 1 % 2 % 3 % 4
=> "1.2.3.4"

should give the same result as:

temp = "%s.%s.%s.%s" % 1 % 2 % 3
s = temp % 4

But you're arguing that the first version should succeed and the second 
version, using a temporary value, should fail. And that implies that if 
you group part of the expression in parentheses, it will fail as well:

s = ("%s.%s.%s.%s" % 1 % 2 % 3) % 4

Remove the parentheses, and it succeeds. That's disturbing. That makes 
the % operator behave very differently from other operators.

Note that with the current syntax, we don't have that problem: supplying 
too few arguments raises an exception no matter how the expression is 
grouped.
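
A quick way to check this with today's tuple syntax (try_format is just a 
throwaway helper I made up for the demonstration, and the variable names 
are arbitrary):

```python
fmt = "%s.%s.%s.%s"

def try_format(fmt, args):
    # Apply the ordinary % operator once; report a TypeError as text
    # instead of letting it propagate, so both cases can be compared.
    try:
        return fmt % args
    except TypeError as exc:
        return "TypeError: %s" % exc

# All four values supplied in one tuple.
full = try_format(fmt, (1, 2, 3, 4))

# Only three values, passed through a temporary variable first.
temp = (1, 2, 3)
short = try_format(fmt, temp)

print(full)
print(short)
```

Whether the short tuple is written inline, parenthesised, or stored in a 
temporary first makes no difference: the interpolation is a single 
operation, so it either gets all its arguments or it raises.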

>       "%s" % (1,2,3)
> 
> just converts the tuple as a single argument. It also provides the
> answer to how you put a percent in the format string (double it) 

I trust you know that already works, but just in case:

>>> "%g%%" % 12.5
'12.5%'

> and what happens if a substitution inserts a percent (it doesn't
> interact with the formatting operators).

Ditto:

>>> "%s" % "%g"
'%g'
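
For comparison with the tuple case you mention: in current Python a bare 
tuple on the right-hand side is unpacked as the argument list, so it has to 
be wrapped in a one-item tuple to be formatted as a single value. A small 
sketch (the variable names are mine):

```python
point = (1, 2, 3)

# A bare tuple is treated as three separate arguments for one slot.
try:
    "%s" % point
    error = None
except TypeError as exc:
    error = str(exc)

# Wrapping it in a one-item tuple formats the whole tuple as one value.
wrapped = "%s" % (point,)

print(error)
print(wrapped)
```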

-- 
Steven