[Python-Dev] *Simpler* string substitutions

Paul Prescod paul@prescod.net
Thu, 20 Jun 2002 11:29:33 -0700


We will never come to a solution unless we agree on what, if any, the
problem is.

Here is my sense of the "interpolation" problem (based entirely on the
code I see):

 * 95% of all scripts (or modules) need to do string interpolation

 * 5% of all scripts want to be explicit about the types

 * 10% of all scripts want to submit a dictionary rather than the
current namespace

 * 5% of all scripts want to do printf-style formatting tricks

Which means that if we do the math in a simplistic way, 20% of
modules/scripts need these complicated features but the other 75% pay
for these features that they are not using. They pay through having to
use "% locals()" (which uses two advanced features of Python, operator
overloading and the local namespace). They pay through counting the
lengths of their %-tuples (in my case, usually miscounting). They pay
through adding (or forgetting to add) the format specifier after
"%(...)". They pay through having harder-to-read strings where they have
to go back and forth to figure out what various positional variables
mean. They pay through having to remember the special case for singletons --
except for singleton tuples!
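
For concreteness, here is a small sketch of the costs listed above. The
function names and sample values are made up purely for illustration;
only the "%" behaviour itself is standard Python.

```python
def named_form(name, count):
    # Relies on the % operator plus locals(); every "%(...)" key also
    # needs a trailing conversion character such as "s" or "d".
    return "%(name)s appears %(count)d times" % locals()


def positional_form(name, count):
    # The tuple length must match the number of % markers exactly.
    return "%s appears %d times" % (name, count)


def miscounted(name, count):
    # One value too few for two markers: "%" raises TypeError.
    try:
        return "%s appears %d times" % (name,)
    except TypeError as exc:
        return "TypeError: %s" % exc


def singleton(value):
    # A single non-tuple value may be passed bare ...
    return "value = %s" % value


def singleton_tuple(value):
    # ... but if the value is itself a tuple it must be wrapped in a
    # 1-tuple, otherwise "%" unpacks it as the argument list.
    return "value = %s" % (value,)
```

Calling singleton((1, 2)) instead of singleton_tuple((1, 2)) fails with
a TypeError, which is the special case mentioned above.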

Of course the syntax is flexible: you get to choose HOW you pay
(shifting from positional to named) and thus reduce some costs while you
incur others, but you can't choose simply NOT to pay, as you can in
every other scripting language I know. 

And remember that Python is a language that *encourages* readability.
But this kind of code is common:

 * exception.append('\n<br>%s%s&nbsp;=\n%s' % (indent, name, value))

whereas it could be just:

 * exception.append('\n<br>${indent}${name}&nbsp;=\n${value}')

Which is shorter, uses fewer concepts, and keeps variables close to
where they are used. We could argue that the programmer here made the
wrong choice (versus using % locals()) but the point is that Python
itself favoured the wrong choice by making the wrong choice shorter and
simpler. Usually Python favours the right choice.
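
A rough sketch of how the two spellings compare, under some loud
assumptions: the name "interp" and its signature are hypothetical, the
default of reading the caller's namespace is only one possible design,
and the implementation leans on string.Template (a standard-library
class added to Python after this message was written) and on the
CPython-specific sys._getframe. It is not a proposal for the actual
implementation.

```python
import sys
from string import Template


def interp(template, mapping=None):
    """Hypothetical helper: fill ${name} markers from `mapping`,
    or from the caller's namespace when no mapping is given."""
    if mapping is None:
        caller = sys._getframe(1)        # assumption: CPython frame access
        mapping = dict(caller.f_globals)
        mapping.update(caller.f_locals)  # locals shadow globals
    return Template(template).substitute(mapping)


def current_style(indent, name, value):
    # Positional "%": the reader has to match markers to tuple slots.
    return '\n<br>%s%s&nbsp;=\n%s' % (indent, name, value)


def proposed_style(indent, name, value):
    # Names sit where they are used; no tuple, no format specifiers.
    return interp('\n<br>${indent}${name}&nbsp;=\n${value}')
```

The 10% of scripts that want to submit a dictionary would pass it as the
second argument, e.g. interp('${name}', {'name': 'x'}).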

The tax is small but it is collected on almost every script, almost
every beginner and almost every programmer almost every day. So it adds
up.

If we put this new feature in a module (whether "text", "re", or
"string"), then we are just devising another way to make people pay. At
that point it becomes a negative feature, because it will clutter up the
standard library without getting use. As long as you are agreeing to pay
some tax, "%" is a smaller tax (at least at first) because it does not
require you to interrupt your workflow to insert an import statement.

In my mind, this feature is only worth adding if we agree that it is now
the standard string interpolation feature and "%" becomes a quaint
historical feature -- a bad experiment in operator overloading gone
wrong. "%" could be renamed "text.printf" and would actually become more
familiar to its core constituency and less of a syntactic aberration.
"interp" could be a built-in and thus similarly simple syntactically.

But I am against adding "$" if half of Python programmers are going to
use that and half are going to use %. $ needs to be a replacement. There
should be one obvious way to solve simple problems like this, not two. I
am also against adding it as a useless function buried in a module that
nobody will bother to import.

 Paul Prescod