[Python-3000] features i'd like [Python 3000?] ... #4: interpolated strings ala perl

Ben Wing ben at 666.com
Mon Dec 4 11:08:58 CET 2006


i'd already posted this to python-dev, but someone suggested it would
belong more on python-3000, so i'm posting it here.  hope that's ok.

i see in PEP 3101 that there's some work going on to fix up the string
formatting capabilities of python.  it looks good to me but it still
doesn't really address the lack of a simple interpolated string
mechanism, as in perl or ruby.  i find myself constantly writing stuff like

text="Family: %s" % self.name

maybe_errout("%s, line %s: %s\n" % (title, lineno, errstr))

    def __str__(self):
        return "CCGFeatval(%s, parents=%s, licensing=%s)" % (
            (self.name, self.parents, self.licensing))

and lots of similar examples that are just crying out for perl-style
variable interpolation.  the proposals in PEP 3101 don't help much; i'd
get instead something like

maybe_errout("{0}, line {1}: {2}\n".format(title, lineno, errstr))

which isn't any better than the current % notation, or something like

maybe_errout("{title}, line {lineno}: {errstr}\n".format(
                     title=title, lineno=lineno, errstr=errstr))

where i have to repeat each interpolated variable three times.  yuck
yuck yuck.

how about something nice like

maybe_errout(i"[title], line [lineno]: [errstr]\n")

or (with the first example above)

text=i"Family: [self.name]"

or (third example above)

    def __str__(self):
        return i"CCGFeatval([self.name], parents=[self.parents], licensing=[self.licensing])"

the advantage of these is the same as for all such interpolations: the
interpolated variable is logically placed exactly where it will be
substituted, rather than somewhere else, with the brain needing to do
some additional cross-referencing.

`i' in front of a string indicates an "interpolated" string just like
`r' in the same position means "raw string".  if you think this is too
invisible, you could maybe use `in' or something easier to see.
however, i could see a case being made for combining both `i' and `r' on
the same string, and so using a single letter seems to make the most sense.

formatting params can follow, e.g.

  print i"The value of [stock[x]] is [stockval[x]:%6.2f]"
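
a rough sketch of how this bracket syntax could be emulated with an ordinary function, just to make the intended behavior concrete.  everything here is an assumption for illustration: the helper name interpolate(), the use of eval() on each bracketed expression, and the rule that ":%" separates the expression from a %-style format.  it looks up names in the caller's frame via sys._getframe(), which is CPython-specific; a real i"..." literal would presumably be handled by the compiler instead.

```python
import sys

def interpolate(template, namespace=None):
    """Hypothetical stand-in for the proposed i"..." literal."""
    if namespace is None:
        # Default to the caller's globals and locals (CPython-specific).
        frame = sys._getframe(1)
        namespace = dict(frame.f_globals)
        namespace.update(frame.f_locals)
    out = []
    i = 0
    n = len(template)
    while i < n:
        ch = template[i]
        if ch != "[":
            out.append(ch)
            i += 1
            continue
        # Find the matching close bracket, allowing nesting as in [stock[x]].
        depth = 1
        j = i + 1
        while j < n and depth:
            if template[j] == "[":
                depth += 1
            elif template[j] == "]":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced '[' in template")
        field = template[i + 1:j - 1]
        # An optional ":%spec" suffix is treated as a %-style format (assumed rule).
        expr, sep, spec = field.partition(":%")
        value = eval(expr, namespace)
        out.append(("%" + spec) % (value,) if sep else str(value))
        i = j
    return "".join(out)

stock = {"a": "ACME"}
stockval = {"a": 12.5}
x = "a"
print(interpolate("The value of [stock[x]] is [stockval[x]:%6.2f]"))
```

this sketch has no way to write a literal bracket and splits on the first ":%" it sees, so it is only meant to show the shape of the idea, not a finished design.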

some comments:

1. i suggest brackets here so as to parallel but not interfere with PEP
3101, which uses braces; PEP 3101 is somewhat orthogonal to this
proposal and the two might want to coexist.  i think using something
like brackets or braces is better than perl's $ convention (it
explicitly indicates the boundaries of the interpolation) and simpler
than ruby's #{...} or make's ${...}; since strings are only interpolated
if you specifically indicate this, there's less need to use funny
characters to avoid accidental interpolation.
2. as shown in the last example, format specifiers can be added to an
interpolation, as in PEP 3101.  maybe the % is unnecessary.
3. as shown in various examples, things other than just straight
variables can be interpolated.  the above examples include array/hash
references and object attributes.  it's not exactly clear what should
and shouldn't be allowed.  one possibility is just to allow an arbitrary
expression.  PEP 3101 isn't going to do this because it's working on
existing strings (including ones that may come from the user), and so
allowing arbitrary expressions could lead to security holes.  but here,
we're talking about immediate strings, so security holes of this sort
should not be a concern.
4. the semantics of an interpolated string are exactly that of a series
of string concatenations, e.g.

return i"CCGFeatval([self.name], parents=[self.parents], licensing=[self.licensing])"

is equivalent to

return "CCGFeatval(" + self.name + ", parents=" + self.parents + \
        ", licensing=" + self.licensing + ")"

ben




