[Python-Dev] features i'd like [Python 3000?] ... #4: interpolated strings ala perl

Ben Wing ben at 666.com
Mon Dec 4 06:12:16 CET 2006


sorry to be casting multiple ideas at once to the list.  i've been 
looking into other languages recently and reading the recent PEPs and 
such and it's helped crystallize ideas about what could be better about 
python.

i see in PEP 3101 that there's some work going on to fix up the string 
formatting capabilities of python.  it looks good to me but it still 
doesn't really address the lack of a simple interpolated string 
mechanism, as in perl or ruby.  i find myself constantly writing stuff like

text="Family: %s" % self.name

maybe_errout("%s, line %s: %s\n" % (title, lineno, errstr))

    def __str__(self):
        return "CCGFeatval(%s, parents=%s, licensing=%s)" % (
            self.name, self.parents, self.licensing)

and lots of similar examples that are just crying out for perl-style 
variable interpolation.  the proposals in PEP 3101 don't help much; i'd 
get instead something like

maybe_errout("{0}, line {1}: {2}\n".format(title, lineno, errstr))

which isn't any better than the current % notation, or something like

maybe_errout("{title}, line {lineno}: {errstr}\n".format(title=title, 
lineno=lineno, errstr=errstr))

where i have to repeat each interpolated variable three times.  yuck 
yuck yuck.

how about something nice like

maybe_errout(i"[title], line [lineno]: [errstr]\n")

or (with the first example above)

text=i"Family: [self.name]"

or (third example above)

    def __str__(self):
        return i"CCGFeatval([self.name], parents=[self.parents], licensing=[self.licensing])"

the advantage of these is the same as for all such interpolations: the 
interpolated variable is logically placed exactly where it will be 
substituted, rather than somewhere else, with the brain needing to do 
some additional cross-referencing.

`i' in front of a string indicates an "interpolated" string just like 
`r' in the same position means "raw string".  if you think this is too 
invisible, you could maybe use `in' or something easier to see.  
however, i could see a case being made for combining both `i' and `r' on 
the same string, and so using a single letter seems to make the most sense.

formatting params can follow, e.g.

  print i"The value of [stock[x]] is [stockval[x]:%6.2f]"
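
for comparison, here is a small sketch of what that line would presumably produce, written in the existing % notation.  the sample values for stock, stockval and x are made up for illustration, and it assumes the part after the colon is simply handed to the % operator as-is.

```python
# sample data, made up for illustration
stock = ["ACME", "INITECH"]
stockval = [12.5, 3.14159]
x = 1

# the interpolated fields become positional arguments; the "%6.2f" spec
# from [stockval[x]:%6.2f] is used unchanged, and a field with no spec
# falls back to %s
line = "The value of %s is %6.2f" % (stock[x], stockval[x])
print(line)
```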

some comments:

1. i suggest brackets here so as to parallel but not interfere with PEP 
3101, which uses braces; PEP 3101 is somewhat orthogonal to this 
proposal and the two might want to coexist.  i think using something 
like brackets or braces is better than perl's $ convention (it 
explicitly indicates the boundaries of the interpolation) and simpler 
than ruby's #{...} or make's ${...}; since strings are only interpolated 
if you specifically indicate this, there's less need to use funny 
characters to avoid accidental interpolation.
2. as shown in the last example, format specifiers can be added to an 
interpolation, as in PEP 3101.  maybe the % is unnecessary.
3. as shown in various examples, things other than just straight 
variables can be interpolated.  the above examples include array/hash 
references and object attributes.  it's not exactly clear what should 
and shouldn't be allowed.  one possibility is just to allow an arbitrary 
expression.  PEP 3101 isn't going to do this because it's working on 
existing strings (including ones that may come from the user), and so 
allowing arbitrary expressions could lead to security holes.  but here, 
we're talking about immediate strings, so security holes of this sort 
should not be a concern.
4. the semantics of an interpolated string are exactly those of a series 
of string concatenations, e.g.

return i"CCGFeatval([self.name], parents=[self.parents], licensing=[self.licensing])"

is equivalent to

return ("CCGFeatval(" + self.name + ", parents=" + self.parents +
        ", licensing=" + self.licensing + ")")
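
to make the intended semantics concrete, here is a rough sketch that emulates the proposal at run time with an ordinary function.  interpolate() is a hypothetical helper, not an existing api, and the sample values at the bottom are made up.  it evaluates each bracketed field with eval() against a namespace you pass in, which is only reasonable because the template is a literal written by the programmer.  one deliberate difference from comment 4: it calls str() on each value, since plain + concatenation raises TypeError when a value such as lineno is not already a string.

```python
def interpolate(template, namespace):
    """Expand [expr] and [expr:%fmt] fields in template (hypothetical helper)."""
    out = []
    i = 0
    n = len(template)
    while i < n:
        ch = template[i]
        if ch != "[":
            out.append(ch)
            i += 1
            continue
        # find the matching close bracket, allowing nesting such as stock[x]
        depth = 1
        j = i + 1
        while j < n and depth:
            if template[j] == "[":
                depth += 1
            elif template[j] == "]":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced '[' at position %d" % i)
        field = template[i + 1:j - 1]
        # an optional format spec is introduced by ":%" (assumed syntax);
        # an expression that itself contains ":%" would confuse this sketch
        expr, sep, spec = field.rpartition(":%")
        if sep:
            out.append(("%" + spec) % eval(expr, {}, namespace))
        else:
            out.append(str(eval(field, {}, namespace)))
        i = j
    return "".join(out)


# made-up sample values
title, lineno, errstr = "grammar.ccg", 12, "unexpected token"
print(interpolate("[title], line [lineno]: [errstr]", locals()))
```

a real i"..." literal would presumably do the bracket parsing at compile time, so none of this scanning or eval() would happen when the string is actually built.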

ben



