[Python-ideas] Draft PEP on string interpolation

Mon Aug 24 04:31:58 CEST 2015

On Sun, Aug 23, 2015 at 8:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 24 August 2015 at 10:35, Eric V. Smith <eric at trueblade.com> wrote:
> > On 08/22/2015 09:37 PM, Nick Coghlan wrote:
> >> The trick would be to make interpolation lazy *by default* (preserving
> >> the triple of the raw template string, the parsed fields, and the
> >> expression values), and put the default rendering in the resulting
> >> object's *__str__* method.
> >
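
To make the "triple plus __str__" idea concrete, here is a minimal sketch.
The class name LazyInterpolation is hypothetical, it uses {}-style fields
parsed with string.Formatter rather than any proposed i-string syntax, and
it only looks up plain names (not arbitrary expressions) and ignores
conversions such as !r:

```python
from string import Formatter


class LazyInterpolation:
    """Hypothetical stand-in for a delayed-interpolation object."""

    def __init__(self, raw_template, namespace):
        self.raw_template = raw_template
        # Each parsed field is (literal_text, field_name, format_spec, conversion).
        self.parsed_fields = tuple(Formatter().parse(raw_template))
        # Values are looked up eagerly; only the rendering is deferred.
        self.values = tuple(
            namespace[name]
            for _literal, name, _spec, _conv in self.parsed_fields
            if name is not None
        )

    def __str__(self):
        # Default rendering: apply each format spec with format().
        values = iter(self.values)
        parts = []
        for literal, name, spec, _conv in self.parsed_fields:
            parts.append(literal)
            if name is not None:
                parts.append(format(next(values), spec or ""))
        return "".join(parts)


x = 3.14159
template = LazyInterpolation("x = {x:.2f}!", {"x": x})
rendered = str(template)
```

A different renderer could read raw_template, parsed_fields and values
directly instead of calling str().
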
> > At this point, I think PEPs 498 and 501 have converged, except for the
> > delayed string interpolation object (which I realize is important) and
> > how expressions are identified in the strings (which I consider less
> > important).
> >
> > I think the string interpolation object is interesting. It's basically
> > what Petr Viktorin and Chris Angelico discussed and suggested here:
> > https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.
>
> Aha, I thought I'd seen that idea go by in one of the threads, but I
> didn't remember where :)
>
> I'll add Petr and Chris to the acknowledgements section in 501.
>
> > My suggestion would be to add both f-strings (PEP 498) and i-strings (as
> > they're currently called in PEP 501), but with the exact same syntax to
> > identify and evaluate expressions. I don't particularly care what the
> > prefixes are. I'd add the plain f-strings first, then i-strings maybe
> > later. There are definitely some issues with delayed interpolation we
> > need to think about. An f-string would be shorthand for str(i-string).
>
> +1, as this is the point of view I've come to as well.
>
> > I think it's hyperbolic to refer to f-strings as a new string formatting
> > language. With one small difference (detailed in PEP 498, and with zero
> > usage I could find in the stdlib outside of tests), f-strings are a
> > strict superset of str.format() strings (but not the arguments to
> > .format of course). I think f-strings are no more different from
> > str.format strings than PEP 501 i-strings are to string.Template strings.
>
> Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that.
>
> > From what I can tell in the stdlib and in the wild, str.format() has
> > hundreds or thousands of times more usage than string.Template. I
> > realize that the reasons are not necessarily related to the syntax of
> > the replacement strings, but you can't say most people aren't familiar
> > with str.format().
>
> Right, and I think we can actually make an example driven decision on
> that front by looking at potential *target* formats for template
> rendering. After all, one of the interesting discoveries we made in
> having both str.__mod__ and str.format available is that %-formatting
> is a great way to template str.format strings, and vice-versa, since
> the meta-characters don't conflict, so you can minimise the escaping
> needed.
>
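
A small illustration of the non-conflicting meta-characters, using only
str.__mod__ and str.format (the variable names are just for the example):

```python
# %-formatting builds a str.format template; the braces pass through untouched.
field = "name"
template = "Hello, {%s}!" % field
greeting = template.format(name="World")

# And the other way around: str.format leaves the %d alone.
inverse = "{} is %d years old".format("Ann")
sentence = inverse % 30
```

Neither step needs any doubled "%%" or "{{ }}" escapes.
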
> For use cases like writing object __repr__ methods, I don't think the
> choice of $-substitution or {}-substitution matters - neither $ nor {}
> are likely to appear in the desired output (except as part of
> interpolated values), so escaping shouldn't be common regardless of
> which we choose. (Side note: __repr__ and __str__ implementations are
> likely worth highlighting as a good use case for the new syntax!)
>
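
For the __repr__ case, a sketch using the {}-style f-string syntax as it
later shipped in Python 3.6 (the Point class is made up for illustration):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # No literal braces or dollar signs in the output, so nothing to escape.
        return f"{type(self).__name__}(x={self.x!r}, y={self.y!r})"


text = repr(Point(1, 2))
```
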
> I think things get more interesting once we start talking about
> interpolation targets other than "human readable text".
>
> For example, one of the neat (/scary, depending on how you feel about
> this kind of feature) things I realised in working on the latest draft
> of PEP 501 is that you could use it to template *Python code*,
> including eagerly bound references to objects in the current scope.
> That is:
>
>     a = b + c
>
> could instead be written as:
>
>     a = eval(str(i"$b + $c"))
>
> That's not very interesting if all you do is immediately call eval()
> on it, but it's a lot more interesting if you instead want to do
> things like extract the AST, dispatch the operation for execution in
> another process, etc. For example, you could use this capability to
> build eagerly bound closures, which wouldn't see changes in name
> bindings, but *would* see state changes in mutable objects.
>
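
The eagerly bound closure behaviour can be approximated today without any
new syntax. This is only a sketch of the semantics, not PEP 501's
mechanism; bind_eagerly is a hypothetical helper:

```python
def bind_eagerly(expression, **bound):
    """Compile an expression and snapshot the objects its names refer to."""
    code = compile(expression, "<template>", "eval")
    namespace = dict(bound)  # name -> object bindings captured right now
    return lambda: eval(code, {}, namespace)


items = [1, 2]
total = bind_eagerly("sum(items)", items=items)

items.append(3)    # state change in the captured object: visible
first = total()

items = [100]      # rebinding the name afterwards: not visible
second = total()
```

Both calls return 6: the mutation of the original list is seen, the later
rebinding of "items" is not.
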
> With $-substitution, that "just works", as $ generally isn't
> syntactically significant in Python code - it can only appear inside
> strings (and potentially interpolation templates). With
> {}-substitution, you'd have to double all the braces for dictionary
> displays, dictionary comprehensions and set comprehensions. In example
> form:
>
>     data = {k:v for k, v in source}
>
> becomes:
>
>     data = eval(str(i"{k:v for k, v in $source}"))
>
> rather than:
>
>     data = eval(f"{{k:v for k, v in {source}}}")
>
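
The escaping difference can be tried out with the existing string.Template
and str.format. Here the substituted value is just the name "source", and
the generated code is evaluated against an explicit namespace:

```python
from string import Template

source = [("a", 1), ("b", 2)]

# $-substitution: the braces of the dict comprehension need no escaping.
dollar_code = Template("{k: v for k, v in $source}").substitute(source="source")

# {}-substitution: every literal brace has to be doubled.
brace_code = "{{k: v for k, v in {source}}}".format(source="source")

data = eval(dollar_code, {"source": source})
```

Both templates produce the same code string; only the amount of escaping
differs.
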
> You hit a similar problem if you're targeting Django or Jinja2
> templates, or any content that involves l20n style JavaScript
> translation strings: the use of braces for substitution expressions in
> the interpolation template conflicts with their use in the target
> format.
>
> So far, the only target rendering environments I've come up with where
> $-substitution would create a conflict are shell commands and
> JavaScript localisation using Mozilla's l20n syntax, and in both of
> those, I'd actually *want* the Python lookup to take precedence over
> the target environment lookup (and doubling the prefix to "$$" for
> target environment lookup seems quite reasonable when you actually do
> want to do the name lookup in the target environment).
>
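
string.Template already behaves this way for shell-like targets: "$name" is
looked up in Python and "$$" yields a literal "$" for the target
environment. A sketch (shlex.quote is added here because the substituted
value would otherwise reach the shell unquoted):

```python
import shlex
from string import Template

greeting = "hello world"

# $greeting is resolved by Python; $$HOME is left for the shell to resolve.
command = Template("echo $greeting from $$HOME").substitute(
    greeting=shlex.quote(greeting)
)
```
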
> >> That description is probably as clear as mud, though, so back to the
> >> PEP I go! :)
> >
> > Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498!
> >
> > On a more serious note, I'm thinking of adding i-strings to my f-string
> > implementation. I have some ideas that the format_spec (the :.3f stuff)
> > could be used by the code that eventually does the string interpolation.
> > For example, sql(i-string) might want to interpret this expression using
> > __sql__, instead of how str(i-string) would use __format__. Then the
> > sql() machinery could look at the format_spec and pass it to the value's
> > __sql__ method.
>
> Yeah, that's the key reason PEP 501 is careful to treat them as opaque
> strings that it merely transports through to the renderer. The
> *default* renderer would expect them to be str.format format
> specifiers, but other renderers may either disallow them entirely, or
> expect them to do something different.
>
> > For example:
> > sql(i'select {date:as_date} from {tablename}')
> >
> > might call date.__sql__('as_date'), which would know how to cast to the
> > right datatype (this happens to me all the time).
> >
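
A rough sketch of how a renderer might hand the format spec to a __sql__
hook. Everything here is hypothetical (the SqlDate wrapper, render_sql, and
the CAST output), and it uses string.Formatter on a plain string in place
of an i-string:

```python
import datetime
from string import Formatter


class SqlDate:
    """Hypothetical value type that knows how to render itself for SQL."""

    def __init__(self, value):
        self.value = value

    def __sql__(self, format_spec):
        if format_spec == "as_date":
            return "CAST('{}' AS DATE)".format(self.value.isoformat())
        raise ValueError("unsupported format spec: {!r}".format(format_spec))


def render_sql(template, namespace):
    """Pass each field's format spec, uninterpreted, to the value's __sql__."""
    parts = []
    for literal, name, spec, _conv in Formatter().parse(template):
        parts.append(literal)
        if name is not None:
            parts.append(namespace[name].__sql__(spec or ""))
    return "".join(parts)


when = SqlDate(datetime.date(2015, 8, 24))
query = render_sql("select {when:as_date}", {"when": when})
```

The same spec string would be an error for format(), which is why the spec
has to stay opaque until a renderer is chosen.
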
> > This is one reason I'm thinking of ditching !s, !r, and !a, at least for
> > the first implementation of PEP 498: they're not needed, and are not
> > generally applicable if we add the hooks I'm considering into i-strings.
>
> +1 from me. Given arbitrary expression support, it's both entirely
> possible and more explicit to write the builtin calls directly (obj!a,
> obj!r, obj!s -> ascii(obj), repr(obj), str(obj))
>

IIUC, to do this with SQL,

> sql(i'select {date:as_date} from {tablename}')

needs to be

  ['select ', unescaped(date, 'as_date'), ' from ', unescaped(tablename)]

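so that a dialect-specific renderer, e.g. sql_92() or sql_2011(),
would know that 'select ' is a literal (presumably implicitly escaped)
and that only the values marked unescaped() still need escaping.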
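
A minimal sketch of that segment list and one possible renderer, assuming a
DB-API "qmark" paramstyle. Unescaped, split_template and render_qmark are
hypothetical names, and the query differs from the one above because
column and table names generally cannot be bound as parameters:

```python
from string import Formatter


class Unescaped:
    """Hypothetical marker for an interpolated value that is not yet escaped."""

    def __init__(self, value, format_spec=""):
        self.value = value
        self.format_spec = format_spec


def split_template(template, namespace):
    """Split a template into trusted literals and Unescaped values."""
    segments = []
    for literal, name, spec, _conv in Formatter().parse(template):
        if literal:
            segments.append(literal)
        if name is not None:
            segments.append(Unescaped(namespace[name], spec or ""))
    return segments


def render_qmark(segments):
    """Keep literals as-is and turn each Unescaped value into a bound parameter."""
    query, params = [], []
    for segment in segments:
        if isinstance(segment, Unescaped):
            query.append("?")
            params.append(segment.value)
        else:
            query.append(segment)
    return "".join(query), params


segments = split_template(
    "select name from users where id = {user_id}", {"user_id": 42}
)
query, params = render_qmark(segments)
```

A renderer for another dialect could walk the same segment list and pick a
different placeholder or escaping rule.
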

* https://en.wikipedia.org/wiki/SQL#Interoperability_and_standardization
* http://docs.sqlalchemy.org/en/rel_1_0/dialects/
* https://docs.djangoproject.com/en/1.7/ref/models/queries/#f-expressions
  ("Django F-Expressions")

> Regards,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia