Re: [Python-ideas] Draft PEP on string interpolation

24 Aug 2015

      On Sun, Aug 23, 2015 at 9:31 PM, Wes Turner  wrote:
...
On Sun, Aug 23, 2015 at 8:41 PM, Nick Coghlan  wrote:
...
On 24 August 2015 at 10:35, Eric V. Smith  wrote:
...
On 08/22/2015 09:37 PM, Nick Coghlan wrote:
...
The trick would be to make interpolation lazy *by default* (preserving
the triple of the raw template string, the parsed fields, and the
expression values), and put the default rendering in the resulting
object's *__str__* method.
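
A minimal sketch of the lazy object described above, assuming a hypothetical `Interpolation` class (not part of either PEP). It uses str.format-style fields only because the stdlib already ships a parser for them. The object keeps the raw template, the parsed fields and the already-evaluated values, and defers rendering to `__str__`:

```python
from string import Formatter


class Interpolation:
    """Hypothetical stand-in for the object an i-string might produce."""

    def __init__(self, raw_template, **values):
        self.raw_template = raw_template
        # Each parsed field is (literal_text, field_name, format_spec, conversion).
        self.parsed_fields = list(Formatter().parse(raw_template))
        # The expression values are captured eagerly; only rendering is deferred.
        self.values = values

    def __str__(self):
        # Default rendering: apply format() to each value with its format_spec.
        parts = []
        for literal, name, spec, _conversion in self.parsed_fields:
            parts.append(literal)
            if name is not None:
                parts.append(format(self.values[name], spec))
        return "".join(parts)


greeting = Interpolation("Hello {name}, you owe {amount:.2f}", name="Ada", amount=3.5)
print(greeting.raw_template)
print(str(greeting))
```

Under this sketch, an f-string would correspond to calling `str()` on such an object straight away, while other renderers could inspect the three attributes instead.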
At this point, I think PEPs 498 and 501 have converged, except for the
delayed string interpolation object (which I realize is important) and
how expressions are identified in the strings (which I consider less
important).
I think the string interpolation object is interesting. It's basically
what Petr Viktorin and Chris Angelico discussed and suggested here:
https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.
Aha, I thought I'd seen that idea go by in one of the threads, but I
didn't remember where :)
I'll add Petr and Chris to the acknowledgements section in 501.
...
My suggestion would be to add both f-strings (PEP 498) and i-strings (as
they're currently called in PEP 501), but with the exact same syntax to
identify and evaluate expressions. I don't particularly care what the
prefixes are. I'd add the plain f-strings first, then i-strings maybe
later. There are definitely some issues with delayed interpolation we
need to think about. An f-string would be shorthand for str(i-string).
+1, as this is the point of view I've come to as well.
...
I think it's hyperbolic to refer to f-strings as a new string formatting
language. With one small difference (detailed in PEP 498, and with zero
usage I could find in the stdlib outside of tests), f-strings are a
strict superset of str.format() strings (but not the arguments to
.format of course). I think f-strings are no more different from
str.format strings than PEP 501 i-strings are to string.Template
strings.
Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that.
...
From what I can tell in the stdlib and in the wild, str.format() has
hundreds or thousands of times more usage than string.Template. I
realize that the reasons are not necessarily related to the syntax of
the replacement strings, but you can't say most people aren't familiar
with str.format().
Right, and I think we can actually make an example driven decision on
that front by looking at potential *target* formats for template
rendering. After all, one of the interesting discoveries we made in
having both str.__mod__ and str.format available is that %-formatting
is a great way to template str.format strings, and vice-versa, since
the meta-characters don't conflict, so you can minimise the escaping
needed.
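
A short illustration of that point, using only the two existing formatting styles. The name `field` and the sample text are placeholders chosen for the example:

```python
field = "user"

# %-formatting builds a str.format template: the braces pass through untouched.
format_template = "Hello {%s}!" % field
rendered_via_format = format_template.format(user="Ada")

# str.format builds a %-template: the percent sign passes through untouched.
percent_template = "Hello %({})s!".format(field)
rendered_via_percent = percent_template % {"user": "Ada"}

print(format_template, "->", rendered_via_format)
print(percent_template, "->", rendered_via_percent)
```

Neither template needed any escaping, because each layer's meta-characters are plain text to the other.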
For use cases like writing object __repr__ methods, I don't think the
choice of $-substitution or {}-substitution matters - neither $ nor {}
are likely to appear in the desired output (except as part of
interpolated values), so escaping shouldn't be common regardless of
which we choose. (Side note: __repr__ and __str__ implementations are
likely worth highlighting as a good use case for the new syntax!)
I think things get more interesting once we start talking about
interpolation targets other than "human readable text".
For example, one of the neat (/scary, depending on how you feel about
this kind of feature) things I realised in working on the latest draft
of PEP 501 is that you could use it to template *Python code*,
including eagerly bound references to objects in the current scope.
That is:
a = b + c
could instead be written as:
a = eval(str(i"$b + $c"))
That's not very interesting if all you do is immediately call eval()
on it, but it's a lot more interesting if you instead want to do
things like extract the AST, dispatch the operation for execution in
another process, etc. For example, you could use this capability to
build eagerly bound closures, which wouldn't see changes in name
bindings, but *would* see state changes in mutable objects.
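
The binding behaviour described above can be approximated today without any new syntax. This is a rough sketch only: `bind_eagerly` is a hypothetical helper, and it passes objects by keyword rather than interpolating them into a template.

```python
def bind_eagerly(expression, **bound_objects):
    """Hypothetical helper: compile an expression and freeze its name bindings."""
    code = compile(expression, "<template>", "eval")
    namespace = dict(bound_objects)  # snapshot of name -> object

    def run():
        # Names are looked up in the snapshot, not in the caller's scope.
        return eval(code, {}, namespace)

    return run


items = [1, 2]
total = bind_eagerly("sum(items)", items=items)

items.append(3)   # mutating the bound object: visible to the closure
first = total()

items = [100]     # rebinding the outer name: not visible to the closure
second = total()

print(first, second)
```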
With $-substitution, that "just works", as $ generally isn't
syntactically significant in Python code - it can only appear inside
strings (and potentially interpolation templates). With
{}-substitution, you'd have to double all the braces for dictionary
displays, dictionary comprehensions and set comprehensions. In example
form:
data = {k:v for k, v in source}
becomes:
data = eval(str(i"{k:v for k, v in $source}"))
rather than:
data = eval(f"{{k:v for k, v in {source}}}")
You hit a similar problem if you're targeting Django or Jinja2
templates, or any content that involves l20n style JavaScript
translation strings: the use of braces for substitution expressions in
the interpolation template conflicts with their use in the target
format.
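
The escaping difference can be reproduced with the existing string.Template and str.format, assuming plain text substitution via repr() rather than the eager object binding PEP 501 proposes. The sample `source` data is made up for the example:

```python
from string import Template

source = [("a", 1), ("b", 2)]

# $-substitution: the comprehension's own braces need no escaping.
dollar_code = Template("{k: v for k, v in $source}").substitute(source=repr(source))

# {}-substitution: every literal brace in the target code must be doubled.
brace_code = "{{k: v for k, v in {source}}}".format(source=repr(source))

data = eval(dollar_code)
print(dollar_code)
print(data)
```

Both templates produce the same Python source text; only the amount of escaping in the template differs.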
So far, the only target rendering environments I've come up with where
$-substitution would create a conflict are shell commands and
JavaScript localisation using Mozilla's l20n syntax, and in both of
those, I'd actually *want* the Python lookup to take precedence over
the target environment lookup (and doubling the prefix to "$$" for
target environment lookup seems quite reasonable when you actually do
want to do the name lookup in the target environment).
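
For the shell case, string.Template already behaves this way, so it can serve as a rough preview. In this sketch the filename is a made-up value, and shlex.quote() is added as an assumption about how a careful renderer would treat interpolated values:

```python
import shlex
from string import Template

filename = "my file.txt"

# $filename is looked up in Python; $$HOME is left for the shell to resolve.
command = Template("ls -l $filename && echo $$HOME").substitute(
    filename=shlex.quote(filename)
)
print(command)
```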
...
...
That description is probably as clear as mud, though, so back to the
PEP I go! :)
Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498!
On a more serious note, I'm thinking of adding i-strings to my f-string
implementation. I have some ideas that the format_spec (the :.3f stuff)
could be used by the code that eventually does the string interpolation.
For example, sql(i-string) might want to interpret this expression using
__sql__, instead of how str(i-string) would use __format__. Then the
sql() machinery could look at the format_spec and pass it to the value's
__sql__ method.
Yeah, that's the key reason PEP 501 is careful to treat them as opaque
strings that it merely transports through to the renderer. The
*default* renderer would expect them to be str.format format
specifiers, but other renderers may either disallow them entirely, or
expect them to do something different.
...
For example:
sql(i'select {date:as_date} from {tablename}')
might call date.__sql__('as_date'), which would know how to cast to the
right datatype (this happens to me all the time).
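
A rough sketch of how such a renderer could work, written against a plain template string because i-strings do not exist yet. Everything here is hypothetical: the `__sql__` protocol, the `sql()` function, and the `SqlDate` and `SqlName` value types are illustrations of the idea, not an existing API.

```python
from string import Formatter


class SqlDate:
    """Hypothetical value type that knows how to render itself for SQL."""

    def __init__(self, column):
        self.column = column

    def __sql__(self, format_spec):
        # The renderer hands over the format_spec without interpreting it.
        if format_spec == "as_date":
            return "CAST({} AS DATE)".format(self.column)
        return self.column


class SqlName:
    """Hypothetical identifier type that quotes itself."""

    def __init__(self, name):
        self.name = name

    def __sql__(self, format_spec):
        return '"{}"'.format(self.name.replace('"', '""'))


def sql(template, **values):
    """Hypothetical renderer: dispatch each field to the value's __sql__ method."""
    parts = []
    for literal, name, spec, _conversion in Formatter().parse(template):
        parts.append(literal)
        if name is not None:
            parts.append(values[name].__sql__(spec))
    return "".join(parts)


query = sql(
    "select {date:as_date} from {tablename}",
    date=SqlDate("created"),
    tablename=SqlName("events"),
)
print(query)
```

The format_spec stays an opaque string until the value's own method decides what it means.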
This is one reason I'm thinking of ditching !s, !r, and !a, at least for
the first implementation of PEP 498: they're not needed, and are not
generally applicable if we add the hooks I'm considering into i-strings.
+1 from me. Given arbitrary expression support, it's both entirely
possible and more explicit to write the builtin calls directly (obj!a,
obj!r, obj!s -> ascii(obj), repr(obj), str(obj))
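
The equivalence can be checked with str.format today, where the conversions already exist. The sample value is arbitrary:

```python
value = "caf\u00e9"

# Each pair holds the conversion-flag form and the explicit builtin call.
pairs = [
    ("{!s}".format(value), "{}".format(str(value))),
    ("{!r}".format(value), "{}".format(repr(value))),
    ("{!a}".format(value), "{}".format(ascii(value))),
]

for converted, explicit in pairs:
    assert converted == explicit
    print(converted)
```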
IIUC, to do this with SQL,
...
sql(i'select {date:as_date} from {tablename}')
needs to be
['select ', unescaped(date, 'as_date'), ' from ', unescaped(tablename)]
so that e.g. sql_92(), sql_2011()
would know that 'select ' is presumably implicitly escaped
* https://en.wikipedia.org/wiki/SQL#Interoperability_and_standardization
* http://docs.sqlalchemy.org/en/rel_1_0/dialects/
* https://docs.djangoproject.com/en/1.7/ref/models/queries/#f-expressions
"Django F-Expressions"
For reference, the SQLAlchemy Expression API solves for
(safer) method-chaining, nesting *Python* expression API;
or you can reuse a raw SQL connection from a ConnectionPool.

Django F-Objects are relevant because they are deferred
(and compiled in context to the query context);
similar to the objectives of a given SQL syntax
templating, parameterization, and serialization
library.

Django Q-Objects are similar,
in that an f-string is basically
an iterator of AND-ed expressions
where AND means string concatenation.

Personally,
I'd pretty much always just reflect the tables
or map them out
and write SQLAlchemy Python expressions
which are then compiled to a particular dialect
(and quoted appropriately, **avoiding CWE-89**,
surviving across table renames,
managing migrations).

Is it sometimes faster to write SQL by hand?

* I'd write the [SQLAlchemy], serialize to SQL, [and modify]
  (because I should have namespaced Python table attrs for those attrs
anyway,
  even if it requires table introspection and reflection at (every/pool)
instantiation)
* you can always execute a query with a raw connection with an ORM
  (and then **refactor (REF) string-ified table and column names**)

Each ORM (and DBAPI) has parametrization settings
(e.g. '%' or '?' or configuration_setting)
which should not collide with the f-string syntax.

* DBAPI v2.0
  https://www.python.org/dev/peps/pep-0249/
* SQLite DBAPI
  https://docs.python.org/2/library/sqlite3.html
  https://docs.python.org/3/library/sqlite3.html
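
As a concrete case, the stdlib sqlite3 driver accepts both `?` and `:name` placeholders, neither of which uses braces. The table and rows below are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("jack",), ("wendy",), ("zed",)])

# qmark style: positional placeholders, values passed as a sequence.
rows_qmark = conn.execute(
    "SELECT name FROM users WHERE name BETWEEN ? AND ? ORDER BY name",
    ("m", "z"),
).fetchall()

# named style: :name placeholders, values passed as a mapping.
rows_named = conn.execute(
    "SELECT name FROM users WHERE name BETWEEN :lo AND :hi ORDER BY name",
    {"lo": "m", "hi": "z"},
).fetchall()

print(sqlite3.paramstyle, rows_qmark, rows_named)
conn.close()
```

The values never become part of the SQL text, so the driver's placeholders and a brace-based template syntax can coexist in the same string.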

http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#conjunctions
...
...
...
>>> s = select([(users.c.fullname +
...               ", " + addresses.c.email_address).
...                label('title')]).\
...        where(users.c.id == addresses.c.user_id).\
...        where(users.c.name.between('m', 'z')).\
...        where(
...               or_(
...                  addresses.c.email_address.like('%@aol.com'),
...                  addresses.c.email_address.like('%@msn.com')
...               )
...        )
>>> conn.execute(s).fetchall()
SELECT users.fullname || ? || addresses.email_address AS title
FROM users, addresses
WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND
(addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
(', ', 'm', 'z', '%@aol.com', '%@msn.com')
[(u'Wendy Williams, wendy@aol.com',)]
http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#using-textual-sql
...
...
...
>>> from sqlalchemy.sql import text
>>> s = text(
...     "SELECT users.fullname || ', ' || addresses.email_address AS title "
...         "FROM users, addresses "
...         "WHERE users.id = addresses.user_id "
...         "AND users.name BETWEEN :x AND :y "
...         "AND (addresses.email_address LIKE :e1 "
...             "OR addresses.email_address LIKE :e2)")
>>> conn.execute(s, x='m', y='z', e1='%@aol.com', e2='%@msn.com').fetchall()
[(u'Wendy Williams, wendy@aol.com',)]
SQLAlchemy is not async-compatible
(besides, most drivers block);
it's debatable whether async would be faster, anyway:
https://bitbucket.org/zzzeek/sqlalchemy/issues/3414/asyncio-and-sqlalchemy
...
...
Regards,
Nick.
--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia