On 08/21/2015 07:49 AM, Nick Coghlan wrote:
On 21 August 2015 at 21:06, Nathaniel Smith <njs@pobox.com> wrote:
On Aug 20, 2015 23:40, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
[...]
    myquery = i"SELECT $column FROM $table;"
    mycommand = i"cat $filename"
    mypage = i"<html><body>$content</body></html>"
It's the opposite of the "interpolating untrusted strings that may contain arbitrary expressions" problem - what happens when the variables being *substituted* are untrusted? It's easy to say "don't do that", but if doing the right thing incurs all the repetition currently involved in calling str.format, we're going to see a *lot* of people doing the wrong thing. At that point, the JavaScript backticks-with-arbitrary-named-callable solution starts looking very attractive:
    myquery = sql`SELECT $column FROM $table;`
    mycommand = sh`cat $filename`
    mypage = html`<html><body>$content</body></html>`
Surely if using backticks we would drop the ugly prefix syntax and just make it a function call?
Not really, no, as `obj` already means repr(obj) in Python 2, and we can't silently make it do something else in Python 3 (although we can break it noisily and thus strongly encourage folks to switch to using the builtin instead).
The attractiveness of "little bobby tables" [1] vulnerabilities with an interpolation syntax that *doesn't* support custom interpolation engines has switched me from being mildly interested in the idea of good support for SQL, shell command and HTML generation to considering it a necessary capability, though.
The various string interpolation proposals are conflating two things:

1: extracting the expressions from the source string, and evaluating them in the correct context, and
2: taking the source string and the evaluated values, and building the resulting string.

The problem is that in #1, the compiler has to be in on what's going on. That's because this problem can't be solved with normal function calls. So if normal function calls can't do it, what choices do we have? Either syntax, or special function names known to the compiler. I think syntax is clearly the right choice here. The only syntax changes that anyone has come up with so far are string prefixes, maybe suffixes, and back-ticks (ick). Of those, prefixes make the most sense. I'm interested in other suggestions, though. (Since I wrote this, I see Barry's import-based approach, but it's similar: instructions to the compiler.)

Yuri's proposal was to implement #1 by having _any_ string prefix trigger the compiler to get involved to extract the source string and to compute the values. Then for #2, he invoked normal function calls, derived from the string prefix. He also loosened the restriction that strings would be the result: because any function could be invoked with the source string and the values, that function could return anything.

If you really want string interpolation to be extensible to domains such as SQL and HTML, then I think an approach like Yuri's is the only way to do it: some syntax to tell the compiler to treat a string differently, coupled with some user-specifiable function that gets called to do the real work, and no need for the result to be a string.

Eric.
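
A rough sketch of the split between #1 and #2 described above, under stated assumptions. None of this is syntax or API from any of the proposals: `extract`, `interpolate`, `sql` and `plain` are hypothetical names, and step #1 is faked at runtime with a regular expression and an explicit namespace dict, which is exactly the part a real implementation would need the compiler for. Only simple `$name` fields are handled, not arbitrary expressions.

```python
import re

# Matches simple $name fields only (an assumption for this sketch).
_FIELD = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def extract(template, namespace):
    """Step #1 stand-in: split the template into literal pieces and
    look up each $name in the namespace supplied by the caller."""
    pieces = _FIELD.split(template)
    literals = pieces[0::2]   # text between the fields
    names = pieces[1::2]      # the captured field names
    values = [namespace[name] for name in names]
    return literals, values


def plain(literals, values):
    """Step #2, simplest handler: build an ordinary string."""
    parts = [literals[0]]
    for value, literal in zip(values, literals[1:]):
        parts.append(str(value))
        parts.append(literal)
    return "".join(parts)


def sql(literals, values):
    """Step #2, domain-specific handler: return a (query, params) pair
    with '?' placeholders instead of a finished string."""
    return "?".join(literals), tuple(values)


def interpolate(handler, template, namespace):
    """Glue: run step #1, then hand the pieces to the chosen handler."""
    literals, values = extract(template, namespace)
    return handler(literals, values)


if __name__ == "__main__":
    name = "Robert'); DROP TABLE students;--"
    print(interpolate(plain, "cat $filename", {"filename": "notes.txt"}))
    print(interpolate(sql, "SELECT * FROM students WHERE name = $name;",
                      {"name": name}))
```

The `sql` handler never pastes the untrusted value into the query text; it hands back something a DB-API `execute(query, params)` call could consume. That is the sense in which the result need not be a string.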