[Python-ideas] Draft PEP on string interpolation
Mike Miller
python-ideas at mgmiller.net
Tue Aug 25 21:06:55 CEST 2015
TL;DR: (Version 2, hopefully more clear)
Let's discuss whether to make "doing the right thing as easy as doing the wrong
thing" a desired goal for string interpolation.
Details -- we could:
1) Automatically escape potentially dangerous input variables to sensitive
functions, or
2) Make developers do it the hard way, leaving them solely responsible
for safety every single time (knowing that they often don't do it).
3) Some combination of the two.
A trivial implementation of 1) is below. Instead of rendering the string
immediately, it is deferred until use, with template and parameters stashed
inside an object, allowing the receiver to specify escaping/quoting rules.
---------------------------------
Let's call these e-strings (for expression), as it's easier to refer to the
letter of the proposals than three digit numbers.
So, an e-string looks like an f-string, though at compile-time, it is converted
to an object instead (like an i-string):
    print(e'Hello {friend}, filename: {filename}.')  # converts to ==>
    print(estr('Hello {friend}, filename: {filename}.', friend=friend,
               filename=filename))
An estr is a subclass of str, and is therefore able to do the nice things a
string can do. Rendering is deferred until the variable is used. It also has a
.raw member, plus escape() and translate() methods:
    class estr(str):
        # init: saves self.raw, args, kwargs for later
        # methods, ops render it
        # def escape(self, escape_func):  # handles escaping
        # def translate(self, template, safe=True):  # optional i18n support
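To make the outline above concrete, here is a minimal sketch of what such a class might look like. It is not the linked sample script. It renders eagerly in __new__ rather than deferring, takes keyword arguments only, and leaves out translate(); the attribute name kwargs and the escaping strategy are assumptions made for illustration.

```python
import html
import shlex


class estr(str):
    """Hypothetical sketch of an e-string object (not the real sample script)."""

    def __new__(cls, raw, **kwargs):
        # Simplification: render up front so every str method and operator
        # works unchanged. The proposal defers rendering until use.
        self = super().__new__(cls, raw.format(**kwargs))
        self.raw = raw          # the original, unrendered template
        self.kwargs = kwargs    # the values to interpolate
        return self

    def escape(self, escape_func):
        # Apply the receiver's escaping rule to each value, then render.
        escaped = {k: escape_func(str(v)) for k, v in self.kwargs.items()}
        return self.raw.format(**escaped)


friend = 'John'
filename = "somefile; rm -rf ~ 'foo' <html>"
greeting = estr('Hello {friend}, filename: {filename}.',
                friend=friend, filename=filename)

print(greeting)                      # plain rendering
print(greeting.raw)                  # original template
print(greeting.escape(shlex.quote))  # shell quoting rules
print(greeting.escape(html.escape))  # HTML escaping rules
```

Because the template and the values are kept apart, the receiver picks the escaping rule, and only the interpolated values are escaped, never the literal text of the template.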
To make it as simple as possible to use by end-developers, it:
1) Doesn't require str() to be run explicitly; it renders itself when
needed via its various methods and operators.
Use .raw if you need the original template.
2) Pushes a bit of responsibility to stdlib/pypi. In a handful of
sensitive places, the object is checked beforehand and escaped when
needed:
    # imagine html, db, subprocess input etc.
    def sensitive_func_that_escapes(input):
        if isinstance(input, estr):
            input = input.escape(shlex.quote)  # each chooses its own rules
        do_something(input)
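A runnable version of that callee-side check might look like the sketch below. The estr class here is a compact stand-in written only for this example, and build_shell_command is a hypothetical name. It returns the string it would hand to a shell instead of executing anything.

```python
import shlex


class estr(str):
    """Compact, hypothetical stand-in for the proposed e-string object."""

    def __new__(cls, raw, **kwargs):
        self = super().__new__(cls, raw.format(**kwargs))
        self.raw, self.kwargs = raw, kwargs
        return self

    def escape(self, escape_func):
        escaped = {k: escape_func(str(v)) for k, v in self.kwargs.items()}
        return self.raw.format(**escaped)


def build_shell_command(command):
    """Hypothetical sensitive callee: escapes e-strings with shell rules."""
    if isinstance(command, estr):
        command = command.escape(shlex.quote)  # each callee picks its rules
    return command  # a real callee would pass this on to the shell


name = "notes.txt; rm -rf ~"
safe = build_shell_command(estr('cat {name}', name=name))
print(safe)  # cat 'notes.txt; rm -rf ~'
```

A plain str passed to the same function comes back untouched, so existing callers keep their current behaviour and only e-string callers get the automatic quoting.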
This means the many callers using e-strings won't have to do explicit escaping;
only a handful of callee libraries will, which is already common with database
APIs, for example. What is easiest to type is now safe as well:
    sensitive_func_that_escapes(e'user input: {input}')  # sleep easy
This could enable the safety and features we'd like, without burdening the
everyday user. I've created a sample script to demonstrate at:
https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_example.py
Here is the output:
# consider: e'Hello {friend}, filename: {filename}.'
friend: 'John'
filename: "somefile; rm -rf ~ 'foo' <html>"
original: Hello {friend}, filename: {filename}.
w/ print(): Hello John, filename: somefile; rm -rf ~ 'foo' <html>.
shell escape:
Hello John, filename: 'somefile; rm -rf ~ '"'"'foo'"'"' <html>'.
html escape:
Hello John, filename: somefile; rm -rf ~ &#x27;foo&#x27; &lt;html&gt;.
sql escape: Hello "John", filename: "somefile; rm -rf ~ 'foo' <html>".
logger DEBUG Hello John, filename: somefile; rm -rf ~ 'foo' <html>.
upper+encode: b"HELLO JOHN, FILENAME: SOMEFILE; RM -RF ~ 'FOO' <HTML>."
translated?: Hola John, archivo: somefile; rm -rf ~ 'foo' <html>.
Is this automatic escaping desired? Or should we continue to make the
end-developer fully responsible for escaping input?
-Mike