Yeah, huge +1 on this. A previous workplace of mine used C# and while I
always sorely missed the ORMs available in Python (nothing in the C#
ecosystem even *remotely* compares to SQLAlchemy, not even LINQ to SQL),
their FormattableString class always made me vaguely jealous when working
with databases.
Everyone is focusing on the use-cases for preventing injection attacks
(database queries, shell commands, etc.), which, fair enough, is a
*very* strong argument in itself. But there are many other contexts in
which i-strings would provide a better way of doing something than we
currently have.
I've always hated writing anything that's intended to be code as a string,
and one of the most common examples of this currently is how we specify log
formats. From the Python documentation:
FORMAT = '%(asctime)-15s %(clientip)s %(user)-8s %(message)s'
Which could become something like:
from logging import record_template as rt
FORMAT = i'{rt.asctime:15} {rt.clientip} {rt.user:8} {rt.message}'
And now linters and autocompletion engines would play nice with this
(interpreting the content of the braces as code just like they do for
f-strings) and warn you if your format specifier was invalid, or if you had
a typo in one of your log record attributes, etc. Plus, you wouldn't
actually need to context-switch to the Python documentation in your browser
to look up what exactly those log attributes are again. You'd have them
right there in your IDE as autocompletions. This makes me happy :)
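
For contrast, here is a small sketch of how the existing %-style format behaves today. The logger name and the `extra` values are just illustrative, and the exact exception type depends on the Python version (as far as I know it is ValueError on 3.8+ and KeyError before that). The point is that a misspelled attribute only surfaces when a record is actually formatted:

```python
import io
import logging

FORMAT = '%(asctime)-15s %(clientip)s %(user)-8s %(message)s'

# Log into an in-memory buffer so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(FORMAT))

logger = logging.getLogger('tcpserver')  # illustrative logger name
logger.addHandler(handler)
logger.propagate = False

# clientip and user are not standard record attributes; they come from `extra`.
logger.warning('Protocol problem: %s', 'connection reset',
               extra={'clientip': '192.168.0.1', 'user': 'fbloggs'})
output = stream.getvalue()

# A misspelled attribute name is accepted when the Formatter is built...
bad_formatter = logging.Formatter('%(mesage)s')
record = logging.LogRecord('demo', logging.INFO, 'demo.py', 0, 'hi', None, None)

# ...and only fails once a record is formatted.
typo_error = None
try:
    bad_formatter.format(record)
except (KeyError, ValueError) as exc:
    typo_error = exc
```

Nothing in the format string is visible to a linter as code, which is exactly the gap an i-string version would close.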
I'm sure there are many more such templating use-cases out in the wild that
could benefit in a small way from i-strings.
On Sat, May 8, 2021 at 9:48 AM Steven D'Aprano wrote:
On Fri, May 07, 2021 at 01:40:19PM -0600, Nick Humrich wrote:
PEP 501 already mentions how templates (i-strings?) can solve injection.
I don't think it does. The PEP only mentions the substring "inject" five times:
- once in the table of contents;
- twice to describe how f-strings may be vulnerable to code injection attacks;
- once as a section header "Handling code injection attacks"
- and once in that section to describe how third-party libraries can provide "case specific renderers that take care of quoting interpolated values appropriately for the relevant security context".
It seems to me that PEP 501 itself doesn't provide any more protection against code injection attacks than existing solutions do.
The PEP gives this simple example:
os.system(f"echo {message_from_user}")
If `message_from_user` has the value:
message_from_user = 'pwned; rm /'
then you're going to have a bad time. Your i-strings are no safer:
os.system(i"echo {message_from_user}")
gives no more protection from untrusted input than the f-string version does. Merely delaying execution alone doesn't help.
A naive implementation of `os.system` will just run `format()` on the interpolation template. Without an appropriate renderer, there's no security gain, and PEP 501 explicitly states that it is up to third parties to provide renderers (at least initially).
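
To make the difference concrete, here is a minimal sketch. The `Template` class below is a hypothetical stand-in, not the API proposed in PEP 501; it only keeps the literal text and the interpolated values apart so that a renderer can decide how to convert each value. Nothing is executed; the code only builds the command strings:

```python
import shlex
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical stand-in for an interpolation template (not PEP 501's API)."""
    literals: tuple  # text around the fields: len(values) + 1 items
    values: tuple    # the interpolated values, still unformatted

    def render(self, convert=str):
        # Interleave the literal text with the converted values.
        out = [self.literals[0]]
        for value, literal in zip(self.values, self.literals[1:]):
            out.append(convert(value))
            out.append(literal)
        return ''.join(out)


message_from_user = 'pwned; rm /'

# Roughly what i"echo {message_from_user}" might capture.
template = Template(('echo ', ''), (message_from_user,))

# Naive rendering is the same string the f-string would have produced.
naive = template.render()

# A shell-aware renderer quotes every interpolated value first.
quoted = template.render(shlex.quote)
```

Here `naive` is identical to the f-string result, while `quoted` keeps the whole message as a single shell word. The protection comes entirely from the renderer, not from the template.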
A serious danger is that people will naively, and wrongly, think that they should format the i-string themselves:
os.system(format(i"echo {message_from_user}"))
and thus defeat any renderers which os.system may provide. And that in turn will surely lead people to optimise the code to:
os.system(f"echo {message_from_user}")
I believe that if you are interested in preventing code injection attacks, it would be much better to introduce tainted and untainted strings: all non-literal strings are assumed tainted until explicitly escaped and flagged as untainted, after which they are considered safe to use.
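
A rough sketch of that idea, under loud assumptions: Python cannot tell a literal from a computed string at runtime, so this version uses a hypothetical `Untainted` marker type, and even literals have to be wrapped explicitly. `escape_for_shell` and `build_command` are likewise made-up names for illustration:

```python
import shlex


class Untainted(str):
    """Hypothetical marker: a string explicitly flagged as safe to use."""


def escape_for_shell(value):
    # Escaping is the only way an arbitrary value becomes untainted.
    return Untainted(shlex.quote(value))


def build_command(*pieces):
    # The sink refuses anything that has not been flagged as untainted.
    for piece in pieces:
        if not isinstance(piece, Untainted):
            raise TypeError(f'tainted string rejected: {piece!r}')
    return ' '.join(pieces)


message_from_user = 'pwned; rm /'

command = build_command(Untainted('echo'), escape_for_shell(message_from_user))

# Passing the raw value is an error instead of a silent injection.
rejected = False
try:
    build_command(Untainted('echo'), message_from_user)
except TypeError:
    rejected = True
```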
Either that or use the approach favoured by the stdlib: pass a string template and a tuple of values which are then quoted before being interpolated into the template.
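
The `sqlite3` module is one stdlib instance of this pattern, shown here as a small sketch with a made-up table and input. Strictly speaking, `sqlite3` binds the values as parameters rather than quoting them into the SQL text, but the calling convention is the same: template first, values separately.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT)')
conn.executemany('INSERT INTO users VALUES (?)', [('alice',), ('bob',)])

untrusted = "alice' OR '1'='1"

# Interpolating the value into the SQL text lets it rewrite the query.
unsafe_sql = f"SELECT name FROM users WHERE name = '{untrusted}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Passing the template and the values separately treats the input as data.
safe_rows = conn.execute(
    'SELECT name FROM users WHERE name = ?', (untrusted,)
).fetchall()

conn.close()
```

The interpolated query matches every row, while the parameterized one matches none, because no user has that literal name.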
The bottom line here is that I think you are exaggerating the benefits of PEP 501 i-strings to "completely remove injection attacks".
Even if i-strings did everything you want, to *completely* remove injection attacks would require *all* such functions:
* eval, exec
* os.system
* sql
etc break backwards compatibility by no longer supporting string inputs at all, only interpolation template objects. So long as we *can* pass a plain old string to such functions, somebody *will* pass strings, and some of those will be tainted.
-- Steve
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/DM3SUF...
Code of Conduct: http://python.org/psf/codeofconduct/