[Python-ideas] String interpolation for all literal strings
Wes Turner
wes.turner at gmail.com
Sat Aug 8 00:33:01 CEST 2015
On Fri, Aug 7, 2015 at 5:24 PM, Nikolaus Rath <Nikolaus at rath.org> wrote:
> On Aug 07 2015, Barry Warsaw <
> barry-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org> wrote:
> > * Literals only
> >
> > I've described elsewhere that accepting non-literals is useful in some
> > cases.
>
> Are you saying you don't want f-strings, but you want something that
> looks like a function (but is actually a special form because it has
> access to the local context)? E.g. f(other_fn()) would perform literal
> interpolation on the result of other_fn()?
>
> I think that would be a very bad idea. It introduces something that
> looks like a function but isn't and it opens the door to a new class of
> injection vulnerabilities (every time you return a string it could
> potentially be used for interpolation at some point).
>
glocals(), format_from(), lookup() (e.g. salt map.jinja stack of dicts)
Contexts:
* [Python-ideas] String interpolation for all literal strings
* 'this should not be a {cmd}'.format(cmd=cmd)
* 'this should not be a {cmd}'.format(
**{**globals(), **locals(), 'cmd': cmd})
* 'this should not be a \{cmd}'
* f'this should not be a \{cmd}'
* [Python-ideas] Briefer string format
* [Python-ideas] Make non-meaningful backslashes illegal in string literals
* u'C:\users' breaks because \u is an escape sequence
* How does this interact with string interpolation?
(e.g. **when**, in the functional composition
from string to string (with parameters),
do these escape sequences get eval'd?)
* See: MarkupSafe (Jinja2)
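On the "when" question, a minimal sketch using only ``str.format`` as it exists today (not the proposed syntax): escape sequences in a literal are resolved when the literal is compiled, so by the time ``.format()`` runs they are already ordinary characters, and backslashes inside the *parameters* are never re-interpreted. The variable names are illustrative only.

```python
# Escape sequences are resolved when the literal is compiled,
# i.e. before any interpolation happens.
template = 'col1\t{value}'            # \t is already a real tab here
rendered = template.format(value=r'\t')

# The tab from the template survives; the backslash-t passed in as a
# parameter stays two literal characters and is not re-interpreted.
print(repr(rendered))

# A raw string keeps the backslash, so a Windows-style path is safe.
path = r'C:\users'
print(len(path))

# With str.format, literal braces are written by doubling them.
literal_braces = 'this should not be a {{cmd}}'.format(cmd='ls')
print(literal_braces)
```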
Justification:
* "how are the resources shared relevant to these discussions?"
* TL;DR
* string interpolation is often dangerous
(SQL Injection and OS Command Injection are #1 and #2
according to the CWE/SANS 2011 Top 25)
* string interpolation is already hard to review
(because there are many ways to do it)
* it's a functional composition of an AST?
* Shared a number of seemingly tangential links
(in python-ideas) in regards to
proposals to add an additional string interpolation syntax
with implicit local then global context / scope
tentatively called 'f-strings'.
* Bikeshedded on the \{syntax} ({{because}} {these} \{are\} more
readable)
* Bikeshedded on the name 'f-string',
because of visual disambiguability
from 'r-string' (for e.g. raw strings (and e.g. ``re``))
* Is there an AST scanner to find these?
* Because a grep expression for ``f"`` or ``f'`` is not that
helpful.
* Especially as compared to ``grep ".format("``
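A rough sketch of what such an AST scanner could look like, assuming the f-string syntax as it later shipped in Python 3.6, where the ``ast`` module represents an f-string as a ``JoinedStr`` node. ``find_interpolations`` is a hypothetical helper name, and the ``.format`` check is a heuristic: it matches any method named ``format``, not only ``str.format``.

```python
import ast


def find_interpolations(source):
    """Return sorted (lineno, kind) pairs for string interpolations in source."""
    tree = ast.parse(source)
    # A format spec such as the ">10" in f'{x:>10}' is itself a JoinedStr
    # node; remember those so they are not reported as separate f-strings.
    spec_ids = {
        id(node.format_spec)
        for node in ast.walk(tree)
        if isinstance(node, ast.FormattedValue) and node.format_spec is not None
    }
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr) and id(node) not in spec_ids:
            found.append((node.lineno, 'f-string'))
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Attribute)
              and node.func.attr == 'format'):
            # Heuristic: any call to a method named "format".
            found.append((node.lineno, 'str.format'))
        elif (isinstance(node, ast.BinOp)
              and isinstance(node.op, ast.Mod)
              and isinstance(node.left, ast.Constant)
              and isinstance(node.left.value, str)):
            # Only catches %-formatting whose left operand is a literal.
            found.append((node.lineno, '%-format'))
    return sorted(found)


SAMPLE = '''name = "x"
a = f"hello {name}"
b = "hello {}".format(name)
c = "hello %s" % name
d = "plain"
'''

for lineno, kind in find_interpolations(SAMPLE):
    print(lineno, kind)
```

Unlike a grep for ``f"``, this does not trip over identifiers or comments that happen to contain the same characters.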
Use Cases:
----------
As a developer, I want to:
* grep for string interpolations
* include parameters in strings (and escape them appropriately)
* The safer thing to do: parameters
should *usually* (often) be tokenized
and e.g. quoted and serialized out
* OS Commands, HTML DOM, SQL parse tree, SPARQL parse tree,
CSV, TSV,
(*injection* vectors with user supplied input
and non-binary string-based data representation formats)
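To make "tokenized, quoted and serialized out" concrete, here is a small standard-library sketch for three of the formats listed above. The hostile ``user_input`` value and the table name ``t`` are made up for illustration; each target format gets its own quoting step instead of plain string interpolation.

```python
import csv
import io
import shlex
import sqlite3

user_input = "x'; DROP TABLE t; --"

# OS command: quote the parameter for a POSIX shell before it is
# interpolated into the command string.
shell_cmd = 'echo {}'.format(shlex.quote(user_input))

# SQL: pass the value as a bound parameter; the driver keeps it
# separate from the statement text.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (name TEXT)')
conn.execute('INSERT INTO t (name) VALUES (?)', (user_input,))
rows = conn.execute('SELECT name FROM t').fetchall()

# CSV: let the writer handle delimiters and quote characters.
buf = io.StringIO()
csv.writer(buf).writerow(['a,b', user_input])
csv_row = next(csv.reader(io.StringIO(buf.getvalue())))

print(shell_cmd)
print(rows)
print(csv_row)
```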
* "Explicit is better than implicit" -- Zen of Python
* Where are the values of these variables set?
With *non* f-strings (str.format, str.__mod__)
the context is explicit;
and I regard that as a feature of Python.
* If what is needed is a shorthand, it could be an explicit helper such as
* ``glocals(**kwargs) / gl()``
* ``lookup_from({}, locals(), globals())``,
* ``.formatlookup(`` or ``.formatl(``
rather than a backwards-incompatible shortcut
which is going to require additional review
(as I am reviewing things that are commands or queries).
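One possible shape for such an explicit helper, sketched with ``collections.ChainMap`` and ``str.format_map``. ``lookup_from`` is only the hypothetical name floated above, not an existing API; the point is that the lookup order (overrides, then locals, then globals) is spelled out at the call site and is easy to grep for.

```python
from collections import ChainMap


def lookup_from(*mappings):
    """Hypothetical helper: search the mappings left to right."""
    return ChainMap(*mappings)


GREETING = 'hello'


def render(name):
    # Explicit context: overrides first, then locals, then globals.
    context = lookup_from({'punct': '!'}, locals(), globals())
    return '{GREETING}, {name}{punct}'.format_map(context)


print(render('world'))
```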
* These are usually trees of tokens which are serialized
for a particular context;
and they are difficult because
we often don't think of them
in the same terms as say the Python AST;
because we think we can just use string concatenation here,
when there should/could be typed objects
with serialization methods, e.g.:
* __str__
* __str_shell__
* __str_sql__(_, with_keywords=SQLVARIANT_KEYWORDS)
With this form, the proposed f-string method would be:
* __interpolate__
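A sketch of what a typed object with a context-specific serialization method might look like. ``__str_shell__`` is the hypothetical protocol name from the list above (no such dunder exists in Python), and ``ShellArg`` and ``render_shell`` are illustrative names. The renderer asks each parameter how to serialize itself for a shell and falls back to quoting its ``str()``.

```python
import shlex


class ShellArg:
    """A value that knows how to serialize itself for a POSIX shell."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value

    def __str_shell__(self):
        # Hypothetical protocol method; not part of Python itself.
        return shlex.quote(self.value)


def render_shell(template, **params):
    """Fill template with shell-quoted parameters."""
    quoted = {
        key: (value.__str_shell__() if hasattr(value, '__str_shell__')
              else shlex.quote(str(value)))
        for key, value in params.items()
    }
    return template.format(**quoted)


cmd = render_shell('ls {path}', path=ShellArg('my dir; rm -rf /'))
print(cmd)
```

The same pattern would extend to a ``__str_sql__``-style method, although for SQL a driver's bound parameters are usually the better tool.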
* [ ] Manual review
* Which variables/expressions are defined or referenced here,
syntax checker?
* There are 3 other string interpolation syntaxes.
* ``glocals(**kwargs) / gl()``
* **AND THEN**, "so I can just string-concatenate these now?"
* Again, MarkupSafe __attr
* Types and serialization over concatenation
>
> Best,
> -Nikolaus
>
> --
> GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
> Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
> »Time flies like an arrow, fruit flies like a Banana.«
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>