[Python-Dev] PEP-498: Literal String Formatting

Mon Aug 10 23:51:41 CEST 2015

On Aug 10, 2015 11:33 AM, "Barry Warsaw" <barry at python.org> wrote:
>
> On Aug 11, 2015, at 03:26 AM, Steven D'Aprano wrote:
>
> >I think I would be happy with f-strings, or perhaps i-strings if we use
> >Nick's ideas about internationalisation, and limit what they can
> >evaluate to name lookups, attribute lookups, and indexing, just like
> >format().
>
> I still think you really only need name lookups, especially for an i18n
> context.  Anything else is just overkill, YAGNI, potentially error prone,
> or perhaps even harmful.
>
> Remember that the translated strings usually come from only moderately
> (if at all) trusted and verified sources, so it's entirely possible that
> a malicious translator could sneak in an exploit, especially if you're
> evaluating arbitrary expressions.  If you're only doing name
> substitutions, then the worst that can happen is an information leak,
> which is bad, but won't compromise the integrity of say a server using
> the translation.
>
> Even if the source strings avoid the use of expressions, if the feature is
> available, a translator could still sneak something in.  That pretty much
> makes it a non-starter for i18n, IMHO.
>
> Besides, any expression you have to calculate can go in a local that will
> get interpolated.  The same goes for any !r or other formatting
> modifiers.  In an i18n context, you want to stick to the simplest
> possible substitution placeholders.

IIUC what Nick contemplates in PEP 501 is that when you write something like
  i"I am ${self.age}"
then the python runtime would itself evaluate self.age and pass it on to
the i18n machinery to do the actual substitution; the i18n machinery
wouldn't even contain any calls to eval. The above string could be
translated as
  i"Tengo ${self.age} años"
but
  i"Tengo ${self.password} años"
would be an error, because the runtime did not provide a value for
self.password. So while arbitrarily complex expressions are allowed (at
least as far as the language is concerned -- a given project or i18n
toolkit could impose additional policy restrictions), by the time the
interpolation machinery runs they'll effectively have been reduced to local
variables with funny multi-token names.
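
To make that concrete, here is a rough sketch in today's Python. The
names capture() and render() and the placeholder regex are made up for
illustration; they are not taken from PEP 501 or flufl.i18n. capture()
stands in for what the runtime would do with the source string, and
render() stands in for the i18n machinery, which only does dictionary
lookups.

```python
import re

# Hypothetical placeholder syntax for this sketch: ${expression}
_FIELD = re.compile(r"\$\{([^}]*)\}")


def capture(source, namespace):
    """Stand-in for the runtime: evaluate each field of the *source* string.

    eval() only ever sees text written by the programmer, never text
    supplied by a translator.
    """
    return {expr: eval(expr, {}, namespace) for expr in _FIELD.findall(source)}


def render(translated, values):
    """Stand-in for the i18n machinery: plain lookups, no eval anywhere."""
    def substitute(match):
        expr = match.group(1)
        if expr not in values:
            raise KeyError(
                f"translation uses a field the source did not provide: {expr!r}"
            )
        return str(values[expr])

    return _FIELD.sub(substitute, translated)


class Person:
    def __init__(self, age, password):
        self.age = age
        self.password = password


me = Person(30, "hunter2")

# The source string determines which values exist at all.
values = capture("I am ${self.age}", {"self": me})

# A translation that reuses the same field works.
ok = render("Tengo ${self.age} años", values)
```

Calling render("Tengo ${self.password} años", values) raises KeyError
here, because "self.password" was never a key in values; the field text
is treated as an opaque name, not as something to evaluate.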

This pretty much eliminates all the information leak and exploit concerns,
AFAICT. From your comments about having to be careful about attribute
chasing, it sounds like it might even be more robust than current
flufl.i18n in this regard...?

-n