[Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)

Thu Aug 13 13:58:17 CEST 2015

On 08/13/2015 12:37 AM, Guido van Rossum wrote:
> On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw <barry at python.org> wrote:
> 
>         placeholders = source_string.extract_placeholders()
>         substitutions = scope(*placeholders)
>         translated_string = i18n.lookup(source_string)
>         return translated_string.safe_substitute(substitutions)
> 
>     That would actually be quite useful.
> 
> 
> Agreed. But whereas you are quite happy having only simple variable
> names in i18n templates, the feature required for the non-i18n use case
> really needs arbitrary expressions. If we marry the two, your i18n code
> will just have to yell at the programmer if they use something too
> complex for the translators as a substitution. So possibly PEP 501 can
> be rescued. But I think we need separate prefixes for the PEP 498 and
> PEP 501 use cases; perhaps f'{...}' and _'{...}'. (But it would not be
> up to the compiler to limit the substitution syntax in _'{...}')

For the sake of the following argument, let's set aside the open
questions and assume:
- arbitrary expressions: we'll say yes
- string prefix character: we'll say 'f'
- how to identify expressions in a string: we'll say {...}

I promise we can bikeshed about these later. I'm just using the PEP 498
version because I'm more familiar with it.

And let's say that PEP 498 will take this:

name = 'Eric'
dog_name = 'Fluffy'
f"My name is {name}, my dog's name is {dog_name}"

And convert it to this (inspired by Victor):

"My name is {0}, my dog's name is {1}".format('Eric', 'Fluffy')

Resulting in:

"My name is Eric, my dog's name is Fluffy"

It seems to me that all you need for i18n is to instead make it produce:

__i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy')

The __i18n__ function would do whatever lookup is needed to produce the
translated string. So, in some English dialect where pet names had to
come first, it could return:
'The owner of the dog {1} is named {0}'

So the result would be:
'The owner of the dog Fluffy is named Eric'

I promise we can bikeshed about the name __i18n__.

So the translator has no say in how the expressions are evaluated. This
removes any concern about information leakage. If the source code said:
f"My name is {name}, my dog's name is {dog_name.upper()}"

then the string passed to __i18n__ would still be
"My name is {0}, my dog's name is {1}"; only the second argument to
format() would change, to dog_name.upper(). If by convention you wanted
to not use arbitrary expressions and just use identifiers, then just
make it a coding standard thing. It doesn't affect the implementation
one way or the other.
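
A small sketch of that point, with the expansion written out by hand.
The recording __i18n__ below is only a stand-in I made up so we can see
what the translation layer is actually handed:

```python
seen = []  # every template the (stand-in) translation layer receives


def __i18n__(template):
    # Stand-in lookup: record the template and return it untranslated.
    seen.append(template)
    return template


name = 'Eric'
dog_name = 'Fluffy'

# Hand-written expansion of:
#   f"My name is {name}, my dog's name is {dog_name.upper()}"
# The expression dog_name.upper() is evaluated by the calling code;
# it never appears in the string given to __i18n__.
result = __i18n__("My name is {0}, my dog's name is {1}").format(
    name, dog_name.upper())

print(seen)    # ["My name is {0}, my dog's name is {1}"]
print(result)  # My name is Eric, my dog's name is FLUFFY
```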

The default implementation for my proposed __i18n__ function (probably a
builtin) would be just to return its string argument. Then you get the
PEP 498 behavior. But in your module, you could say:
__i18n__ = gettext.gettext
and now you'd be using that machinery.
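
Roughly like this. The name default_i18n is made up for illustration,
and the sketch assumes no message catalog is installed, in which case
gettext.gettext simply returns the string it was given:

```python
import gettext


def default_i18n(template):
    # Proposed default behaviour: return the template unchanged, so the
    # f-string behaves exactly as in PEP 498.
    return template


TEMPLATE = "My name is {0}, my dog's name is {1}"

# Default: no translation at all.
__i18n__ = default_i18n
plain = __i18n__(TEMPLATE).format('Eric', 'Fluffy')

# Per-module opt-in to the gettext machinery.
__i18n__ = gettext.gettext
looked_up = __i18n__(TEMPLATE).format('Eric', 'Fluffy')

print(plain)
print(looked_up)
```

With a real catalog bound via gettext.bindtextdomain() and
gettext.textdomain(), the second call would pick up the translation
instead.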

The one downside of this is that the strings that the translator is
translating from do not appear in the source code. The translator would
have to know that the string being translated is:
"My name is {0}, my dog's name is {1}"

But since this only operates on f-string literals, you could
mechanically extract them from the source. For example, given the
example f-string above, my current PEP 498 implementation returns this:

Module(body=[Expr(value=FormattedStr(value=Call(
    func=Attribute(
        value=Str(s="My name is {0}, my dog's name is {1}"),
        attr='format', ctx=Load()),
    args=[Name(id='name', ctx=Load()), Name(id='dog_name', ctx=Load())],
    keywords=[])))])

So the translatable string can easily be extracted from the ast. I could
modify the FormattedStr node to make that string easier to find.
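
As a rough sketch of such an extractor: the FormattedStr node above
exists only in my prototype, so the code below instead assumes the
ast.JoinedStr and ast.FormattedValue nodes found in released Pythons
(3.8 or later for the ast.Constant check) and rebuilds the positional
template from them. The helper names are made up for illustration.

```python
import ast


def positional_template(node):
    # Rebuild "text {0} text {1}" from the pieces of one f-string.
    parts = []
    index = 0
    for value in node.values:
        if isinstance(value, ast.Constant):
            # Literal text: re-escape braces so str.format() leaves them alone.
            parts.append(value.value.replace("{", "{{").replace("}", "}}"))
        else:
            # ast.FormattedValue: replace the expression with its position.
            parts.append("{%d}" % index)
            index += 1
    return "".join(parts)


class TemplateCollector(ast.NodeVisitor):
    def __init__(self):
        self.templates = []

    def visit_JoinedStr(self, node):
        self.templates.append(positional_template(node))
        # Deliberately no generic_visit(): nested format specs and nested
        # f-strings are ignored in this sketch.


def extract_templates(source):
    collector = TemplateCollector()
    collector.visit(ast.parse(source))
    return collector.templates


SOURCE = '''
name = 'Eric'
dog_name = 'Fluffy'
f"My name is {name}, my dog's name is {dog_name.upper()}"
'''

print(extract_templates(SOURCE))
```

Format specs and conversions such as !r are dropped here; a real
extractor would have to decide how to carry those into the template.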

Eric.