[Python-Dev] Re: PEP 292, Simpler String Substitutions

François Pinard pinard@iro.umontreal.ca
22 Jun 2002 09:52:07 -0400

[Oren Tirosh]

> On Thu, Jun 20, 2002 at 03:48:52PM -0700, Ka-Ping Yee wrote:
> > Using compile-time parsing, as in PEP 215, has the advantage that it
> > avoids any possible security problems; but it also eliminates the
> > possibility of using this for internationalization.  

> Compile-time parsing may eliminate the possibility of using the same 
> mechanism for internationalization, but not the possibility of using the
> same syntax.

Parsing must be done at some time.  Maybe the solution lies in finding some
way for Python to lazily delay the "compilation" of the string until after
its translation (at run-time), when it is known beforehand that a given string
is internationalised.  The `.pyc' would contain byte-code and a data slot for
driving the laziness.  The translation and compilation should occur only
once for a particular string, of course, as the internationalised string
may appear within a loop, or within a function which gets called often.
In threaded contexts, if we allow for spurious re-compilations once in a
long while, and with a simple bit of care, locks could be fully avoided.[1]
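
As a rough illustration of the laziness described above, here is a minimal
sketch in today's Python.  The `$name' placeholder syntax, the `LazyString'
class and the `translate' callable are all assumptions made for the example;
none of them is an existing or proposed interface.  The slot starts empty,
gets filled on first use, and is written without a lock, so two threads may
occasionally both compile, which is the spurious re-compilation tolerated
above.

```python
import re

# Assumed placeholder syntax: `$name`, in the spirit of PEP 292.
_PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def compile_template(text):
    """Parse `text` once into a list of (is_placeholder, value) parts."""
    parts, pos = [], 0
    for match in _PLACEHOLDER.finditer(text):
        parts.append((False, text[pos:match.start()]))
        parts.append((True, match.group(1)))
        pos = match.end()
    parts.append((False, text[pos:]))
    return parts


class LazyString:
    """Hypothetical stand-in for the per-string data slot kept in the .pyc."""

    def __init__(self, source, translate):
        self.source = source
        self.translate = translate
        self.parts = None  # empty slot: not yet translated or compiled

    def render(self, **values):
        parts = self.parts
        if parts is None:
            # No lock on purpose: two threads may both reach this point and
            # compile twice, but each one stores an equivalent result.
            parts = compile_template(self.translate(self.source))
            self.parts = parts
        return "".join(str(values[v]) if is_name else v for is_name, v in parts)


calls = []  # records every translation, to show that it happens only once


def translate(text):
    # Toy catalogue standing in for a real translation lookup.
    calls.append(text)
    return {"Hello $name": "Bonjour $name"}.get(text, text)


greeting = LazyString("Hello $name", translate)
results = [greeting.render(name=n) for n in ("Ana", "Luc")]
```

Both calls render the translated text, yet `translate' and
`compile_template' run only for the first one.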

The good thing about the above approach is that people would write Python in
about the same way whether internationalisation is in the picture or not,
and would not have to suffer the complexities of "hand" optimisation of
string interpolation in an internationalised context.  It would be simple
for _everybody_, on the road meant to make internationalisation a breeze.

For Python to know at initial compile time whether a string is going to be
internationalised or not, the language has to be modified, but a positive
side of this effort is that internationalisation becomes part of the language
design.  A possible way towards this (suggested a long while ago) could be to
use, besides `eru', some `t' prefix letter asking for translation.

Two problems are still to be solved, however.  First, going from `_("TEXT")'
to `t"TEXT"', the translation function (`_' here) and textual domain should
have proper defaults, while offering a way to override them for bigger
applications needing finer control or tuning.  A simple solution might lie,
here, in inventing some special module attribute for that purpose.
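
To make the module attribute idea a bit more concrete, here is a small sketch.
The attribute names `__translator__' and `__textdomain__' and the helper
`translator_for' are invented for this example only; the standard `gettext'
module supplies the defaults.  A module that sets nothing gets the default
function, while a bigger application can override either the function or
just the textual domain.

```python
import gettext
import types


def translator_for(module, default=gettext.gettext):
    """Pick the translation function for strings compiled in `module`."""
    # Hypothetical attribute: a module-level override of the function itself.
    custom = getattr(module, "__translator__", None)
    if custom is not None:
        return custom
    # Hypothetical attribute: keep the default function, change the domain.
    domain = getattr(module, "__textdomain__", None)
    if domain is not None:
        return lambda text: gettext.dgettext(domain, text)
    return default


# Three toy modules: full override, domain override, and plain defaults.
app = types.ModuleType("app")
app.__translator__ = lambda text: {"Yes": "Oui"}.get(text, text)

lib = types.ModuleType("lib")
lib.__textdomain__ = "no-such-domain-for-this-sketch"

plain = types.ModuleType("plain")
```

Here `translator_for(app)' returns the custom function, `translator_for(lib)'
looks messages up in the module's own domain (and returns them unchanged when
no catalogue is installed), and `translator_for(plain)' falls back to
`gettext.gettext'.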

Second, some applications accept switching national language at run-time.
So a mechanism is needed to invalidate lazily-compiled strings when such a
switch occurs.  An avenue would be to use the national language string code
as the "done" flag in the lazy compilation process, allowing recompilation
to occur on the fly, as needed.
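
A sketch of that last idea, under the same caveats as before: the class
`TranslatedSlot' and the two-argument `translate(language, text)' callable
are made up for illustration.  The slot stores the language code next to the
result, so the code doubles as the "done" flag, and a language switch simply
makes the flag compare unequal.  Storing both in one tuple keeps flag and
result consistent with a single assignment.

```python
class TranslatedSlot:
    """Hypothetical slot whose "done" flag is the language code itself."""

    def __init__(self, source, translate):
        self.source = source
        self.translate = translate  # assumed signature: translate(language, text)
        self.state = (None, None)   # (language the result is valid for, result)

    def get(self, language):
        done_for, text = self.state
        if done_for != language:
            # First use, or the application switched language: redo the work.
            text = self.translate(language, self.source)
            self.state = (language, text)
        return text


catalogs = {"fr": {"Yes": "Oui"}, "de": {"Yes": "Ja"}}  # toy catalogues
lookups = []  # records each real translation, to show when recompilation occurs


def translate(language, text):
    lookups.append(language)
    return catalogs.get(language, {}).get(text, text)


slot = TranslatedSlot("Yes", translate)
seen = [slot.get(language) for language in ("fr", "fr", "de", "de", "fr")]
```

Only the first call and the two language switches reach `translate'; the
repeated calls in between reuse the stored result.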

[1] Temporarily switching locale-related environment variables in threaded
contexts may yield pretty surprising results; this is well known already.
It only stresses, in my opinion, that the design has been frozen without
having all the vision it would have taken.  Many internationalisation devices
implement half-hearted solutions for half-thought problems.  I'm not at all
asserting that it is possible to foresee everything in advance.  Yet, we
could be more productive by _not_ slavishly sticking to current "standards".

François Pinard   http://www.iro.umontreal.ca/~pinard