[Python-ideas] Needing help to change the grammar

Sat Apr 18 22:03:17 CEST 2009

spir wrote:
> On Sat, 18 Apr 2009 23:52:49 +1000,
> Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> (moving discussion to Python Ideas)
>>
>> (Context for py-ideas: a teacher in Brazil is working on a Python
>> language variant that uses Portuguese rather than English-based
>> keywords. This is intended for use in teaching introductory programming
>> lessons, not as a professional development tool)
>>
>> Glenn Linderman wrote:
>>> import pt_BR
>>>
>>> An implementation along that line, except for things like reversing the
>>> order of "not" and "is", would allow the next national language
>>> customization to be done by just recoding the pt_BR module, renaming to
>>> pt_it or pt_fr or pt_no and translating a bunch of strings, no?
>>>
>>> Probably it would be sufficient to allow for one language at a time, per
>>> module.
>> Making that work would actually require something like the file encoding
>> cookie that is detected at the parsing stage. Otherwise the parser and
>> compiler would choke on the unexpected keywords long before the
>> interpreter reached the stage of attempting to import anything.

My original proposal in response to the OP was that language be encoded 
in the extension: pybr, for instance.  That would be noticed before 
reading the file.  Cached modules would still be standard .pyc, 
interoperable with .pyc compiled from normal Python.  I am presuming 
this would work on all systems.
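
A minimal sketch of the two detection options mentioned so far: choosing
the language from the file extension, which needs no look at the file
contents, versus a cookie in the first two lines of the source. The
names EXTENSION_LANGUAGES and LANGUAGE_COOKIE, the "language:" cookie
syntax, and both function names are assumptions made up for
illustration; only the general shape of the pattern is borrowed from the
PEP 263 encoding declaration.

```python
import os
import re

# Hypothetical mapping from file extension to language tag.
EXTENSION_LANGUAGES = {".py": "en", ".pybr": "pt_BR"}

# Hypothetical cookie, patterned on the PEP 263 encoding declaration,
# e.g. "# -*- language: pt_BR -*-".
LANGUAGE_COOKIE = re.compile(r"^[ \t\f]*#.*?language[:=][ \t]*([-\w.]+)")


def language_from_extension(path, default="en"):
    """Pick the language from the file name alone, before reading the file."""
    extension = os.path.splitext(path)[1]
    return EXTENSION_LANGUAGES.get(extension, default)


def language_from_cookie(source, default="en"):
    """Pick the language from a cookie in the first two source lines."""
    for line in source.splitlines()[:2]:
        match = LANGUAGE_COOKIE.match(line)
        if match:
            return match.group(1)
    return default
```

Either function only selects a keyword table; the source would still
have to be translated or parsed with that table afterwards.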

>> Adjusting the parser to accept different keyword names would be even
>> more difficult though, since changing the details of the grammar
>> definition is a lot more invasive than just changing the encoding of the
>> file being read.
> 
>> Cheers,
>> Nick.
>>
> 
> Maybe I don't really understand the problem, or am overlooking obvious issues. If the question is only to have a national language variant of Python, there are certainly numerous easier methods than tweaking the parser to make it flexible enough to be natural language-aware.
> 
> Why not simply have a preprocessing function that translates back to standard (English) Python using a simple dict? For practical everyday work, this may be done by:
> * assigning a special extension (eg .pybr) to the 'special' source code files,
> * associating this extension to the preprocessing program...
> * that would pass the back-translated .py source to python.

The OP was proposing to change 'is not' to the equivalent of 'not is'. 
I am not sure of how critical that would actually be.  For the purpose 
of easing transition to international Python, not messing with statement 
word order would be a plus.
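
The dict-based preprocessing suggested above could be sketched roughly
as follows, using the standard tokenize module so that only name tokens
are replaced and string literals and comments are left alone. The
PT_TO_EN table and the translate function are hypothetical, and the
Portuguese spellings are illustrative guesses, not the OP's actual
keyword list.

```python
import io
import tokenize

# Hypothetical Portuguese-to-English keyword table (illustrative only).
PT_TO_EN = {
    "funcao": "def",
    "retorne": "return",
    "se": "if",
    "senao": "else",
    "enquanto": "while",
    "nao": "not",
}


def translate(source, table=PT_TO_EN):
    """Return source with every NAME token found in table replaced."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in table:
            # Only the token text changes; positions are kept so that
            # untokenize reproduces the original spacing between tokens.
            tok = tok._replace(string=table[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)


if __name__ == "__main__":
    print(translate("funcao dobro(x):\n    retorne x * 2\n"))
```

A one-to-one table like this cannot reorder words, so a variant that
writes 'not is' for 'is not' would need extra logic on top of the
lookup. Keeping the English word order avoids that.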

>  [A more general solution would be to introduce a customization layer/interface in a Python-aware editor. Sources would always be stored in standard format. At load time, they would be translated according to the currently active config, which would only affect developer input and output (the principle is thus analogous to syntax highlighting).
> * Any developer can edit any source according to his/her own preferences.
> * Python does not need to care about that.
> * Customization can be lexical (keywords, builtins, signs) but also touch a certain amount of syntax.
> The issue here is that the editor parser (for syntax highlighting and numerous nice features) has to be made flexible enough to cope with this customization.]

This might be easier than changing the interpreter.  The extension could 
just as well be read and written by an editor.  The problem is that 
there are many editors.

The reason I suggested some support in the core for nationalization is 
that I think a) it is inevitable, in spite of the associated problem of 
ghettoization, while b) ghettoization should be discouraged and can be 
ameliorated with a bit of core support.  I am aware, of course, that 
such support, by removing one barrier to nationalization, will 
accelerate the development of such versions.

Terry Jan Reedy