Needing help to change the grammar

Hello everybody. My name is Thiago, and I currently work as a teacher at a high school in Brazil. I plan to offer a programming course to the students there, but I had trouble finding a good language. As a Python programmer, I really like the language's syntax, and I think Python is very good for teaching programming. But there is one small problem: the commands and keywords are in English, and this can be an obstacle for the teenagers who might take the course.

Because of this, I decided to create a Python version with keywords in Portuguese and with some modifications to the grammar to make it more Portuguese-like. For this, I'm using the Python 3.0.1 source code. I have already read PEP 306 (How to Change Python's Grammar) and changed the suggested files.

My changes currently work properly except for one thing: the "comp_op". The code that is written as "is not" in English Python shall be "não é" in Portuguese Python. Besides translating the words "is" and "not", I'm also changing the order in which they appear, putting "not" before "is". It appears to be a simple change but, strangely, I have not been able to make it work. I have already made the corresponding modifications in the Grammar/Grammar file, the new keywords already appear in Lib/keyword.py, and I also changed the function validate_comp_op in Modules/parsermodule.c:

    static int
    validate_comp_op(node *tree)
    {
        (...)
        else if ((res = validate_numnodes(tree, 2, "comp_op")) != 0) {
            /* Two-word operators: "não é" (is not) and "não em" (not in). */
            res = (validate_ntype(CHILD(tree, 0), NAME)
                   && validate_ntype(CHILD(tree, 1), NAME)
                   && (((strcmp(STR(CHILD(tree, 0)), "não") == 0)
                        && (strcmp(STR(CHILD(tree, 1)), "é") == 0))
                       || ((strcmp(STR(CHILD(tree, 0)), "não") == 0)
                           && (strcmp(STR(CHILD(tree, 1)), "em") == 0))));
            if (!res && !PyErr_Occurred())
                err_string("operador de comparação desconhecido");
        }
        return (res);
    }

I also looked at the other files listed in the PEP, but I did not find anything in them that I recognized as needing changes. But when I type "make" to compile the new language, the following error appears in Lib/encodings/__init__.py (which I have already translated to the Portuguese Python):

    harry@skynet:~/Python-3.0.1$ make
    Fatal Python error: Py_Initialize: can't initialize sys standard streams
      File "/home/harry/Python-3.0.1/Lib/encodings/__init__.py", line 73
        se entry não é _unknown:
        ^
    SyntaxError: invalid syntax

The comp_op doesn't work! I no longer know what to change. Perhaps there is some file that I should modify but did not pay enough attention to. Does anybody have an idea of what I should do?

Thanks a lot.

On Fri, Apr 10, 2009 at 9:58 PM, Harry (Thiago Leucz Astrizi) <thiagoharry@riseup.net> wrote:
I love the idea (and I was the most recent person to edit PEP 306), so here are a few suggestions.

Brazil has many Python programmers, so you might be able to make quick progress by asking them for volunteer time.

To bug-hunt your technical problem: try switching the "not is" operator to include an underscore, "not_is". The Python LL(1) grammar checker works for Python but isn't robust, and it does miss some grammar ambiguities. Making the operator a single word might reveal a bug in the parser.

Please consider switching your students to 'real' Python partway through the course. If they want to use the vast amount of Python code on the internet as examples, they will need to know the few English keywords.

Also, most Python core developers are not native English speakers and do OK :) PyCon speakers are about 25% non-native English speakers, and EuroPython speakers are about the reverse (my rough estimate; I'd love to see some hard numbers).

Keep up the Good Work,

-Jack

Written by "Martin v. Löwis" <martin@v.loewis.de>:

No, all the files in the source code were already in UTF-8. My system is configured to treat UTF-8 as the default encoding. This is not an encoding problem.

Written by "Jack diederich" <jackdied@gmail.com>:

Yes, I plan to ask for help on the Brazilian Python mailing list when I finish preparing the C source code for this project. Then I expect to receive help translating the Python modules for this new language. There's a lot of work to do.

Thanks for the advice; you almost guessed what went wrong. I ran some tests and have already discovered what the problem is. When I change Grammar/Grammar, Python/ast.c and Modules/parsermodule.c to transform "is not" into "not is", everything works fine and I get a new Python version where "a is not None" is wrong and "a not is None" is right. But when I translate this to "não é", a SyntaxError always occurs. So the problem is really in the grammar checker, which can't handle some accented letters. Well, knowing where the problem is, I think I can try to solve it by myself. Thanks again.

Yes, I know. For a more "serious" programmer, it's essential to have a basic understanding of English, and it would be better to start with the real Python. But my intent is not to replace Python in Brazil; it is to create a new language that younger people can learn easily, for educational purposes. My intent is to show them how computer software works. But I will certainly warn my students that, to take programming more seriously, it's important to learn to program in some other language, like the original Python. But thanks for the advice.
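
The observation that "not is" works while "não é" fails can be narrowed down a little with stock Python 3. The sketch below does not reproduce the 3.0.1 failure; it only checks, on an unmodified interpreter, that the accented words are valid identifiers and that the pure-Python tokenize module reports them as ordinary NAME tokens. The helper names (name_tokens, utf8_length) are made up for this illustration. If these checks pass, the accent problem would seem to lie in how the modified build matches keyword strings rather than in the language's notion of a name, though that is an inference, not something confirmed in this thread.

```python
import io
import tokenize


def name_tokens(line):
    """Return the strings of all NAME tokens found in one line of source."""
    reader = io.StringIO(line).readline
    return [tok.string
            for tok in tokenize.generate_tokens(reader)
            if tok.type == tokenize.NAME]


def utf8_length(word):
    """Number of bytes a C-level strcmp() would compare for this word."""
    return len(word.encode("utf-8"))


# The line from the traceback, taken verbatim from the first message.
LINE = "se entry não é _unknown:\n"

if __name__ == "__main__":
    # Accented words are legal identifiers in Python 3.
    print("não".isidentifier(), "é".isidentifier())
    # The tokenizer does not split the words at the accented characters.
    print(name_tokens(LINE))
    # In UTF-8 the accented letters take two bytes each.
    print(utf8_length("não"), utf8_length("é"))
```

The byte lengths matter because the strcmp() calls in validate_comp_op compare raw bytes: the string literals in the C file and the bytes produced by the tokenizer have to use the same encoding for the comparison to succeed.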

Harry (Thiago Leucz Astrizi) wrote:
There are only a few modules that you really need to do this for, for beginners. Trying to convert the entire stdlib, let alone other stuff on PyPI, strikes me as foolish. ...

If possible, and I presume it is, make your interpreter dual language. Source code in .py files is parsed as now (and the module compiles to .pyc). Source in .pyb (python-brazil) is parsed with your new parser and gets a Brazilian equivalent of builtins, but uses the same AST and bytecode. Bytecode is neither English nor Brazilian ;-). This would give your students access to the whole world of Python modules and allow those who want to move to normal English-based international Python to do so without obsoleting their existing work.

Terry Jan Reedy

PS. Since this thread is not about developing Python itself, it would be more appropriate on the python-ideas list if continued much further.

PPS. Once Unicode identifiers were allowed, I considered it inevitable that people would also want native-language keywords, especially for younger students. So I expected a project like yours, though I expected the first to be in Asia. I think dual-language versions, if possible, would be the way to do this without ghettoizing the national versions. But as I said, a general discussion of this belongs on python-ideas.
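
One way to picture the "same AST and bytecode" idea without touching the C parser is a token-level translation in front of the standard compiler. This is a different technique from the modified grammar discussed in this thread, offered only as a rough sketch. The keyword table contains just the four words that appear in the thread ("se" from the traceback; "não", "é" and "em" from validate_comp_op), and the function names translate, run_portuguese and same_ast are invented for the example.

```python
import ast
import io
import tokenize

# Only the keywords attested in this thread; a real table would be longer.
PT_TO_EN = {"se": "if", "não": "not", "é": "is", "em": "in"}


def translate(source):
    """Rewrite Portuguese keywords as English ones, token by token."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (tok.type == tokenize.NAME and tok.string == "não"
                and nxt is not None
                and nxt.type == tokenize.NAME and nxt.string == "é"):
            # "não é" corresponds to "is not": the two words swap places.
            out.append((tokenize.NAME, "is"))
            out.append((tokenize.NAME, "not"))
            i += 2
            continue
        if tok.type == tokenize.NAME:
            # Single-word keywords are a plain lookup; other names pass through.
            out.append((tokenize.NAME, PT_TO_EN.get(tok.string, tok.string)))
        else:
            out.append((tok.type, tok.string))
        i += 1
    # Given 2-tuples, untokenize regenerates valid (if oddly spaced) source.
    return tokenize.untokenize(out)


def run_portuguese(source, namespace):
    """Compile translated source with the standard compiler and execute it."""
    exec(compile(translate(source), "<pt>", "exec"), namespace)
    return namespace


def same_ast(pt_source, en_source):
    """True if the translated source parses to the same AST as en_source."""
    return ast.dump(ast.parse(translate(pt_source))) == ast.dump(ast.parse(en_source))
```

A limitation of this sketch is that it renames every NAME token found in the table, so a variable that happens to be called "se" or "em" would be rewritten too. A parser-level implementation, as proposed in the thread, would not have that problem.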

At 16:30 -0400 04/12/2009, Terry Reedy wrote: ...
Source in .pyb (python-brazil) is parsed with your new parser, ...

In case anyone ever does this again, I suggest that the extension be the language and, optionally, the country code: .py_pt or .py_pt_BR

--
TonyN.:' <mailto:tonynelson@georgeanelson.com>
      ' <http://www.georgeanelson.com/>
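
The suffix convention suggested above is easy to recognize mechanically. The following is a small sketch, not part of any existing tool: the function name source_locale is hypothetical, and treating a plain .py file as English ("en") is an assumption made only for the example.

```python
import re
from pathlib import PurePath

# ".py", ".py_pt" or ".py_pt_BR": two-letter language, optional two-letter country.
_SUFFIX = re.compile(r"\.py(?:_(?P<lang>[a-z]{2})(?:_(?P<country>[A-Z]{2}))?)?")


def source_locale(filename):
    """Return (language, country) for a source file, or None if not recognized.

    country is None when the suffix carries no country code.
    """
    match = _SUFFIX.fullmatch(PurePath(filename).suffix)
    if match is None:
        return None
    # Assumption for this sketch: an unmarked .py file uses English keywords.
    return (match.group("lang") or "en", match.group("country"))
```

An importer built along these lines could use the returned pair to pick the keyword table for a module, falling back from ("pt", "BR") to ("pt", None) when no country-specific table exists.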

On approximately 4/12/2009 2:41 PM, came the following characters from the keyboard of Tony Nelson:
Wouldn't that be a good idea for this implementation too? It sounds like it is not-yet-released, as it is also not-yet-bug-free.

And actually, wouldn't it be nice if international keywords could be accepted as alternates if one just said

    import pt_BR

An implementation along that line, except for things like reversing the order of "not" and "is", would allow the next national language customization to be done by just recoding the pt_BR module, renaming to pt_it or pt_fr or pt_no and translating a bunch of strings, no?

Probably it would be sufficient to allow for one language at a time, per module.

--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
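
The "translating a bunch of strings" idea amounts to keeping each language's keywords as data and selecting one table per module. Below is a minimal sketch under stated assumptions: the names KEYWORD_TABLES and keyword_table_for are hypothetical, the pt_BR entries are the four words seen in this thread, and the it_IT table is an invented illustration of what a second language might look like. Real import machinery is not involved; the marker line is simply searched for in the source text.

```python
import re

# One dict per language; adding a language means adding one entry here.
KEYWORD_TABLES = {
    # Words taken from this thread.
    "pt_BR": {"se": "if", "não": "not", "é": "is", "em": "in"},
    # Hypothetical second table, included only to show the shape.
    "it_IT": {"se": "if", "non": "not", "è": "is", "in": "in"},
}

# A line consisting of "import <name>" and nothing else.
_MARKER = re.compile(r"^import[ \t]+(\w+)[ \t]*$", re.MULTILINE)


def keyword_table_for(source):
    """Return (language, table) for the first marker naming a known language.

    Only one language per module is supported, as suggested in the thread.
    Returns (None, {}) when the module carries no known marker.
    """
    for match in _MARKER.finditer(source):
        table = KEYWORD_TABLES.get(match.group(1))
        if table is not None:
            return match.group(1), table
    return None, {}
```

Word order changes such as "não é" for "is not" do not fit into a plain string table, which matches the exception noted in the message above; they would need a per-language rule in addition to the dictionary.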

participants (6):
"Martin v. Löwis", Glenn Linderman, Harry (Thiago Leucz Astrizi), Jack diederich, Terry Reedy, Tony Nelson