
From: kirby urner <kirby.urner@gmail.com> To: edu-sig@python.org
I find myself thinking about PEP8 a lot, not that I have it memorized.
Now that Unicode reigns at the top-level, we've got an influx of Chinese namespaces, Hindi namespaces, Cyrillic namespaces... a nice long list, and the PEP8 conventions regarding capitalization, while sensible in Latin-1, might not cover the new cases [...]
Remember that PEP 8 is a guideline, not a requirement, and that it is documented as: "This document gives coding conventions for the Python code comprising the standard library in the main Python distribution." Any programmer may choose any style she prefers for her own use. If she is writing modules for the standard library -- and expects to have them accepted by the community -- then she will follow PEP 8.
The inter-readability of Latin-1 means lots of headaches removed, like at least *something* positive came out of that Roman period [...]
Yes, having taken a contract to maintain code where the author wrote his comments in Romanized Ukrainian, I have a great respect for non-English-native authors who take the time to learn English well. It is a terrible language to have to learn, I am told, but it is the only one _all_ software engineers can be expected to know.
Also, PEP 8 states that: "Latin-1 (or UTF-8) should only be used when a comment or docstring needs to mention an author name that requires Latin-1; otherwise, using \x, \u or \U escapes is the preferred way to include non-ASCII data in string literals." Even then, I would hope that an author would include an Anglicized version of his name, so that I can recognize it when I see it again. The only alphabets I personally can read are Latin, Cyrillic, Hebrew, and Greek. If your name is in Cherokee, then please put (John Standing Bear) in ASCII alongside it.
There is a very good reason for this: standard library code must be readable for people all over the world. That's why a Dutch software engineer wrote a language in which all the keywords and commentary are in English.
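The escape forms named in the PEP 8 passage quoted above can be compared side by side. The following is only a small sketch for Python 3; the name "Łukasz" is an illustrative assumption and does not come from this thread.

```python
# Three spellings of the same text. The name is a made-up example.
literal = "Łukasz"              # non-ASCII character typed directly (UTF-8 source)
escaped_u = "\u0141ukasz"       # \u escape: exactly four hex digits
escaped_U = "\U00000141ukasz"   # \U escape: exactly eight hex digits

# All three literals produce the same six-character string.
assert literal == escaped_u == escaped_U

# In a str literal, \x takes two hex digits, so it only reaches U+0000..U+00FF.
e_acute = "\xe9"
assert e_acute == "é" == "\u00e9"

# ascii() shows the escaped spelling that keeps a source file pure ASCII.
print(ascii(literal))   # → '\u0141ukasz'
```

The escaped spellings keep the file readable in any editor or terminal, at the cost of hiding the actual glyph from the reader, which is the trade-off the thread is debating.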
The flip side argument, which I find more persuasive, is that one of the biggest barriers to diversity is over-reliance on Latin-1, and "just ASCII" in particular.
The whole point of Unicode was to open up source code writing, as an occupation, to more than just Euro-English speakers.
I disagree. The whole point of Unicode is to open up application writing, so that _users_ can see computer output in their own languages. A person who wishes to pursue code writing as an occupation must understand and use English -- or be relegated to producing work only for his own culture. In the modern "flat" world, English is the language of commerce and computer programming. Not being able to write understandable English is a severe handicap.
My programs are written in Python, documented in English, and usable by persons of another language. For example, see CaesarCalc.py from https://launchpad.net/romanclass , which assumes the user to be able to understand pidgin Latin. Even then, I give the result of (XVI - XVI) as "Nulla" because I expect that most users will not recognize "Nvlla" as meaning "nothing."
Here is sample output. Notice that, when it blows up, the traceback is in Python with English explanations:
<console dump>
procer numerus hic:III - II
I
procer numerus hic:3 - 2
I
procer numerus hic:3 - 3
Nulla
procer numerus hic:2 - 3
Traceback (most recent call last):
  File "CaesarCalc.py", line 40, in <module>
    print (cvt(subtrahends[0]) - cvt(subtrahends[1]))
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 99, in __sub__
    return Roman(self.__int__() - other)
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 85, in __new__
    raise OutOfRangeError, 'Cannot store "%s" as Roman' % repr(N)
romanclass.OutOfRangeError: Cannot store "-1" as Roman
</console dump>
IMHO, on the whole, PEP 8 is a pretty good document. -- Vernon
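The behaviour visible in the console dump (subtraction, "Nulla" for zero, an error for negative results) can be approximated as follows. This is a minimal Python 3 sketch and not the romanclass source, which is not reproduced in this thread. The class layout, the numeral table and the decision to reject only negative values are assumptions; only the names Roman and OutOfRangeError and the error message text are taken from the traceback.

```python
class OutOfRangeError(ValueError):
    """Raised when a value has no Roman-numeral form (name taken from the traceback)."""


# Value/numeral pairs in descending order, including the subtractive forms.
_NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


class Roman(int):
    """Hypothetical stand-in: an int that prints itself as a Roman numeral."""

    def __new__(cls, n):
        n = int(n)
        if n < 0:  # assumption: only negative values are rejected here
            raise OutOfRangeError('Cannot store "%s" as Roman' % repr(n))
        return super().__new__(cls, n)

    def __sub__(self, other):
        # Re-wrap the result so that a negative difference raises at once.
        return Roman(int(self) - int(other))

    def __str__(self):
        n = int(self)
        if n == 0:
            return "Nulla"  # the spelling chosen in the post for zero
        parts = []
        for value, numeral in _NUMERALS:
            count, n = divmod(n, value)
            parts.append(numeral * count)
        return "".join(parts)


print(Roman(3) - Roman(2))     # → I
print(Roman(16) - Roman(16))   # → Nulla
```

Evaluating Roman(2) - Roman(3) in this sketch raises OutOfRangeError with the same message text as in the dump, and the explanation is in English whatever language the prompts use.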

Hi Vernon, ... not to be confused with Vern "the Watcher" Ceder. On Mon, Jul 18, 2011 at 8:47 AM, Vernon Cole <vernondcole@gmail.com> wrote:
There is a very good reason for this: standard library code must be readable for people all over the world. That's why a Dutch software engineer wrote a language in which all the keywords and commentary are in English.
Yes, the Standard Library is to be Anglicized for some time to come, maybe always, per Guido's talks.
Of course there's nothing to stop someone from writing a translator for the Standard Library, such that the source originals (as modified) might be rendered in myriad other character sets.
Top-level names tend to be amenable to such treatment.
This may be done down to the C family level, though I'm not suggesting that it should be. Nor are all Python implementations C family, I hasten to add (Jython is "C family" only if the Java VM is).
The same is not true for 3rd party modules which, as you say, may be written in any style.
Learning the Latin (English) alphabet and building a vocabulary obviously remain a good idea, along with ASCII in the context of Unicode.
I expect those focused on computer science will continue giving themselves the benefit of this learning.
I received Romanized Indonesian source code for quite a while, until the student moved to Japan and apparently stopped doing Python.
I'm impressed with all the alphabets you know.
3rd party modules written in Cyrillic, with the peppering of Roman we know must be there thanks to the (untranslated) Standard Library and the 33 keywords (so far), could be used in computer science to help English speakers learn a Cyrillic language.
http://en.wikipedia.org/wiki/Languages_written_in_a_Cyrillic-derived_alphabe...
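The mix described above, Cyrillic names peppered with English keywords and standard library names, might look like the sketch below under Python 3. The function and parameter names are hypothetical examples (roughly "circle_area" and "radius"); the keyword count is printed rather than hard-coded because it differs between Python releases.

```python
import keyword
import math

# The reserved words stay English whatever script the identifiers use.
print(len(keyword.kwlist), "keywords in this interpreter")


# Hypothetical Cyrillic identifiers: "площадь_круга" ~ "circle_area", "радиус" ~ "radius".
def площадь_круга(радиус):
    # "def", "return" and "math.pi" remain in English / Latin letters.
    return math.pi * радиус ** 2


print(площадь_круга(2))

# Python 3 accepts these names as ordinary identifiers.
print("площадь_круга".isidentifier())      # → True
print(keyword.iskeyword("def"))            # → True
print(keyword.iskeyword("площадь_круга"))  # → False
```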
The flip side argument, which I find more persuasive, is that one of the biggest barriers to diversity is over-reliance on Latin-1, and "just ASCII" in particular.
The whole point of Unicode was to open up source code writing, as an occupation, to more than just Euro-English speakers.
I disagree. The whole point of Unicode is to open up application writing, so that _users_ can see computer output in their own languages. A person who wishes to pursue code writing as an occupation must understand and use English -- or be relegated to producing work only for his own culture. In the modern "flat" world, English is the language of commerce and computer programming. Not being able to write understandable English is a severe handicap. My programs are written in Python, documented in English, and usable by persons of another language. For example, see CaesarCalc.py from https://launchpad.net/romanclass , which assumes the user to be able to understand pidgin Latin. Even then, I give the result of (XVI - XVI) as "Nulla" because I expect that most users will not recognize "Nvlla" as meaning "nothing."
Certainly the GUI needs to be intelligible, yes.
Let's just say there's a school of thought that has no problem with a math, logic or grammar teacher using only Chinese characters for top-level names in various exercises using Python or another Unicode-aware computer language. And no problem with another teacher using only Hebrew characters for top-level names, and so on.
This school of thought hangs out on the Python Diversity list and self-organizes there. If you go back in the archives, you'll find myself and a guy named Carl doing stuff in the Python wiki to expand the language base, including at the source code level. With Pycon / Tehran in the planning, we want to be in a better position to address issues relating GeoDjango to Farsi, say.
These exercises (mentioned above) may have nothing to do with writing commercial applications. These may not be programmers in training (though some may be in commercial media, where "programming" also has meaning, e.g. in radio / TV). Instead of using a calculator or abacus to learn numeracy skills, people have laptops and internet access.
Having readable source code in languages that aren't in a Roman alphabet is already a spreading phenomenon, with many writers happily giving up that so-called "world readability" in favor of remaining intelligible to the girl or boy next door.
The syntax of URIs and domain names has already taken this turn. You will have http//arabic letters// quite frequently these days, thanks to the Unicode basis of http (which Python now needs to deal with, and does, as an http-aware language).
CSS for Arabic is the kind of style concern for which we may have insufficient literature to date. We may have people joining Diversity who want to develop that literature (recruiting happening).
http://www.guardian.co.uk/technology/2010/may/06/arabic-web-addresses-intern...
Here is sample output. Notice that, when it blows up the traceback is in Python with English explanations:
<console dump>
procer numerus hic:III - II
I
procer numerus hic:3 - 2
I
procer numerus hic:3 - 3
Nulla
procer numerus hic:2 - 3
Traceback (most recent call last):
  File "CaesarCalc.py", line 40, in <module>
    print (cvt(subtrahends[0]) - cvt(subtrahends[1]))
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 99, in __sub__
    return Roman(self.__int__() - other)
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 85, in __new__
    raise OutOfRangeError, 'Cannot store "%s" as Roman' % repr(N)
romanclass.OutOfRangeError: Cannot store "-1" as Roman
</console dump>
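On the remark about Arabic web addresses: as far as the standard library is concerned, non-Latin host names are handled through the IDNA encoding of the domain name rather than by HTTP itself. The sketch below uses Python's built-in "idna" codec; the host name is a hypothetical example ("مثال" is Arabic for "example", and ".example" is a reserved test domain).

```python
# Hypothetical internationalized host name, used only for illustration.
host = "مثال.example"

# The "idna" codec turns each non-ASCII label into an ASCII-compatible
# "xn--" form, which is what travels over DNS.
wire_form = host.encode("idna")
print(wire_form)

# The encoded form is plain ASCII and starts with the "xn--" prefix.
assert wire_form.startswith(b"xn--")
assert wire_form.decode("ascii").isascii()

# Decoding restores the Unicode form that is shown to users.
assert wire_form.decode("idna") == host
```

The user-visible address stays in Arabic script while the machinery underneath remains ASCII, which mirrors the split between local names and English keywords discussed in this thread.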
IMHO, on the whole, PEP 8 is a pretty good document. -- Vernon
I'm not denigrating PEP8 in any way, even though I used some light sarcasm in my post. That was not directed against PEP8, so much as against the idea that the "rule book" is somehow complete, just because we have it down that functions should generally not start with a capital letter, and l (lowercase L) is a terrible name for all purposes because it's so indistinguishable from uppercase I and the number 1 in many fonts.
I think as people start getting a lot more experience writing Python with different namespaces, with non-Roman top-level names etc., the rule book is inevitably going to expand, and a Book of Styles could conceivably become enormous.
But then think of English: we acknowledge many styles as being appropriate and don't have just the one "book" where style is concerned (we have so many) -- not like the dictionary, with a goal of including every word in a finite and deliberately exclusive set of standard words.
I have some examples of Python source in my blogs, using kanji as top-level names (might be a Japanese program, as one of the kanji is for Mt. Fuji as I recall).
Then there's some tracking down Stallman on a visit to Sri Lanka (a while back) and chatter about Python in Tamil and Sinhalese. And yes, I am aware English is spoken in these parts as well, as evidenced by Arthur C. Clarke's having lived there for so long. One of our CSN chiefs has a track record there too, another English speaker.
http://www.sarvodaya.org/2005/05/17/suzanne-bader%E2%80%99s-sri-lanka-visit-...
http://controlroom.blogspot.com/2009/01/at-work.html
http://risenfall.wordpress.com/2008/01/14/richard-stallman-rms-is-in-sri-lan...
Kirby
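One concrete gap in the capitalization rules mentioned above is that several scripts have no upper or lower case at all, so conventions such as CapWords for classes and lowercase for functions cannot be expressed in them. The sketch below checks this with plain string methods; the sample words are illustrative assumptions, each meaning roughly "mountain".

```python
def has_case(text):
    """Return True if upper() or lower() would change any character in text."""
    return text.upper() != text or text.lower() != text


# Hypothetical sample names; none of them come from the thread.
samples = {
    "English": "Mountain",
    "Russian": "Гора",
    "Japanese/Chinese": "山",
    "Hebrew": "הר",
    "Tamil": "மலை",
}

for language, name in samples.items():
    label = "cased" if has_case(name) else "caseless"
    print(language, name, label)
```

Latin and Cyrillic names come out as cased, while the kanji, Hebrew and Tamil names are caseless, so a style guide for those scripts would need some other way to tell a class name from a function name.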

Since Kirby invoked me by name ;), I'll jump in with a quick top post, a) because I'm lazy and in a hurry, and b) because my comments are only generally related to the specifics of the previous posts. Apologies.
First of all, in general I would respond to Kirby's musings by invoking my own personal principle that PEP 20 (specifically, "Practicality beats purity") trumps PEP 8. I would think that would be true when it comes to naming conventions in other languages. If it's a library that is useful to, say, Klingon speakers only, it would make sense to name the library and its components in Klingon. OTOH, if they wanted to share their work and wanted it to be useful to the non-Klingon speaking Federation, Klingon might not be a practical or effective choice. (Of course, being Klingon, they may not care... ;) )
Personally, I run into this issue on a daily basis these days. As the current maintainer of an entire web platform developed by our Japanese sister company, I face emails and documentation (including code comments) in Japanese, giving me ample practice with both Google Translate and deciphering kana. When I chatted with the Japanese team (via Google Translate, gestures, and scrawling code on the whiteboard) about the new features of Python 3, support for Unicode in Python code got a cheer, and I certainly understand that.
However, for sharing code, I'd have to agree with my namesake - diverging from the English standard is problematic. I'm finding what I think of as "technical Japanese" to be not that hard to understand, but that's exactly because so much of the vocabulary is borrowed from English - data, account, server, etc, etc, etc.
Finally, I have to note that both of us Vernons are conversant in Latin, which is the sort of coincidence sportscasters are prone to mis-label "ironic"... ;)
Cheers, Vern
On Mon, Jul 18, 2011 at 1:29 PM, kirby urner <kirby.urner@gmail.com> wrote:
_______________________________________________ Edu-sig mailing list Edu-sig@python.org http://mail.python.org/mailman/listinfo/edu-sig
-- Vern Ceder vceder@gmail.com, vceder@dogsinmotion.com The Quick Python Book, 2nd Ed - http://bit.ly/bRsWDW

On Mon, Jul 18, 2011 at 1:00 PM, Vern Ceder <vceder@gmail.com> wrote:
Since Kirby invoked me by name ;), I'll jump in with a quick top post, a) because I'm lazy and in a hurry, and b) because my comments are only generally related to the specifics of the previous posts. Apologies.
First of all, in general I would respond to Kirby's musings by invoking my own personal principle that PEP 20 (specifically, "Practicality beats purity") trumps PEP 8. I would think that would be true when it comes to naming conventions in other languages. If it's a library that is useful to, say, Klingon speakers only, it would make sense to name the library and its components in Klingon. OTOH, if they wanted to share their work and wanted it to be useful to the non-Klingon speaking Federation, Klingon might not be a practical or effective choice. (Of course, being Klingon, they may not care... ;) )
I'm glad we're having this thread as it relates to my work concerns also, where many of my students, with whom I connect asynchronously, disclose to me their difficulties with English. Struggling with translators and so on is a fact of life, but the time will come when hospital room LCDs illuminate with familiar glyphs, including simple things like the light switch, bed controls. Pictures and videos from home will fill the hospital room's picture frames, with pointers embedded right in the medical record (managed like a profile, by the patients themselves). You'll be in a "language bubble" where your caretakers have spared no effort to have you not focused on phrase book deciphering. This is your body we're talking about. It's ridiculous that you should have to learn an alien tongue to follow the action. The menu in the dining room will be in your language. At least some of your fellow passengers will share your language also. This might be a geek cruise, for people who code in Perl, mostly using Greek. I'm not saying every hospital room will be this advanced, and perhaps not the ones Anglophones manage, as they tend to take that trademark "others should learn English" approach that so characterized the 113 years of Anglo-British rule. Russian maybe, hospital cruise ships, with some US health plans providing access, but most too far behind the times. Hotel management science is pioneering in these same directions. Universities may bring up the rear, I don't know.
Personally, I run into this issue on a daily basis these days. As the current maintainer of an entire web platform developed by our Japanese sister company, I face emails and documentation (including code comments) in Japanese, giving me ample practice with both Google Translate and deciphering kana. When I chatted with the Japanese team (via Google Translate, gestures, and scrawling code on the whiteboard) about the new features of Python 3, support for Unicode in Python code got a cheer, and I certainly understand that.
However, for sharing code, I'd have to agree with my namesake - diverging from the English standard is problematic. I'm finding what I think of as "technical Japanese" to be not that hard to understand, but that's exactly because so much of the vocabulary is borrowed from English - data, account, server, etc, etc, etc.
It's problematic, but that's not going to stop it from happening in various communities. A population of a few hundred thousand might easily support a bevy of open source solutions that are encoded in the Klingon of that realm.
Think of a class definition with one Chinese ideograph for a name, and most methods at most two characters. The dot notation is still there, as are the calling parens. The keywords, __init__, __repr__ -- all pretty familiar. Use a translator then? Of course the word "self" might actually be replaced, with the symbol for "used to be known as Prince" maybe (joke):
http://worldgame.blogspot.com/2007/11/pythonic-biology.html
http://mybizmo.blogspot.com/2007/03/add-multiply-or.html
There's this dream of English always being some "lingua franca" (joke) of the Open Source world, but per recent PSF member threads (me a threader), not everyone dreams the same dream.
At one of the recent OSCONs, maybe five years ago, we had a panel on Open Source in Africa. The message from that corner was that the open source tools were being remastered, with an eye to reinventing many wheels from scratch.
http://www.w3.org/International/articles/css3-text/
http://www.jeemlang.com/
http://www.jeemlang.com/index.php?page=examples
Finally, I have to note that both of us Vernons are conversant in Latin, which is the sort of coincidence sportscasters are prone to mis-label "ironic"... ;)
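The class imagined above, with a one-ideograph name and methods of at most two characters, might look like the following under Python 3. Every identifier here is a hypothetical choice made for illustration (山 "mountain", 名 "name", 高 "height", 比较 "compare"), and the heights are given in metres.

```python
class 山:  # "mountain": a one-ideograph class name
    def __init__(self, 名, 高):  # 名 = name, 高 = height in metres
        self.名 = 名
        self.高 = 高

    def __repr__(self):
        return "山({!r}, {!r})".format(self.名, self.高)

    def 比较(self, other):  # 比较 = "compare": a two-character method name
        # Height difference in metres between this mountain and the other one.
        return self.高 - other.高


富士 = 山("富士", 3776)          # Mt. Fuji
高尾 = 山("高尾", 599)           # Mt. Takao
print(富士)                      # → 山('富士', 3776)
print(富士.比较(高尾))           # → 3177
```

The keywords, the dunder methods, the dots and the parentheses are unchanged, so a reader who knows Python but not the script can still follow the structure, if not the meaning of the names.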
I'm not bad with Latin cognates, having grown up watching Italian TV and movies (lived in Rome), studied French and Spanish. I'm also aware of the importance of English as a supra-national language, in the Philippine Islands for example (my high school home), where so many small user groups use it to get along at meetups. English itself is always morphing. Some have argued for the existence of a language called American (pronounce amer-IKAN, like puerto-RICAN) which goes even further towards accommodating its non-Anglo users. Gene Fowler (poet) called it Amerish (same idea). Kirby