
I find myself thinking about PEP8 a lot, not that I have it memorized.

Now that Unicode reigns at the top level, we've got an influx of Chinese namespaces, Hindi namespaces, Cyrillic namespaces... a nice long list, and the PEP8 conventions regarding capitalization, while sensible in Latin-1, might not cover the new cases (I say "might not" with some sarcasm, or an innocent stare (playing it straight)).

I've seen arguments in diversity-minded circles that straying from a Latin-1 top level will obliterate the open source nature of open source, with many a Chinese engineer welcoming the advantages of a world built around simple base cases, the old ASCII, a mother tongue of computer scientists (more so than EBCDIC even (sarcasm again)).

http://en.wikipedia.org/wiki/Extended_Binary_Coded_Decimal_Interchange_Code

The inter-readability of Latin-1 means lots of headaches removed, like at least *something* positive came out of that Roman period (as a child of Rome, I get to sound chiding).

The flip side argument, which I find more persuasive, is that one of the biggest barriers to diversity is over-reliance on Latin-1, and "just ASCII" in particular.

I heard those cheers in Vilnius, when Guido talked about the brave new Unicode world. Google's Blogger interface switches to Lithuanian automatically when that's your timezone, or however it's figured. Lots of alphabetical markings you might not find in Latin-1.

http://controlroom.blogspot.com/2007/07/blogger-control-panel-in-lithuanian....

The whole point of Unicode was to open up source code writing, as an occupation, to more than just Euro-English speakers.

The bridge has been built and Python has already crossed over it.

http://controlroom.blogspot.com/2007/11/unicode.html

None of which is to say that knowledge of Latin-1 is dispensable.

My first chapters in Naming and Ordering per MathFuture threads (also Cardinality vs Ordinality) start with "mappings" (the usual approach to functions per Dolciani) with familiar glyphs (we're learning them anyway in learning to read a native language), pairing them with ASCII and Unicode code points.

Yes, it's a long discussion (UTF8 vs UTF32 etc) but we're talking about time slices and repeated revisits in a spiraling trajectory (per Saxon treatments).

So even if the bulk of your coding is in some Thai character set, you're quite familiar with the lower 128 in the Unicode codespace. Python itself has 33 keywords and a large number of builtins, such that "average Python" might look like Romaji-intensive Japanese, i.e. "heavy on the Latin-1 pepper, other spices" (yet lots of room for top-level class, function, variable names, libraries stuffed with them, all outside Latin-1).

These concerns have been a long-term focus, and continue to be, as Python students I encounter may be there for work and that may mean using non-Latin-1 Python namespaces much of the time.

Kirby
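
The "mappings" idea above (glyph to code point, code point to bytes) can be sketched in a few lines of Python. This is only an illustration: the sample glyphs and the helper name `describe` are my own choices, not anything from the MathFuture material.

```python
# Sketch: glyph -> code point -> bytes, for a few sample glyphs.
# The glyphs and the helper name `describe` are illustrative assumptions.

def describe(glyph):
    """Return (code point, UTF-8 byte count, UTF-32 byte count) for one character."""
    code_point = ord(glyph)                     # the integer the glyph maps to
    utf8_len = len(glyph.encode("utf-8"))       # variable width: 1 to 4 bytes
    utf32_len = len(glyph.encode("utf-32-be"))  # fixed width: 4 bytes (no byte-order mark)
    return code_point, utf8_len, utf32_len

for glyph in ["A", "Ж", "ก", "中"]:
    cp, n8, n32 = describe(glyph)
    print(f"{glyph}  U+{cp:04X}  utf-8: {n8} bytes  utf-32: {n32} bytes")

# chr() is the inverse mapping: code point back to glyph.
print(chr(0x41))
```

The lower 128 code points take one byte each in UTF-8, which is one reason ASCII-heavy source stays compact even in an otherwise non-Latin program.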

On Sun, July 17, 2011 1:46 pm, kirby urner wrote:
I find myself thinking about PEP8 a lot, not that I have it memorized.
Now that Unicode reigns at the top-level, we've got an influx of Chinese namespaces, Hindi namespaces, Cyrillic namespaces... a nice long list, and the PEP8 conventions regarding capitalization, while sensible in Latin-1, might not cover the new cases (I say "might not" with some sarcasm, or an innocent stare (playing it straight)).
I got involved in the original RFC for Unicode URLs and more general URIs when the discussion was mainly, "We have to! So many people desperately need it!" "We can't, it'll break the Web!" We got it worked out. I am pleased to see Unicode country name TLDs proliferating, too. I was one of those who needed it. I was doing silly things like editing an APL magazine and converting a Chinese-Korean-Japanese-European Go glossary from ASCII to Unicode.
I've seen arguments in diversity-minded circles that straying from Latin-1 top-level will obliterate the open source nature of open source, with many a Chinese engineer welcoming the advantages of a world around simple base cases, the old ASCII, a mother tongue of computer scientists (more so than EBCDIC even (sarcasm again)).
http://en.wikipedia.org/wiki/Extended_Binary_Coded_Decimal_Interchange_Code
I remember a long time ago reading about Russian COBOL, combining Latin and Cyrillic in EBCDIC. Fun times. I have had the personal misfortune of attempting to explain to a Japanese programmer why it was the Japanese character set definition that replaced backslash with the Yen sign that was broken, not Unicode. They couldn't just do a search and replace in Windows code, because that character code usually meant Yen inside a text string but meant the Windows directory separator in path expressions in program code. So they blamed Western cultural imperialists and didn't listen when we explained how many Japanese experts were involved in the Japanese character set mappings.
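
To make the backslash/Yen point concrete, here is a small sketch using only Python's standard `unicodedata` module. It shows that Unicode assigns the two characters separate code points, so the ambiguity comes from the legacy single-byte mapping that reused one code for both meanings. The helper name `code_point_info` is hypothetical.

```python
import unicodedata

BACKSLASH = "\\"    # U+005C, the Windows path separator
YEN = "\u00a5"      # U+00A5, the currency sign

def code_point_info(ch):
    """Return the code point (as 'U+XXXX') and the Unicode name of a character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

print(code_point_info(BACKSLASH))
print(code_point_info(YEN))

# The backslash fits in a single ASCII byte; the Yen sign is not in ASCII at all
# and needs two bytes in UTF-8.
print(BACKSLASH.encode("ascii"))
print(YEN.encode("utf-8"))
```

A legacy byte stream carries only the one code, so no search and replace can tell which meaning was intended without knowing the context.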
The inter-readability of Latin-1 means lots of headaches removed, like at least *something* positive came out of that Roman period (as a child of Rome, I get to sound chiding).
The flip side argument, which I find more persuasive, is that one of the biggest barriers to diversity is over-reliance on Latin-1, and "just ASCII" in particular.
The char = 8-bit or even 7-bit byte delusion, with hundreds of incompatible 8-bit character sets and fonts. I have had a few things to say about that.
I heard those cheers in Vilnius, when Guido talked about the brave new Unicode world. Google's blogger interface switches to Lithuanian automatically when that's your timezone, or however it's figured. Lots of alphabetical markings you might not find in Latin-1.
http://controlroom.blogspot.com/2007/07/blogger-control-panel-in-lithuanian....
The whole point of Unicode was to open up source code writing, as an occupation, to more than just Euro-English speakers.
Much more than source code. Are you becoming the man who pounds nails all day, to whom everything looks like a hammer? ^_^
The bridge has been built and Python has already crossed over it.
Which means that various parts of Sugar, including Turtle Art, have also crossed over.
I was at the Unicode Conference where Jim Brown of IBM told that part of the world that APL2 was fully Unicode-capable for data and identifiers. There was also a proposal to put Unicode math into programming languages.
None of which is to say that knowledge of Latin-1 is dispensable.
One of the points I had to make in that IETF discussion was that every Japanese schoolchild was already learning to keyboard in romaji in addition to kana, with kana-to-kanji conversion. http://lists.w3.org/Archives/Public/uri/1997Apr/0109.html
My first chapters in Naming and Ordering per MathFuture threads (also Cardinality vs Ordinality) start with "mappings" (the usual approach to functions per Dolciani) with familiar glyphs (we're learning them anyway in learning to read a native language), pairing them with ASCII and Unicode code points.
I have just been working on a tutorial, not yet completed, on ancient visual numerals now available in Unicode, including Egyptian Hieroglyphic heqat measure symbols, which I am putting on Turtle Art variable name tiles. (For our Egyptophiles, Egyptian fraction analysis was done with the heqat as the unit.) Next I plan to teach the turtle how to write heqat, cuneiform, and Counting Rod numerals, and then base-20 Mayan, which unfortunately is not in Unicode yet. There are fonts using the Private Use Area, however. I am also considering making a bunch of pie-slice fraction numerals for my planned tutorial on fractions using Turtle Art. http://wiki.sugarlabs.org/go/Activities/TurtleArt/Tutorials/Numerals I have several of these tutorials in a reasonably finished state, and many more written but not illustrated, with more outlined. Cardinals and mappings are in the Counting tutorial, but I have not tackled ordinals yet.
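
As a rough illustration of what "available in Unicode" buys you, the sketch below lists the Counting Rod unit digits with the names and numeric values recorded in Python's `unicodedata` database. It is not taken from the Turtle Art tutorials; the helper name `rod_digit_value` is hypothetical, and the glyphs only display if an installed font covers that block.

```python
import unicodedata

# U+1D360..U+1D368 are the Counting Rod unit digits one to nine.
for cp in range(0x1D360, 0x1D369):
    ch = chr(cp)
    print(f"U+{cp:05X}", ch, unicodedata.name(ch), unicodedata.numeric(ch))

def rod_digit_value(ch):
    """Return the integer value of a single Counting Rod unit digit."""
    return int(unicodedata.numeric(ch))

print(rod_digit_value(chr(0x1D362)))  # the rod digit for three
```

Numerals that live only in a font's Private Use Area, as mentioned for Mayan, have no such entries in the database, so a lookup table would have to be supplied by hand.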
Yes, it's a long discussion (UTF8 vs UTF32 etc) but we're talking about time slices and repeated revisits in a spiraling trajectory (per Saxon treatments).
So even if the bulk of your coding is in some Thai character set, you're quite familiar with the lower 128 in the Unicode codespace. Python itself has 33 keywords and a large number of builtins, such that "average Python" might look like Romaji-intensive Japanese, i.e. "heavy on the Latin-1 pepper, other spices" (yet lots of room for top-level class, function, variable names, libraries stuffed with them, all outside Latin-1).
These concerns have been a long-term focus, and continue to be, as Python students I encounter may be there for work and that may mean using non-Latin-1 Python namespaces much of the time.
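
The "Latin-1 pepper" can be checked directly from the interpreter. The sketch below counts the keywords and public builtins and confirms they all sit in the lower 128 code points. The helper name `is_ascii` and the Cyrillic sample name are my own illustrative choices, and the keyword count printed depends on the Python version you run (33 was the figure at the time of this thread).

```python
import builtins
import keyword

def is_ascii(name):
    """True if every character of the name is in the lower 128 code points."""
    return all(ord(c) < 128 for c in name)

keywords = keyword.kwlist
public_builtins = [n for n in dir(builtins) if not n.startswith("_")]

print(len(keywords), "keywords in this interpreter")
print(len(public_builtins), "public builtin names")
print(all(is_ascii(k) for k in keywords))          # keywords are all ASCII
print(all(is_ascii(b) for b in public_builtins))   # so are the builtins

# User-chosen names are free to leave ASCII behind.
print("переменная".isidentifier(), is_ascii("переменная"))
```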
Kirby
_______________________________________________
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig
--
Edward Mokurai (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) Cherlin
Silent Thunder is my name, and Children are my nation.
The Cosmos is my dwelling place, the Truth my destination.
http://wiki.sugarlabs.org/go/Replacing_Textbooks

http://wiki.sugarlabs.org/go/Activities/TurtleArt/Tutorials/Numerals
I have several of these tutorials in a reasonably finished state, and many more written but not illustrated, with more outlined. Cardinals and mappings are in the Counting tutorial, but I have not tackled ordinals yet.
Fascinating stuff sir, lots I didn't know.

Python gives the backslash ( \ ) syntactic meaning as well: it is the line continuation character. So even in Python in Japanese, you would need \ to stay the same.

One doesn't find a lot of examples, at least not easily, of non-Latin-1 Python programs (source code). I'm on the lookout for exhibits, collecting URIs.

Kirby
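
By way of a tiny exhibit, here is a sketch of a Python 3 program whose identifiers are Japanese while the keywords, the builtins, and the backslash stay exactly as they are in any other Python source. The names are hypothetical, picked only for illustration, and the line continuation is there to show the backslash at work rather than because the statement needs it.

```python
# Identifiers are hypothetical: 合計 = "total", 数値 = "numbers",
# 結果 = "result", 数 = "number".

def 合計(数値):
    """Add up a sequence of numbers."""
    結果 = 0
    for 数 in 数値:
        結果 = 結果 + \
            数
    return 結果

print(合計([1, 2, 3]))  # prints 6
```

The backslash has to be the last character on its line, so no comment can follow it. The file should be saved as UTF-8, which is the default source encoding in Python 3.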