[Edu-sig] Musings on PEP8
mokurai at earthtreasury.org
Tue Jul 19 08:25:57 CEST 2011
On Sun, July 17, 2011 1:46 pm, kirby urner wrote:
> I find myself thinking about PEP8 a lot, not that I have it memorized.
> Now that Unicode reigns at the top-level, we've got an influx of
> Chinese namespaces, Hindi namespaces, Cyrillic namespaces...
> a nice long list, and the PEP8 conventions regarding capitalization,
> while sensible in Latin-1, might not cover the new cases (I say
> "might not" with some sarcasm, or an innocent stare (playing
> it straight)).
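A minimal sketch of where the capitalization conventions stop applying, assuming Python 3. The sample characters are illustrative picks, not taken from the post above. It simply asks whether a character changes under upper() or lower(), which is the distinction CapWords and lowercase_with_underscores depend on.

```python
# Which scripts have the upper/lower distinction that PEP8 naming leans on?
# The sample characters are illustrative choices, one per script.
samples = {
    "Latin": "a",
    "Cyrillic": "я",
    "Chinese": "中",
    "Hindi (Devanagari)": "न",
}


def has_case(ch):
    """Return True if the character changes under upper() or lower()."""
    return ch.upper() != ch or ch.lower() != ch


case_report = {script: has_case(ch) for script, ch in samples.items()}

for script, cased in case_report.items():
    print(f"{script:20} cased={cased}")
```

Latin and Cyrillic report as cased; the Chinese and Devanagari samples do not, so a rule such as "class names use CapWords" has nothing to act on there.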
I got involved in the original RFC for Unicode URLs and more general URIs
when the discussion was mainly, "We have to! So many people desperately
need it!" "We can't, it'll break the Web!" We got it worked out. I am
pleased to see Unicode country name TLDs proliferating, too.
I was one of those who needed it. I was doing silly things like editing an
APL magazine and converting a Chinese-Korean-Japanese-European Go glossary
from ASCII to Unicode.
> I've seen arguments in diversity-minded circles that straying
> from Latin-1 top-level will obliterate the open source nature of
> open source, with many a Chinese engineer welcoming the
> advantages of a world around simple base cases, the old
> ASCII, a mother tongue of computer scientists (more so than
> EBCDIC even (sarcasm again)).
I remember a long time ago reading about Russian COBOL, combining Latin
and Cyrillic in EBCDIC. Fun times.
I have had the personal misfortune of attempting to explain to a Japanese
programmer why it was the Japanese character set definition that replaced
backslash with the Yen sign that was broken, not Unicode. They couldn't
just do a search and replace in Windows source, because that code point
usually meant Yen inside a text string but meant the Windows directory
separator in path expressions. So they blamed Western cultural
imperialists and didn't listen when we explained how many Japanese experts
were involved in the Japanese character set mappings.
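A rough illustration of why the blanket search and replace fails, assuming Python 3. The two strings are hypothetical stand-ins for legacy data in which one code served both purposes; they are not from any real code base.

```python
import unicodedata

BACKSLASH = "\\"    # U+005C
YEN = "\u00a5"      # U+00A5

# Unicode keeps the two characters apart, each with its own name.
backslash_name = unicodedata.name(BACKSLASH)
yen_name = unicodedata.name(YEN)

# Hypothetical legacy strings: the same code is a currency mark in one
# and a directory separator in the other.
price_label = "Price: \\500"
path = "C:\\Users\\taro"

# A naive global replacement fixes the label but destroys the path.
naive_label = price_label.replace(BACKSLASH, YEN)
naive_path = path.replace(BACKSLASH, YEN)

print(backslash_name, "/", yen_name)
print(naive_label)
print(naive_path)
```

The label comes out right and the path no longer contains a separator, so each occurrence has to be judged by context rather than replaced wholesale.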
> The inter-readability of Latin-1 means lots of headaches removed,
> like at least *something* positive came out of that Roman period
> (as a child of Rome, I get to sound chiding).
> The flip side argument, which I find more persuasive, is that
> one of the biggest barriers to diversity is over-reliance on Latin-1,
> and "just ASCII" in particular.
The char = 8-bit or even 7-bit byte delusion, with hundreds of
incompatible 8-bit character sets and fonts. I have had a few things to
say about that.
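To make the incompatibility concrete, here is a small sketch assuming Python 3 and its bundled codecs. The byte value and the four code pages are arbitrary picks for illustration.

```python
# One byte, several incompatible 8-bit readings.
raw = b"\xe9"

# Arbitrary selection of legacy single-byte code pages shipped with Python.
codecs_to_try = ["latin-1", "cp1251", "koi8_r", "cp437"]

decoded = {name: raw.decode(name) for name in codecs_to_try}

for name, ch in decoded.items():
    print(f"{name:8} -> {ch!r} (U+{ord(ch):04X})")

# In UTF-8 the Latin-1 reading of that byte needs two bytes, so "one
# character, one byte" fails in this direction as well.
utf8_bytes = decoded["latin-1"].encode("utf-8")
print(utf8_bytes)
```

Without out-of-band knowledge of the character set, the byte alone does not say which letter was meant.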
> I heard those cheers in Vilnius, when Guido talked about the brave
> new Unicode world. Google's blogger interface switches to
> Lithuanian automatically when that's your timezone, or however
> it's figured. Lots of alphabetical markings you might not find in
> The whole point of Unicode was to open up source code writing,
> as an occupation, to more than just Euro-English speakers.
Much more than source code. Are you becoming the man who pounds nails all
day, to whom everything looks like a hammer? ^_^
> The bridge has been built and Python has already crossed
> over it.
Which means that various parts of Sugar, including Turtle Art, have also.
I was at the Unicode Conference where Jim Brown of IBM told that part of
the world that APL2 was fully Unicode-capable for data and identifiers.
There was also a proposal to put Unicode math into programming languages.
> None of which is to say that knowledge of Latin-1 is dispensable.
One of the points I had to make in that IETF discussion was that every
Japanese schoolchild was already learning to keyboard in romaji in
addition to kana, with kana-to-kanji conversion.
> My first chapters in Naming and Ordering per MathFuture threads
> (also Cardinality vs Ordinality) starts with "mappings" (the usual
> approach to functions per Dolciani) with familiar glyphs (we're
> learning them anyway in learning to read a native language),
> pairing with ASCII and Unicode bytecodes.
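The glyph-to-number mapping described above can be sketched with Python 3's built-in ord() and chr(). The glyph list is my own selection, and the numbers are Unicode code points rather than bytes.

```python
# Glyphs paired with their Unicode code points, and back again.
glyphs = ["A", "z", "é", "Я", "中"]

to_code = {g: ord(g) for g in glyphs}               # glyph -> code point
from_code = {n: chr(n) for n in to_code.values()}   # code point -> glyph

# The two mappings are inverses of each other on this set.
round_trip_ok = all(from_code[to_code[g]] == g for g in glyphs)

for g in glyphs:
    n = to_code[g]
    where = "ASCII" if n < 128 else "beyond ASCII"
    print(f"{g}  {n:6d}  U+{n:04X}  {where}")
```

Code points below 128 coincide with ASCII, which is why the familiar Latin glyphs serve as the base case.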
I have just been working on a tutorial, not yet completed, on ancient
visual numerals now available in Unicode, including Egyptian Hieroglyphic
heqat measure symbols, which I am putting on Turtle Art variable name
tiles. (For our Egyptophiles, Egyptian fraction analysis was done with the
heqat as the unit.) Next I plan to teach the turtle how to write heqat,
cuneiform, and Counting Rod numerals, and then base-20 Mayan, which
unfortunately is not in Unicode yet. There are fonts using the Private Use
Area, however. I am also considering making a bunch of pie-slice fraction
numerals for my planned tutorial on fractions using Turtle Art.
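For anyone who wants to poke at these characters from Python 3, a hedged sketch using the standard unicodedata module follows. The three code points are my own picks (one Counting Rod numeral, one cuneiform numeric sign, one Private Use Area position), and what is reported depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Illustrative code points only; not a list of the symbols in the tutorial.
candidates = {
    "counting rod": "\U0001D360",
    "cuneiform": "\U00012400",
    "private use": "\uE000",
}


def describe(ch):
    """Collect what the Unicode database knows about one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<no name>"),
        "category": unicodedata.category(ch),
        "numeric": unicodedata.numeric(ch, None),
    }


info = {label: describe(ch) for label, ch in candidates.items()}

for label, facts in info.items():
    print(label, facts)
```

Private Use Area characters have no name or numeric value in the database, so a font that puts numerals there has to supply its own table of values.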
I have several of these tutorials in a reasonably finished state, and many
more written but not illustrated, with more outlined. Cardinals and
mappings are in the Counting tutorial, but I have not tackled ordinals.
> Yes, it's a long discussion (UTF8 vs UTF32 etc) but we're
> talking about time slices and repeated revisits in a spiraling
> trajectory (per Saxon treatments).
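The UTF-8 versus UTF-32 trade-off can be shown in a few lines, assuming Python 3. The characters are arbitrary samples; "utf-32-le" is used so that no byte order mark is counted.

```python
# Bytes needed per character: UTF-8 is variable width, UTF-32 is fixed.
chars = ["A", "é", "ก", "中", "\U0001D360"]

sizes = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-32-le")))
    for ch in chars
}

for ch, (utf8_len, utf32_len) in sizes.items():
    print(f"U+{ord(ch):04X}  utf-8: {utf8_len}  utf-32: {utf32_len}")
```

ASCII text costs one byte per character in UTF-8, while UTF-32 spends four bytes on every character in exchange for fixed-width indexing.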
> So even if the bulk of your coding is in some Thai characterset,
> you're quite familiar with the lower 128 in the Unicode codespace.
> Python itself has 33 keywords and a large number of builtins,
> such that "average Python" might look like Romaji-intensive
> Japanese, i.e. "heavy on the Latin-1 pepper, other spices" (yet
> lots of room for top-level class, function, variable names, libraries
> stuffed with them, all outside Latin-1).
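A small sketch of that mix, assuming Python 3.7 or later (for str.isascii). The function and variable names are made-up examples, and the number of keywords varies between Python versions, so it is printed rather than hard-coded.

```python
import keyword

# The reserved words stay inside ASCII whatever script the names use.
keyword_count = len(keyword.kwlist)
ascii_only_keywords = all(kw.isascii() for kw in keyword.kwlist)


def площадь(ширина, высота):
    """Area of a rectangle; the names are Russian for area, width, height."""
    return ширина * высота


результат = площадь(3, 4)

# Names in other scripts are accepted as identifiers too.
names_are_identifiers = all(
    name.isidentifier() for name in ["площадь", "面积", "नाम"]
)

print(keyword_count, ascii_only_keywords, результат, names_are_identifiers)
```

The def and return stay Latin while everything the programmer names can come from another script.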
> These concerns have been a long term focus, and continue
> to be, as Python students I encounter may be there for work
> and that may mean using non-Latin-1 Python namespaces
> much of the time.
Silent Thunder is my name, and Children are my nation.
The Cosmos is my dwelling place, the Truth my destination.