[Edu-sig] Musings on PEP8

Mon Jul 18 20:29:36 CEST 2011

Hi Vernon,

... not to be confused with Vern "the Watcher" Ceder.

On Mon, Jul 18, 2011 at 8:47 AM, Vernon Cole <vernondcole at gmail.com> wrote:

> There is a very good reason for this:  standard library code must be
> readable for people all over the world.  That's why a Dutch software
> engineer wrote a language in which all the keywords and commentary are in
> English.
>
>
Yes, the Standard Library is to be Anglicized for some time to come,
maybe always, per Guido's talks.

Of course there's nothing to stop someone from writing a translator
for the Standard Library, such that the source originals (as modified)
might be rendered in myriad other character sets.

Top-level names tend to be amenable to such treatment.

This may be done down to the C family level, though I'm not suggesting
that it should be (nor are all Python implementations C family, I hasten
to add; Jython is "C family" only if the Java VM is).

The same is not true for 3rd party modules which, as you say,
may be written in any style.

Learning the Latin (English) alphabet, building a vocabulary, remains
a good idea obviously, along with ASCII in the context of Unicode.

I expect those focused in computer science will continue giving
themselves the benefit of this learning.

I received Romanized Indonesian source code for quite a while, until
the student moved to Japan and apparently stopped doing Python.

I'm impressed with all the alphabets you know.

3rd party modules written in Cyrillic with the peppering of
Roman we know must be there, thanks to Standard Library
(untranslated) and the 33 keywords (so far), could be used
in computer science to help English speakers learn a
Cyrillic language.

http://en.wikipedia.org/wiki/Languages_written_in_a_Cyrillic-derived_alphabet
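
A minimal sketch of what that "peppering" looks like, assuming Python 3
(which accepts non-ASCII identifiers) and 3.7 or later for str.isascii.
The function and parameter names are hypothetical, made up for this
illustration; only def, return, math.pi and the keyword module come
from Python itself:

```python
import keyword
import math


# Hypothetical exercise: the top-level names are Cyrillic, while the
# keywords (def, return) and the Standard Library name (math.pi) stay Roman.
def площадь_круга(радиус):
    """Return the area of a circle with the given radius."""
    return math.pi * радиус ** 2


# The Cyrillic name is a legal Python 3 identifier.
name_is_legal = "площадь_круга".isidentifier()

# Every keyword is plain ASCII, whatever identifiers surround it.
keywords_are_ascii = all(word.isascii() for word in keyword.kwlist)

print(площадь_круга(2.0))
print(name_is_legal, keywords_are_ascii)
```

The count of keywords varies between Python versions, so the sketch
checks only that they are ASCII rather than how many there are.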

> >
> > The flip side argument, which I find more persuasive, is that
> > one of the biggest barriers to diversity is over-reliance on Latin-1,
> > and "just ASCII" in particular.
> >
> > The whole point of Unicode was to open up source code writing,
> > as an occupation, to more than just Euro-English speakers.
>
> I disagree.  The whole point of Unicode is to open up application writing,
> so that _users_ can see computer output in their own languages.  A person
> who wishes to pursue code writing as an occupation must understand and use
> English -- or be relegated to producing work only for his own culture.  In
> the modern "flat" world, English is the language of commerce and computer
> programming.  Not being able to write understandable English is a severe
> handicap. My programs are written in Python, documented in English, and
> usable by persons of another language.  For example, see CaesarCalc.py from
> https://launchpad.net/romanclass , which assumes the user to be able to
> understand pidgin Latin. Even then, I give the result of (XVI - XVI) as
> "Nulla" because I expect that most users will not recognize "Nvlla" as
> meaning "nothing."
>
>
Certainly the GUI needs to be intelligible yes.

Let's just say there's a school of thought that has
no problem with a math, logic or grammar teacher
using only Chinese characters for top-level names
in various exercises using Python or another
Unicode-aware computer language.  And no
problem with another teacher using only Hebrew
characters for top-level names and so on.

This school of thought hangs out on the Python
Diversity list and self-organizes there.  If you go
back in the archives, you'll find me and a
guy named Carl doing stuff in the Python wiki
to expand the language base, including at the
source code level.  With Pycon / Tehran in the
planning, we want to be in a better position to
address issues relating GeoDjango to Farsi, say.

These exercises (mentioned above) may have
nothing to do with writing commercial applications.
These may not be programmers in training
(though some may be in commercial media,
where "programming" also has meaning (e.g.
in radio / TV)).  Instead of using a calculator
or abacus to learn numeracy skills, people
have laptops and internet access.

Having readable source code in languages
that aren't in a Roman alphabet is already
a spreading phenomenon, with many writers
happily giving up that so-called "world readability"
in favor of remaining intelligible to the girl or boy
next door.

The syntax of URIs and domain names has
already taken this turn.  You will see http://<Arabic letters>/
quite frequently these days, thanks to
Unicode support in domain names and URIs (which Python now needs
to deal with, and does, as an http-aware language).
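
As a rough illustration of how Python copes with such names, here is a
sketch using the built-in "idna" codec, which implements the older
IDNA 2003 rules.  The helper names to_ascii and to_unicode are
hypothetical, and the hostnames are sample values only:

```python
def to_ascii(hostname):
    """Encode a Unicode hostname to its ASCII-compatible (Punycode) form."""
    return hostname.encode("idna")


def to_unicode(ascii_hostname):
    """Decode an ASCII-compatible hostname back to Unicode."""
    return ascii_hostname.decode("idna")


# A well-known Latin-script example with one non-ASCII letter.
print(to_ascii("bücher.example"))  # b'xn--bcher-kva.example'

# An Arabic label: the wire form is ASCII, and it round-trips to the original.
arabic = "مصر"
encoded = to_ascii(arabic)
print(encoded)
print(to_unicode(encoded) == arabic)
```

The point is only that the name a reader sees and the name sent over
the wire differ; the codec translates between the two.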

CSS for Arabic is the kind of style concern for
which we may have insufficient literature to date.
We may have people joining Diversity who want to
develop that literature (recruiting happening).

http://www.guardian.co.uk/technology/2010/may/06/arabic-web-addresses-internet

> Here is sample output.  Notice that, when it blows up, the traceback is in
> Python with English explanations:
> <console dump>
> procer numerus hic:III - II
> I
> procer numerus hic:3 - 2
> I
> procer numerus hic:3 - 3
> Nulla
> procer numerus hic:2 - 3
> Traceback (most recent call last):
>   File "CaesarCalc.py", line 40, in <module>
>     print (cvt(subtrahends[0]) - cvt(subtrahends[1]))
>   File "/home/vernon/romanclass-1.0.1/romanclass.py", line 99, in __sub__
>     return Roman(self.__int__() - other)
>   File "/home/vernon/romanclass-1.0.1/romanclass.py", line 85, in __new__
>     raise OutOfRangeError, 'Cannot store "%s" as Roman' % repr(N)
> romanclass.OutOfRangeError: Cannot store "-1" as Roman
> </console dump>
>
> IMHO, on the whole, PEP 8 is a pretty good document.
> --
> Vernon
>
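
For readers who want to see how output like the console dump above can
arise, here is a minimal Python 3 sketch.  It is not the romanclass
source: only the names Roman and OutOfRangeError, the "Nulla" spelling
of zero, and the error message come from the dump; everything else
(the numeral table, the method bodies) is an assumption.

```python
class OutOfRangeError(ValueError):
    """Raised when a value cannot be written as a Roman numeral."""


# Assumed conversion table, largest value first.
_NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


class Roman(int):
    """An int that prints as a Roman numeral (sketch only)."""

    def __new__(cls, n):
        n = int(n)
        if n < 0:
            # Roman numerals have no negative values.
            raise OutOfRangeError('Cannot store "%s" as Roman' % n)
        return super().__new__(cls, n)

    def __sub__(self, other):
        # Re-wrap the result so that a negative difference is rejected.
        return Roman(int(self) - int(other))

    def __str__(self):
        if self == 0:
            return "Nulla"  # zero is spelled out as a word
        remaining = int(self)
        parts = []
        for value, symbol in _NUMERALS:
            count, remaining = divmod(remaining, value)
            parts.append(symbol * count)
        return "".join(parts)


print(Roman(3) - Roman(2))    # I
print(Roman(16) - Roman(16))  # Nulla
```

Evaluating Roman(2) - Roman(3) in this sketch raises OutOfRangeError
with an English message, matching the point made above about the
traceback.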

I'm not denigrating PEP8 in any way, even though
I used some light sarcasm in my post.  That was
not directed against PEP8, so much as against
the idea that the "rule book" is somehow complete,
just because we have it down that functions should
generally not start with a capital letter, and
l (lowercase L) is a terrible name for all purposes
because it's so indistinguishable from uppercase
I and the number 1 in many fonts.
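
Those two rules are simple enough to check mechanically.  A rough
sketch using the standard ast module follows; the function name
naming_warnings and the sample source are hypothetical, and the check
covers only these two PEP 8 points (including PEP 8's companion
warning about O and I as single-character names), nothing more:

```python
import ast

# Single-character names that PEP 8 warns against.
AMBIGUOUS = {"l", "O", "I"}


def naming_warnings(source):
    """Return sorted (line, message) pairs for two PEP 8 naming rules."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name[:1].isupper():
            warnings.append(
                (node.lineno, "function %r starts with a capital" % node.name))
        elif (isinstance(node, ast.Name)
              and isinstance(node.ctx, ast.Store)
              and node.id in AMBIGUOUS):
            warnings.append((node.lineno, "ambiguous name %r" % node.id))
    return sorted(warnings)


sample = (
    "def Total(xs):\n"
    "    l = sum(xs)\n"
    "    return l\n"
)

for line, message in naming_warnings(sample):
    print(line, message)
```

A checker like this says nothing about which alphabet the names are
in, which is where a larger Book of Styles would have to begin.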

I think as people start getting a lot more experience
writing Python with different namespaces, with
non-Roman top-level names etc., that the rule
book is inevitably going to expand and that a
Book of Styles could conceivably become enormous.

But then think of English:  we acknowledge many
styles as being appropriate and don't have just
the one "book" where style is concerned (we have
so many) -- not like the dictionary, with a goal of
including every word in a finite and deliberately
exclusive set of standard words.

I have some examples of Python source in my
blogs, using kanji as top-level names (might be
a Japanese program, as one of the kanji is for
Mt. Fuji as I recall).

Then there's some tracking down of Stallman on
a visit to Sri Lanka (a while back) and chatter
about Python in Tamil and Sinhalese.  And yes,
I am aware English is spoken in these parts as well,
as evidenced by Arthur C. Clarke's having lived
there for so long.  One of our CSN chiefs has a
track record there too, another English speaker.

http://www.sarvodaya.org/2005/05/17/suzanne-bader%E2%80%99s-sri-lanka-visit-report
http://controlroom.blogspot.com/2009/01/at-work.html
http://risenfall.wordpress.com/2008/01/14/richard-stallman-rms-is-in-sri-lanka/

Kirby