[Python-3000] Support for PEP 3131

Sun May 13 17:22:03 CEST 2007

On 5/12/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
[snip]
> In this respect, I strongly believe that supporting non-ASCII
> identifiers as proposed by PEP 3131 would improve a number of things:
> - discussion and uptake of python in "non-ascii" countries
> - ability for children to learn programming in their own language (I
> started programming at 7 years old and would have been very disturbed
> if I could not use my own language to type in programs)
> - increase of the number of new "interesting" packages from non-ascii countries
> - ability for local programmers and local companies to provide
> "bridges" between international (english) APIs and local APIs.
> - Increase the number of python users (from 7 to 77 years old)

Says you. So far, all I've seen from PEP 3131's supporters is a lot of
hollow assertions and idle theorizing: "Python will be easier to use
for people using non-ASCII character sets", "Python will be easier to
learn for those raised with non-Roman-influenced languages", etc, etc.
Until I see some kind of evidence, something to back up these claims,
I'm going to assume you're wrong.

Have there been studies on this kind of thing? Has there been any
research into whether a mixture of English keywords and, say, Japanese
and English identifiers makes a given programming language easier to
learn and use? If so, why aren't they referenced in the PEP or linked
in any emails? Given the lack of evidence presented so far, my
operating assumption is that the PEP's supporters -- including you --
are making things up to support a conclusion that they might wish to
be true.

> In my humble opinion, now that UTF8 is accepted as the standard source
> code encoding, it is very difficult to understand why we should start
> putting restrictions on the kind of identifiers that are used (which
> would force people to comment line by line as they do now!).
>
> When I am programming in Python, I am VERY DISTURBED when the code I
> write contains many comments. It needs to be readable just by glancing
> at it.
>
> However, for most of the people who are core python developers, you
> should ask what the typical reading speed for "ascii" characters is
> for, e.g., a standard Japanese pupil. You would be very surprised how
> slow that is. In my opinion (after living in Japan for quite a bit),
> people are very slow to read ASCII characters and this definitely
> restrains their programming productivity and expressiveness.

See, that's the thing I have yet to see addressed: there's been a lot
of stress on "being able to write variable/class/method names in
Arabic/Mandarin/Hindi will make it easier for native speakers to
understand", but as far as I know, no-one has yet addressed how these
non-English identifiers will mesh with the existing English keywords
and English standard library functions. You say that being able to
write identifiers in Cyrillic will make Python easier for Russian
natives to read, to make Python code, as you say, "readable just by
glancing at it". But the fact is any native-language identifiers will
be surrounded by a sea of English: keywords, the standard library,
almost all open-source packages, etc. How does that impact your
readability guesses?
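
To make the question concrete, here is a minimal sketch of what such
code might look like, assuming an interpreter that accepts non-ASCII
identifiers as PEP 3131 proposes. The Cyrillic names are invented for
illustration and are not taken from the PEP or from any real package;
everything else ("def", "for", "in", "return", "split", "get") stays
English.

```python
# Hypothetical example: Cyrillic identifiers amid English keywords and
# English built-in method names. Assumes PEP 3131-style identifiers.

def посчитать_слова(текст):
    """Count how often each whitespace-separated word occurs."""
    частоты = {}
    for слово in текст.split():
        # dict.get and str.split remain English regardless of the names
        частоты[слово] = частоты.get(слово, 0) + 1
    return частоты


результат = посчитать_слова("да нет да")
```

Whether a reader finds this easier or harder than an all-ASCII version
is exactly the open question.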

Also, method/function names are traditionally expressed in English as
verb phrases (e.g., "isElementVisible()") which dovetail nicely with
Anglo-centric keywords like "if" and "for ... in ...". How do
identifiers in languages with dramatically different grammars like
Japanese -- or worse, different reading orders like Farsi and Hebrew
-- interact with "if", "while" and the new "x if y else z" expression,
which are deeply rooted in English grammar? My suspicion is, at least
for right-to-left languages like Arabic, not well, if at all.
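
As a rough illustration of the grammar point, here is a small sketch
with Japanese identifiers, again assuming PEP 3131-style identifiers.
The names are made up for this example (the glosses in the comments
are approximate). The keyword order of the conditional expression is
fixed by English, whatever language the identifiers are in.

```python
# Hypothetical Japanese identifiers; the glosses are approximate.
要素は見える = True    # "element is visible"
表示 = "show"          # "display"
非表示 = "hide"        # "hidden"

# The "x if y else z" word order stays English no matter what language
# the surrounding identifiers come from.
動作 = 表示 if 要素は見える else 非表示
```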

Lastly, I take issue with one of the PEP's guidelines under the
"Policy Specification" section: "All identifiers in the Python
standard library...SHOULD use English words wherever feasible"
(emphasis in the original). Are we now going to admit the possibility
that some parts of the standard library will be written in English,
some in Spanish, and this one module over there in Czech? Absolutely
ludicrous.

Come-on-tell-us-how-you-really-feel-ly,
Collin Winter