PEP 3131: Supporting Non-ASCII Identifiers
Josiah Carlson
josiah.carlson at sbcglobal.net
Sun May 13 15:58:27 EDT 2007
Stefan Behnel wrote:
> Anton Vredegoor wrote:
>>> In summary, this PEP proposes to allow non-ASCII letters as
>>> identifiers in Python. If the PEP is accepted, the following
>>> identifiers would also become valid as class, function, or
>>> variable names: Löffelstiel, changé, ошибка, or 売り場
>>> (hoping that the latter one means "counter").
>> I am against this PEP for the following reasons:
>>
>> It will split up the Python user community into different language or
>> interest groups without having any benefit as to making the language
>> more expressive in an algorithmic way.
>
> We must distinguish between "identifiers named in a non-English language" and
> "identifiers written with non-ASCII characters".
[snip]
> I do not think non-ASCII characters make this 'problem' any worse. So I must
> ask people to restrict their comments to the actual problem that this PEP is
> trying to solve.
Really? Because when I am reading source code, even if a particular
variable *name* is a sequence of characters that I cannot identify as a
word that I know, I can at least spell it out using Latin characters, or
perhaps even attempt to pronounce it (verbalization of a word, even if
it is an incorrect verbalization, I find helps me to remember a variable
and use it later).
On the other hand, the introduction of some 60k+ valid Unicode glyphs
into the set of characters that can be seen as a name in Python would
make any such attempts by anyone who is not a native speaker (and even
native speakers in the case of the more obscure Kanji glyphs) an
exercise in futility.
As it stands, people who use Python (and the vast majority of other
programming languages) learn the 52 upper/lowercase variants of the
Latin alphabet (and, in some parts of the world, the digits 0-9 as
well). That's it. 62 glyphs at the worst. But a huge portion
of these people have already been exposed to these characters through
school, the internet, etc., and this isn't likely to change (regardless
of the 'impending' Chinese population dominance on the internet).
Indeed, the lack of the 60k+ glyphs as valid name characters can make
the teaching of Python to groups of people that haven't been exposed to
the Latin alphabet more difficult, but those people who are exposed to
programming are also typically exposed to the internet, on which Latin
alphabets dominate (never mind that HTML tags are Latin characters, as
are just about every daemon configuration file, etc.). Exposure to the
Latin alphabet isn't going to go away, and Python is very unlikely to be
the first exposure programmers have to the Latin alphabet (except for
OLPC, but this PEP is about a year late to the game to change that).
And even if Python *is* the first time children or adults are exposed to
the Latin alphabet, one would hope that 62 characters to learn to 'speak
the language of Python' is a small price to pay to use it.
Regarding different characters sharing the same glyphs, it is a problem.
Say that you are importing a module written by a mathematician that
uses an actual capital Greek alpha for a name. When a user sits down to
use it, they could certainly get NameErrors, AttributeErrors, etc., and
never understand why it is the case. Their fancy-schmancy Unicode-enabled
terminal will show them what looks like the Latin A, but it will
in fact be the Greek Α. Until they copy/paste, check its ord(), etc.,
they will be baffled. It isn't a problem now because A = Α is a syntax
error, but it can and will become a problem if it is allowed to.
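
To make the copy/paste-and-ord() check concrete, here is a minimal sketch.
It assumes an interpreter in which non-ASCII identifiers are accepted
(Python 3); the names LATIN_A, GREEK_ALPHA, describe and namespace are
made up for illustration, and the dict only stands in for a module's
namespace.

```python
import unicodedata

# Two characters that most fonts render identically.
LATIN_A = "A"            # U+0041
GREEK_ALPHA = "\u0391"   # U+0391

def describe(name):
    """Return (code point, Unicode name) for each character in `name`."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in name]

# Hypothetical stand-in for a module namespace that defines only the
# Greek-alpha name.
namespace = {GREEK_ALPHA: 1}

print(LATIN_A == GREEK_ALPHA)   # False: different code points
print(LATIN_A in namespace)     # False: the Latin spelling is not defined
print(describe(LATIN_A))
print(describe(GREEK_ALPHA))
```

Only the last two lines tell the reader which of the two look-alikes they
are actually dealing with; nothing on the screen does.
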
But this issue isn't limited to different characters sharing glyphs!
It's also about being able to type names to use them in your own code
(generally very difficult if not impossible for many non-Latin
characters), or even be able to display them. And no number of
guidelines, suggestions, etc., against distributing libraries with
non-Latin identifiers will stop it from happening, and it *will* fragment
the community as Anton (and others) have stated.
- Josiah