[Python-3000] Support for PEP 3131

Guillaume Proux gproux+py3000 at gmail.com
Mon May 14 03:09:29 CEST 2007


> > Interestingly, this is *not* a well known fact. I have asked 2
> > friend-of-mine seasoned Java programmers and they were *amazed* that
> > this is supported.
> Well, maybe we should add it to Python as a secret feature. :-) :-) :-)

But they also said that:
1) they wish they had known earlier...
2) they would start using this immediately for their own small projects

> >  see e.g. http://lists.xml.org/archives/xml-dev/200107/msg00254.html
> I imagine the situation there is sufficiently different though; XML is
> data, not code.

I wish you had enough time to read some of the posts linked from the
above URL. In particular, you can see the viewpoint of some Japanese
people on the ability for them to describe data structures (which is
really a programming concept) in their own words.

> I realize you've added a smiley, but please, don't propose new
> features for a release that's already been released. The release
> managers will put you in jail and not let you out until 4.0 has been
> released. :-)

eheheheh :)

> Because most people still use systems that have very inadequate tools
> for handling non-ASCII text, especially non-Latin-1 text. For example,
> at work I use Ubuntu, a modern Linux distribution actively supported
> by a company headquartered in South-Africa. Their main market lies
> outside Europe and North America. And yet, there is no standard way to
> enter non-ASCII characters as basic as c-cedilla or u-umlaut; the main

I also use Ubuntu at home.
Regarding your issue: hum? You can change the keyboard layout (I even
think it affects the current input system immediately). There are
also a number of tools like gucharmap
(http://gucharmap.sourceforge.net/shots/shot-003.png) that enable you
to copy and paste rare characters.

> tools I use (Emacs, Firefox and bash running in a terminal emulator)
> all have different input methods, different ideas of the default
> character encoding, and so on. It's a crapshoot whether
> copy-and-pasting even the simplest non-ASCII text (like the name of
> PEP 3131's author :-) between any two of these will work.

Ubuntu Feisty (and I think Edgy too) defaults to UTF-8 everywhere and I
have never had any issue using French, Japanese and English anywhere.
Windows reached this point of maturity about 5-6 years ago.

> I see program code as a tool for communication between people. Note
> how you & I are using English in this thread even though it is not the
> mother tongue for either of us. So we use English, since we can both
> read and write it reasonably well. This is the *only* way that
> programmers raised in different countries can exchange code at all.

I *totally* agree with you: you sometimes need to go down to the
lowest common denominator (with tongue in cheek)... But I still do not
understand why you are not happy to see people become more productive
with Python when there is no need for international exchange: the small
(or large) internal application, the throw-away script, the ability
to extend C programs with a scripting language that respects the
native language of the (mostly non-programmer) user, etc.

> gets 1000x better, but we're not there yet -- try translate.google.com
> if you don't believe me.)

I hope you get bonus points at work for mentioning this one. Believe
it or not, translate.google.com is my friend!

> You're stretching my words there. The issue of translation hadn't

Clearly you could not have thought of this issue, but I am not stretching
your words. I was just reusing some of the *strong* points you made about
why you thought Python was such a great invention of yours (and don't get
me wrong, we all love it!). I was just applying those great points to
this new issue, which I believe fully deserves more attention.

> crossed my mind when I wrote that (over 10 years ago) and the tools
> *really* weren't ready then. And regarding readability, if all the

The tools are ready now. We live in a mostly Unicode world now,
and we just agreed in another PEP that the default source encoding of
files will be UTF-8...

> programmers in the world agreed to use broken English, the readability
> of their code to each other would be much better dan als we allemaal
> in onze eigen taal schreven.

The funny thing is that I can read this sentence very well: my life
has been spent surrounded by Latin characters. I can probably even
understand it, as I can speak some German too.
allemaal -> Jedesmal -> always
onze -> eine -> its
eigen -> eigen -> own
taal -> Sprache -> language
schreven -> schreiben -> write

My cultural background can help me decipher VERY QUICKLY what you
wrote. But think of a 7-year-old Japanese child. They are not
really taught Latin characters before they seriously start learning
English... but that is the age at which I started programming (by copying
French program listings for Thomson TO7-70 computers... oh my


