Using non-ascii symbols
pwatson at redlinepy.com
Tue Jan 24 17:05:09 CET 2006
Christoph Zwerschke wrote:
> On the page http://wiki.python.org/moin/Python3%2e0Suggestions
> I noticed an interesting suggestion:
> "These operators ≤ ≥ ≠ should be added to the language having the
> following meaning:
> <= >= !=
> this should improve readability (and make language more accessible to
> This should be an evolution similar to the digraphs and trigraphs
> from the C and C++ languages."
> How do people on this group feel about this suggestion?
> The symbols above are not even latin-1, you need utf-8.
> (There are not many useful symbols in latin-1. Maybe one could use ×
> for cartesian products...)
> And while they are more readable, they are not easier to type (at
> least with most current editors).
> Is this idea absurd or will one day our children think that restricting
> to 7-bit ascii was absurd?
> Are there similar attempts in other languages? I can only think of APL,
> but that was a long time ago.
> Once you open your mind for using non-ascii symbols, I'm sure one can
> find a bunch of useful applications. Variable names could be allowed to
> be non-ascii, as in XML. Think class names in Arabian... Or you could
> use Greek letters if you run out of one-letter variable names, just as
> Mathematicians do. Would this be desirable or rather a horror scenario?
> -- Christoph
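[Editorial note: the Greek-letter idea is not purely hypothetical; Python 3 later adopted non-ASCII identifiers via PEP 3131, so a sketch like the following is legal there. The variable names are illustrative only.]

```python
# Non-ASCII identifiers, as later permitted by PEP 3131 (Python 3).
# Greek letters serve as ordinary variable names, just as in
# mathematical notation.
π = 3.14159      # an approximation of pi
Δx = 0.01        # a small step size

area_increment = π * Δx
print(area_increment)
```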
This will eventually happen in some form. The problem is that we are
still in the infancy of computing. We are using stones and chisels to
express logic. We are currently limited to text characters for
expressing intent. There will come a time when we are able to represent
a program in another form that is readily portable to many platforms.
In the meantime (probably 50 years or so), it would be advantageous to
use a universal character set for coding programs. To that end, the
input to the Python interpreter should be ISO-10646 or a subset such as
Unicode. If the # -*- coding: ? -*- line specifies something other than
ucs-4, then a preprocessor should convert it to ucs-4. When it is
desirable to avoid the overhead of the preprocessor, developers will
find a way to save source code in ucs-4 encoding.
The problem with using Unicode in utf-8 and utf-16 forms is that the
code will forever need to be written, and forever execute additional
processing, to handle the MBCS (Multi-Byte Character Set) and MSCS
(Multiple-Short Character Set) encodings.
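[Editorial note: a small sketch makes the trade-off concrete. In modern Python, "utf-32-be" is effectively the ucs-4 encoding discussed above: every character takes a fixed 4 bytes, whereas utf-8 is variable-width (1 to 4 bytes per character), which is the extra-processing cost the author refers to.]

```python
# Compare the byte cost of variable-width vs. fixed-width encodings
# for a string containing one of the proposed non-ASCII operators.
source = "x \u2264 y"   # "x ≤ y": five characters, one non-ASCII

utf8 = source.encode("utf-8")       # variable-width: '≤' needs 3 bytes
utf32 = source.encode("utf-32-be")  # fixed-width: 4 bytes per character

print(len(source))  # 5 characters
print(len(utf8))    # 7 bytes (1+1+3+1+1)
print(len(utf32))   # 20 bytes (5 * 4)
```

With the fixed-width form, finding the nth character is simple arithmetic; with utf-8 or utf-16, every indexing operation must scan or maintain auxiliary tables.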
Ok. Maybe computing is past infancy. But most development environments
are not much past toddler stage.