OT: Programmers whose first language is not English

Stephen Horne intentionally at blank.co.uk
Sun Mar 9 18:44:58 CET 2003

On Sat, 08 Mar 2003 17:25:08 GMT, "Stuart D. Gathman"
<stuart at bmsi.com> wrote:

>On Sat, 08 Mar 2003 06:21:36 -0500, Stephen Horne wrote:
>> One thing I'm considering is the use of a non-ASCII source code.
>Use unicode.  This works very well in Java - although it is "fun" to read
>Java code written in India with identifiers displayed in beautiful
>non-Latin characters - matching the identifiers feels like playing
>> In particular, I'm thinking of using XML - not as an AST representation,
>> but merely as a way of marking up source code. This would require
>> special editors, of course, but if WYSIWYG editors can be created for
>> HTML I don't see why programmers are still stuck in the plaintext age.
>Ack! Yuck!  Don't use XML!
>> One possible use of XML might be that 'keywords' and 'symbols' could be
>> stored as XML elements specifying non-language-specific tokens - the
>> editor could have a local language table to recognise keywords as the
>You don't need XML for this.

I think you may be missing the point.

People are not supposed to read or write the XML source directly. They
read and write source code in the programming language - which is
simply richer than plain text can be, because it has an underlying
XML-based representation which supports things other than a big
sequence of characters.
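
A minimal sketch of that idea, assuming a made-up markup in which a
hypothetical <kw id="..."/> element stands for a language-neutral
keyword and everything else is ordinary text. The element name, the
keyword ids and the locale tables are all invented for illustration;
they are not any real format or editor API. The editor only ever shows
the rendered text, never the XML:

```python
import xml.etree.ElementTree as ET

# Hypothetical stored form: <kw id="..."/> marks a language-neutral keyword.
SOURCE = '<line><kw id="IF"/> x &gt; 0 <kw id="THEN"/> y = 1</line>'

# Hypothetical per-locale keyword tables an editor might ship with.
KEYWORDS = {
    "en": {"IF": "if", "THEN": "then"},
    "de": {"IF": "wenn", "THEN": "dann"},
}

def render(xml_text, locale):
    """Return the text a programmer would see for the given locale."""
    table = KEYWORDS[locale]
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "kw":
            # Swap the neutral token for the local spelling of the keyword.
            parts.append(table[child.get("id")])
        # Text following the element is ordinary source text.
        parts.append(child.tail or "")
    return "".join(parts)

print(render(SOURCE, "en"))  # if x > 0 then y = 1
print(render(SOURCE, "de"))  # wenn x > 0 dann y = 1
```

The same stored line displays differently for each reader, while the
XML underneath stays identical.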

No-one can seriously claim that XML is a good language for humans to
read directly. Neither is a sequence of character codes in hex or
binary or whatever - thinking in terms of directly editing the XML is
like thinking in terms of using a hex editor to work on Python source
code, or directly understanding Word's binary format in order to write
a word processor document. The *possibility* of human readability in
XML is a nice feature - but only as a last resort.

In short, I'm simply saying that programmers' editors should support
functionality beyond simply mapping binary codes to ASCII (or, for that
matter, Unicode) characters - it's time we were not limited to

>The programs were stored and interpreted tokenized, so everything worked
>flawlessly as it always had.  But program listings were, "amusing", and
>you had to use the new keywords to enter or edit programs.

A good analogy - the XML is the equivalent of the *tokenised* form you
mentioned, which humans are not meant to directly read.
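
To make the analogy concrete, here is a rough sketch of a tokenised
store with per-language keyword tables. The token codes, the tables and
the function names are hypothetical, chosen only to illustrate the
point; a real tokenised BASIC or an XML-backed editor would differ in
detail. A program entered with one set of keywords is stored as tokens
and can be listed with another:

```python
# Hypothetical token codes and keyword tables, for illustration only.
PRINT, GOTO = 1, 2
TABLES = {
    "en": {"print": PRINT, "goto": GOTO},
    "fr": {"afficher": PRINT, "aller": GOTO},
}

def tokenise(text, locale):
    """Replace localised keywords with token codes; keep other words as-is."""
    table = TABLES[locale]
    return [table.get(word, word) for word in text.split()]

def listing(tokens, locale):
    """Render stored tokens using the keywords of the given locale."""
    names = {code: word for word, code in TABLES[locale].items()}
    # Integer entries are keyword tokens; strings are ordinary source text.
    return " ".join(names[t] if isinstance(t, int) else t for t in tokens)

stored = tokenise("print x", "en")
print(stored)                 # [1, 'x']
print(listing(stored, "fr"))  # afficher x
```

The stored form is what the interpreter works on; the listing is only a
view of it, which is why the program keeps running whichever keyword
table is used to display it.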
