An assessment of the Unicode standard

Terry Reedy tjreedy at udel.edu
Sun Aug 30 07:26:55 CEST 2009


r wrote:

> natural languages and Unicode. Which IMO * Unicode* is simply a monkey
> patch for this soup of multiple languages we have to deal with in
> programming and communication.

A somewhat fair characterization.

[snip]

> everyone happy? A sort of Utopian free-language-love-fest-kinda-
> thing?

Not utopian, but pragmatically political. Before Unicode, and still
today, we had and have multiple codes: multiple ASCII extensions for
European languages and even multiple codes just for Japanese. To get
people in the major computing countries, including Japan, to agree to
eventually replace their national codes with one worldwide code, some
kludgy compromises were made.
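
To illustrate the "multiple codes" problem, here is a minimal Python 3 sketch. The codec names are ones that ship with CPython; the sample byte and the sample text are my own picks for illustration, not anything from the thread.

```python
# One byte, read under three different pre-Unicode 8-bit codes.
raw = bytes([0xE9])
legacy_codecs = ("latin-1", "cp1251", "koi8-r")
decoded = {name: raw.decode(name) for name in legacy_codecs}

# One Japanese string, written out under three different national codes.
text = "日本語"
japanese_codecs = ("shift_jis", "euc_jp", "iso2022_jp")
encoded = {name: text.encode(name) for name in japanese_codecs}

# With one worldwide code, the text round-trips without guessing the codec.
utf8_bytes = text.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

for name, char in decoded.items():
    print(f"0xE9 as {name}: {char!r}")
for name, data in encoded.items():
    print(f"{text!r} as {name}: {data!r}")
```

The same byte comes out as a different character under each 8-bit codec, and the same text comes out as a different byte sequence under each Japanese codec, so a reader has to know out of band which code the writer used.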

> language. The A-Z char set is flawless!

Hardly. There are too few characters. A basic set should have at least
50. The International Phonetic Alphabet (IPA) has about 150. Here is a
true Utopian proposal for you (from a non-CS major ;-): develop an
extended IPA 256-character set with just a few control chars (rather
than 32) and punctuation and other markers. Then develop dictionaries to
translate texts in every language and char set into and back out of
this universal character set.

Fat chance of approval, even if technical issues were resolved.

> Some may say well how can we possibly force countries/people to speak/
> code in a uniform manner? Well that's simple, you just stop supporting
> their cryptic languages by dumping Unicode and returning to the
> beautiful ASCII

But most everyone outside the US was not using ASCII, precisely because
it did not support their language.
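
A quick way to see this in Python 3 is sketched below. The sample words and the helper name `fits_in_ascii` are hypothetical, chosen only for this illustration.

```python
# Sample words chosen for illustration only.
samples = {
    "English": "hello",
    "German": "Größe",
    "Russian": "привет",
    "Japanese": "日本語",
}

def fits_in_ascii(text):
    """Return True if every character of text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

results = {language: fits_in_ascii(word) for language, word in samples.items()}

for language, ok in results.items():
    print(f"{language}: {'fits in ASCII' if ok else 'needs more than ASCII'}")
```

Only the English sample survives the 7-bit code; the others raise `UnicodeEncodeError`.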

Get over the imperfections of Unicode. It improves on the prior status quo.

Terry Jan Reedy