An assessment of the Unicode standard

John Machin sjmachin at
Sun Aug 30 14:41:23 CEST 2009

On Aug 30, 4:47 pm, Dennis Lee Bieber <wlfr... at> wrote:
> On Sun, 30 Aug 2009 14:05:24 +1000, Anny Mous <b1540... at>
> declaimed the following in gmane.comp.python.general:
> > Have you thought about the difference between China, with one culture and
> > one spoken language for thousands of years, and Europe, with dozens of
>         China has one WRITTEN language -- It has multiple SPOKEN languages

... hence Chinese movies have subtitles in Chinese. And it can't
really be called one written language. For a start there are the
Traditional characters and the Simplified characters. Then there are
regional variations and add-ons, e.g. the Hong Kong Supplementary
Character Set (HKSCS, now added into Unicode). This is not
academic-only stuff: it includes surnames, the "Hang" in Hang Seng
Index and Hang Seng Bank, and the 5th character of the Chinese name of
The Hongkong and Shanghai Banking Corporation Limited on the banknotes
it issues.

> (the main two being mandarin and cantonese -- with enough differences
> between them that they might as well be spanish vs italian)

Mandarin and Cantonese are each groups of languages/dialects. Rough
speaker figures (millions): Mandarin 850, Wu 90, Min and Cantonese
about 70 each. The intelligibility comparison is more like Romanian vs
Portuguese, or Icelandic vs Dutch. I've heard that the PLA used
Shanghainese (Wu group) speakers as code talkers, just as the USMC
used Navajos.
