does/will python support unicode?

François Pinard pinard at iro.umontreal.ca
Wed Mar 8 14:20:34 EST 2000


"Shaun Hogan" <shogan at iel.ie> writes:

> The Unicode Standard is the universal character encoding standard used for
> representation of text for computer processing.

Universal is much more an intent than a fact.  Some part of the road has
been travelled already, granted, but there is still a long way to go.

> The design of Unicode is based on the simplicity and consistency of
> ASCII, but goes far beyond ASCII's limited ability to encode only the
> Latin alphabet.

This is also true for most existing character sets.
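
A small illustration of that point, as a sketch only: it assumes a current
Python 3 interpreter and its bundled codecs, and the three legacy charsets
and sample characters below are my own arbitrary picks, not anything taken
from the Unicode documentation.

```python
# Arbitrary sample (an assumption for illustration): each pair is a legacy
# codec name and one non-ASCII character that the charset covers.
SAMPLES = [
    ("latin-1", "\u00e9"),     # LATIN SMALL LETTER E WITH ACUTE
    ("iso-8859-2", "\u0142"),  # LATIN SMALL LETTER L WITH STROKE
    ("koi8-r", "\u0436"),      # CYRILLIC SMALL LETTER ZHE
]

ASCII_BYTES = bytes(range(128))
ASCII_TEXT = ASCII_BYTES.decode("ascii")


def extends_ascii(codec, extra_char):
    """True if `codec` keeps ASCII unchanged and also encodes `extra_char`
    as a single byte above the ASCII range."""
    same_ascii = ASCII_TEXT.encode(codec) == ASCII_BYTES
    encoded = extra_char.encode(codec)
    beyond_ascii = len(encoded) == 1 and encoded[0] >= 0x80
    return same_ascii and beyond_ascii


results = {codec: extends_ascii(codec, char) for codec, char in SAMPLES}
for codec, ok in results.items():
    print(codec, "extends ASCII:", ok)
```

Each of these charsets keeps the ASCII half intact and adds its own letters
in the upper half, which is all the quoted sentence really claims.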

> The Unicode Standard provides the capacity to encode all of the characters
> used for the written languages of the world.

This is more hype than reality.  Beware, beware.  You may of course conjugate
"capacity" in the future tense, and count on 2^31, but that only yields some
vapour talk.

> To keep character coding simple and efficient, the Unicode Standard assigns
> each character a unique 16-bit value, and does not use complex modes
> or escape codes.

Absolutely false.  Unicode lost its virginity for good, quite a while
ago.  Unicode is only simple for the simplest charsets it tries to be
compatible with.  (Even mere ASCII is causing some unexpected trouble...)
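
To make the objection concrete, here is a minimal sketch.  It assumes a
current Python 3 interpreter, where str holds code points and the standard
"utf-16-le" codec and unicodedata module are available; the helper name
utf16_code_units and the sample characters are mine, chosen only for
illustration.

```python
import unicodedata


def utf16_code_units(text):
    """Count the 16-bit code units needed to encode `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u00e9"         # LATIN SMALL LETTER E WITH ACUTE, inside the BMP
astral_char = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP
combined = "e\u0301"        # "e" followed by COMBINING ACUTE ACCENT

units_bmp = utf16_code_units(bmp_char)
units_astral = utf16_code_units(astral_char)

# The same visible letter can be one code point or two, depending on
# whether it is stored precomposed or as base letter plus combining mark.
composed = unicodedata.normalize("NFC", combined)

print("code units, BMP character:   ", units_bmp)
print("code units, astral character:", units_astral)
print("code points, decomposed form:", len(combined))
print("code points, composed form:  ", len(composed))
```

A character outside the 16-bit range needs a pair of code units, and an
accented letter may be spelled as one value or as two.  Neither fits the
"one character, one unique 16-bit value" picture.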


Do not read me as saying that Unicode is a bad thing.  Quite the contrary, it
is good news to me that more tools and systems move towards wider charsets.
However, I'm always surprised to see how highly (and often unrealistically)
people seem to praise Unicode.  We surely have to recognise good things,
and stay oriented towards progress.  We should _also_ keep our feet well
on the ground, and avoid losing our good judgement.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard





