Unicode Unification Objections
effbot at telia.com
Mon May 8 20:46:19 CEST 2000
Aahz Maruch <aahz at netcom.com> wrote:
> >the distinction cannot be preserved in a naked unicode character
> >stream, but it's done that way on purpose. you cannot really handle
> >text strings correctly (rendering, sorting, comparing, etc) unless you
> >have language and locale information.
> >this is as true for unicode as it is for latin 1 or any other
> >character set. after all, the "western culture" isn't really as homo-
> >geneous as you americans seem to think ;-)
> In other words, "someone" needs to devise a standardized system that
> encodes all the information needed to represent a string. To deal with
> the cases Dennis talks about, you need to concatenate multiple string
> objects into some larger buffer. Am I understanding you?
XML supports language markup (the xml:lang attribute).
language/locale information can also be carried in HTTP headers
(Content-Language), MIME headers, etc.
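
as a rough illustration of the markup approach, here is a minimal sketch
that reads xml:lang with Python's standard xml.etree.ElementTree. the
sample document and its element names (doc, p) are made up for this
example. ElementTree exposes the predefined "xml" prefix under its
namespace URI, which is why the attribute key looks the way it does.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is always bound to this namespace URI, so ElementTree
# reports xml:lang under the expanded name below.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample document; element names are for illustration only.
SAMPLE = '<doc><p xml:lang="sv">hej</p><p xml:lang="en">hello</p></doc>'

def texts_with_language(xml_source):
    """Return (language code, text) pairs for every <p> element."""
    root = ET.fromstring(xml_source)
    return [(p.get(XML_LANG), p.text) for p in root.iter("p")]

pairs = texts_with_language(SAMPLE)
for lang, text in pairs:
    print(lang, text)
```

the language travels next to the text rather than inside the character
stream, so a renderer or sorter can look it up when it needs it.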
in 31-bit unicode, there's also something called "plane 14 language tags"
which can (in theory, at least) be used to insert language codes in a
unicode stream.
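
for the curious, a small sketch of how plane 14 tagging works, assuming
the layout from the Unicode standard: U+E0001 (LANGUAGE TAG) opens a tag,
and the tag characters U+E0020..U+E007E mirror printable ASCII at an
offset of 0xE0000. the helper names below are made up for this example,
and this is not a full implementation of the tagging rules.

```python
LANGUAGE_TAG = "\U000E0001"   # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000            # tag characters mirror ASCII at this offset

def encode_language_tag(code):
    """Spell a language code such as 'sv' using plane 14 tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in code):
        raise ValueError("language code must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_BASE + ord(c)) for c in code)

def strip_tags(text):
    """Drop every character in the plane 14 tag block U+E0000..U+E007F."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)

tagged = encode_language_tag("sv") + "hej"
print([hex(ord(c)) for c in tagged])
print(strip_tags(tagged))
```

software that does not know about the tags can ignore them, and
stripping them gives back the plain text.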