[Mailman-i18n] "Funny" characters in real names?

Martin v. Löwis loewis@informatik.hu-berlin.de
06 Oct 2002 12:49:41 +0200


barry@python.org (Barry A. Warsaw) writes:

> Well, I spent a little time playing with un-define, and googling
> around, but wasn't able to come up with the magic incantations.  Maybe
> this XEmacs FAQ entry sheds light that we have a while to wait yet...

In GNU Emacs, the situation is slightly different: It recognizes a few
Unicode subsets, like mule-unicode-0100-24ff, mule-unicode-2500-33ff,
and mule-unicode-e000-ffff. Notice that this excludes the CJK
ideographs.

The real problem is the ISO-2022 inheritance of Mule: While a buffer
can represent multiple charsets, it can rarely equate characters
across charset borders. So if two characters originate from two
different charsets, they are considered different (unless there is a
conversion procedure for this specific pair of charsets).
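
To illustrate the point, here is a rough Python sketch. It is only a
model, not Mule's actual data structure: the TaggedChar class and its
to_unicode method are hypothetical names, and Python's codecs stand in
for the per-charset conversion procedures.

```python
from typing import NamedTuple


class TaggedChar(NamedTuple):
    """A character identified by its charset plus its code in that charset."""

    charset: str  # a Python codec name, standing in for a Mule charset
    code: int     # byte value within that charset

    def to_unicode(self) -> str:
        # The "conversion procedure": map into one common repertoire.
        return bytes([self.code]).decode(self.charset)


# a-umlaut, as found in ISO-8859-1 and in ISO-8859-2 (0xE4 in both).
a_latin1 = TaggedChar("iso8859_1", 0xE4)
a_latin2 = TaggedChar("iso8859_2", 0xE4)

# Compared as (charset, code) pairs, the two are different characters.
same_as_tagged = a_latin1 == a_latin2

# Only after conversion do they turn out to be the same character.
same_as_unicode = a_latin1.to_unicode() == a_latin2.to_unicode()
```

In this model, same_as_tagged is false and same_as_unicode is true:
without a conversion, nothing tells the program that the two
characters are the same.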

In CVS Emacs, the "Latin unification" is considered a big step
forward, i.e. the fact that Emacs can now treat the overlapping
characters from iso-8859-{1,2,15,16} as equivalent.
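
To get a feeling for which characters overlap, one can compute the
intersection of the four charsets with Python's codecs. This is just
a sketch of the idea and says nothing about how Emacs implements
unification; the helper name repertoire is made up for this example.

```python
CHARSETS = ["iso8859_1", "iso8859_2", "iso8859_15", "iso8859_16"]


def repertoire(codec):
    """Return the set of characters a codec assigns to bytes 0xA0..0xFF."""
    chars = set()
    for byte in range(0xA0, 0x100):
        # errors="ignore" yields an empty string for an unassigned byte.
        text = bytes([byte]).decode(codec, errors="ignore")
        if text:
            chars.add(text)
    return chars


# Characters present in all four charsets, i.e. the candidates for
# being treated as equivalent.
shared = set.intersection(*(repertoire(c) for c in CHARSETS))

print("".join(sorted(shared)))
```

A character like a-umlaut ends up in the shared set, whereas the euro
sign does not, since iso-8859-1 and iso-8859-2 lack it.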

Without unification, Emacs has difficulties displaying the characters:
Even if there is a font that has the necessary character, Emacs cannot
find out that this font could be used if the font encoding is
different from the buffer encoding.

These problems would go away had Emacs used Unicode as an internal
encoding, but there is some hostility from the Mule authors towards
Unicode, in particular because of the need for Unihan disambiguation.

Regards,
Martin