encoding problems (é and è)

"Martin v. Löwis" martin at v.loewis.de
Sat Mar 25 14:31:04 CET 2006


Serge Orlov wrote:
> The problem is that U+0587 is a ligature in the Western Armenian dialect
> (hy locale) and a character in the Eastern Armenian dialect (hy_AM locale).
> It is strange that the code point is marked as a compatibility char. It is
> either a mistake or a political decision. It used to be a ligature before
> the orthographic reform of the 1930s by the communist government in Armenia;
> then it became a character, but after the end of the Soviet Union (1991) they
> started to think about going back to the old orthography. That hasn't
> happened, though, and it's not clear if it ever will. So U+0587 is a
> character.

Thanks for the explanation. Without any first-hand knowledge, I would suspect
a combination of mistake and political decision. The Unicode consortium
(and ISO) always uses native language experts to come up with character
definitions, although the process is today likely more elaborate and
precise than in the early days. Likely, the Unicode consortium found
somebody speaking the Western Armenian dialect (given that many of these
speakers live in North America today); the decision might have been
a mixture of lack of knowledge, ignorance, and perhaps even political
bias.
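
For reference, the compatibility marking discussed above can be inspected
from Python itself. The following is only a minimal sketch, assuming the
Unicode database bundled with the interpreter's unicodedata module carries
the <compat> mapping for U+0587; the variable names are illustrative.

```python
import unicodedata

# U+0587, the Armenian ech-yiwn ligature discussed in this thread.
ligature = "\u0587"

# Raw decomposition field from the Unicode character database; a
# "<compat>" tag marks a compatibility (not canonical) decomposition.
decomposition = unicodedata.decomposition(ligature)

# Canonical normalization (NFC) leaves compatibility characters alone,
# while compatibility normalization (NFKC) replaces them.
nfc = unicodedata.normalize("NFC", ligature)
nfkc = unicodedata.normalize("NFKC", ligature)

print(unicodedata.name(ligature))
print(decomposition)
print(["U+%04X" % ord(c) for c in nfc])
print(["U+%04X" % ord(c) for c in nfkc])
```

So text that goes through NFKC (or NFKD) loses the distinction between the
single code point and the two-letter sequence, which matters if U+0587 is
treated as a letter in its own right; NFC and NFD preserve it.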

Regards,
Martin


