
On Thursday, 1 September 2011 at 08:45 -0700, Guido van Rossum wrote:
This is definitely thought of as a separate mark added to the e; ë is not a new letter. I have a feeling it's the same way for the French and Germans, but I really don't know. (Antoine? Georg?)
Indeed, they are not separate "letters" (they are considered the same in lexicographic order, and the French alphabet has 26 letters). But I'm not sure how it's relevant, because you can't remove an accent without most likely making a spelling error, or at least changing the meaning. Accents are very much part of the language (while ligatures like "ff" are not; they are a rendering detail).

So I would consider "é", "ê", "ù", etc. atomic characters for the purpose of processing French text. And I don't see how a decomposed form could help an application.

Regards

Antoine.
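
For readers unfamiliar with the composed and decomposed forms mentioned above, here is a minimal Python sketch using the standard `unicodedata` module. It is an illustration only, not part of the original message: `strip_accents` is a hypothetical helper, and the two sample words are chosen as an assumption to show how dropping accents can collapse distinct French words into one string.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed_text = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed_text if not unicodedata.combining(ch)
    )


# Composed form (NFC): "é" as the single code point U+00E9.
composed = unicodedata.normalize("NFC", "e\u0301")

# Decomposed form (NFD): "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", "\u00e9")

# Two distinct French words: "pêche" and "péché".
words = ["p\u00eache", "p\u00e9ch\u00e9"]

# After stripping accents, the two words are no longer distinguishable.
stripped = [strip_accents(word) for word in words]
```

The decomposed string makes the accent easy to discard, which is the operation the message argues is rarely meaningful for French text.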