[Python-Dev] PEP 393 Summer of Code Project

Guido van Rossum guido at python.org
Thu Sep 1 18:31:53 CEST 2011


On Thu, Sep 1, 2011 at 9:03 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thursday, 01 September 2011 at 08:45 -0700, Guido van Rossum wrote:
>> This is definitely thought of as a separate
>> mark added to the e; ë is not a new letter. I have a feeling it's the
>> same way for the French and Germans, but I really don't know.
>> (Antoine? Georg?)
>
> Indeed, they are not separate "letters" (they are considered the same in
> lexicographic order, and the French alphabet has 26 letters).
>
> But I'm not sure how it's relevant, because you can't remove an accent
> without most likely making a spelling error, or at least changing the
> meaning. Accents are very much part of the language (while ligatures
> like "ff" are not, they are a rendering detail). So I would consider
> "é", "ê", "ù", etc. atomic characters for the purpose of processing
> French text. And I don't see how a decomposed form could help an
> application.

The example given was someone who didn't agree with how a particular
font rendered those accented characters. I agree that's obscure
though.

I recall that long ago, when the French wrote words in all caps, they
would drop the accents, e.g. ECOLE. I even recall (through the mists
of time) observing this in Paris on public signs. Is this still the
convention? Maybe it was only a compromise in the time of Morse code?
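
For reference, here is a minimal sketch of what the decomposed form
discussed above looks like, and of how an application could use it to
drop accents as in the all-caps example. It assumes Python 3 and the
standard unicodedata module; the helper names decompose and
strip_accents are made up for illustration and are not part of any
proposal in this thread.

```python
import unicodedata

def decompose(text):
    """Return the NFD (canonically decomposed) form of text."""
    return unicodedata.normalize("NFD", text)

def strip_accents(text):
    """Hypothetical helper: decompose, then drop the combining marks."""
    return "".join(
        ch for ch in decompose(text)
        if not unicodedata.combining(ch)  # nonzero class means combining mark
    )

composed = "\u00e9"               # "é" stored as a single code point
decomposed = decompose(composed)  # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))
print(strip_accents("\u00c9COLE"))
```

Whether stripping accents is ever appropriate for French text is a
separate question; the sketch only shows that the decomposed form
makes the base letter and the accent individually addressable.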

-- 
--Guido van Rossum (python.org/~guido)

