[Python-Dev] PEP 277 (unicode filenames): please review

Martin v. Loewis martin@v.loewis.de
13 Aug 2002 16:54:08 +0200


Guido van Rossum <guido@python.org> writes:

> Looks like it isn't you: the filename somehow contains a character
> that's not in the Latin-1 subset of Unicode, and no encoding can fix
> that for you.  I don't know why -- you'll have to figure out why your
> keyboard generates that character when you type o-umlaut.

As Walter explains, he has \u006f\u0308, which is

\N{LATIN SMALL LETTER O}\N{COMBINING DIAERESIS}

This could be normalized to

\N{LATIN SMALL LETTER O WITH DIAERESIS}

which can then be encoded as Latin-1. This, of course, requires the
Unicode normalization data (the canonical composition and
decomposition tables).
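
As a rough illustration of the normalization step described above, here is
a minimal sketch. It assumes a Python version whose unicodedata module
provides normalize(); the variable names are made up for the example and
do not come from PEP 277 or from Walter's report.

```python
import unicodedata

# The decomposed form from the report:
# LATIN SMALL LETTER O followed by COMBINING DIAERESIS.
decomposed = "\u006f\u0308"

# Encoding the decomposed form fails, because U+0308 has no Latin-1 code point.
try:
    decomposed.encode("latin-1")
    encodable_as_is = True
except UnicodeEncodeError:
    encodable_as_is = False

# Canonical composition (NFC) merges the pair into the single character
# LATIN SMALL LETTER O WITH DIAERESIS (U+00F6).
composed = unicodedata.normalize("NFC", decomposed)

# The composed form is within the Latin-1 range and encodes to one byte.
latin1_bytes = composed.encode("latin-1")

print(encodable_as_is, len(decomposed), len(composed), latin1_bytes)
```

Normalizing to NFC only helps when a precomposed character exists and falls
inside the target encoding; a base letter plus combining mark with no
Latin-1 equivalent would still fail to encode after composition.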

Regards,
Martin