[Python-Dev] PEP 277 (unicode filenames): please review

Walter Dörwald walter@livinglogic.de
Tue, 13 Aug 2002 14:13:27 +0200


Jack Jansen wrote:

> [...]
> Here's a transcript of my Python session. The terminal has been set to 
> render in latin-1. The directory contains one file, "frör" (fr-o-umlaut-r).
> sap!jack- python
> Python 2.3a0 (#32, Aug 12 2002, 15:31:25)
> [GCC 2.95.2 19991024 (release)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import os
>  >>> os.listdir('.')
> ['fro\xcc\x88r']
>  >>> utf8name = os.listdir('.')[0]
>  >>> unicodename = utf8name.decode('utf-8')
>  >>> unicodename
> u'fro\u0308r'

U+0308 is not 'LATIN SMALL LETTER O WITH DIAERESIS' (that
would be U+00F6) but 'COMBINING DIAERESIS', i.e. the ö got
decomposed into the two code points o + 'COMBINING DIAERESIS'.
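
To make the decomposition visible, here is a minimal sketch using
the standard unicodedata module. It assumes a current Python 3
interpreter (not the 2.3a0 from the transcript), and the variable
names decomposed and composed are only illustrative.

```python
import unicodedata

# The name from the transcript: plain 'o' followed by U+0308.
decomposed = 'fro\u0308r'

# Look up the official names of the two code points being discussed.
print(unicodedata.name('\u0308'))  # COMBINING DIAERESIS
print(unicodedata.name('\u00f6'))  # LATIN SMALL LETTER O WITH DIAERESIS

# The UTF-8 encoding matches the bytes that os.listdir() returned.
print(decomposed.encode('utf-8') == b'fro\xcc\x88r')  # True

# NFC recombines o + U+0308 into the single code point U+00F6.
composed = unicodedata.normalize('NFC', decomposed)
print(len(decomposed), len(composed))  # 5 4
print(composed == 'fr\u00f6r')         # True

# NFD goes the other way and yields the decomposed form again.
print(unicodedata.normalize('NFD', composed) == decomposed)  # True
```

Both strings render as "frör", but they only compare equal after
being normalized to the same form.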

> [...]


Bye,
    Walter Dörwald