[Python-Dev] PEP 277 (unicode filenames): please review
Walter Dörwald
walter@livinglogic.de
Tue, 13 Aug 2002 14:13:27 +0200
Jack Jansen wrote:
> [...]
> Here's a transcript of my Python session. The terminal has been set to
> render in latin-1. The directory contains one file, "frör" (fr-o-umlaut-r).
> sap!jack- python
> Python 2.3a0 (#32, Aug 12 2002, 15:31:25)
> [GCC 2.95.2 19991024 (release)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os
> >>> os.listdir('.')
> ['fro\xcc\x88r']
> >>> utf8name = os.listdir('.')[0]
> >>> unicodename = utf8name.decode('utf-8')
> >>> unicodename
> u'fro\u0308r'
U+0308 is not 'LATIN SMALL LETTER O WITH DIAERESIS' but
'COMBINING DIAERESIS', i.e. the ö got decomposed into
o + 'COMBINING DIAERESIS'.
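
To illustrate, here is a minimal sketch using the standard unicodedata module. It is written for Python 3, not the 2.3a0 session quoted above, and the variable names are only for illustration. It decodes the same bytes that os.listdir() returned, confirms that the fourth code point is the combining mark, and shows that NFC normalization recomposes the pair into the single character U+00F6.

```python
import unicodedata

# The raw bytes from the quoted os.listdir('.') output, decoded as UTF-8.
raw = b"fro\xcc\x88r"
decomposed = raw.decode("utf-8")  # 'f', 'r', 'o', U+0308, 'r'

# NFC normalization recomposes 'o' + U+0308 into the precomposed U+00F6.
composed = unicodedata.normalize("NFC", decomposed)

print(unicodedata.name(decomposed[3]))  # name of U+0308
print(unicodedata.name(composed[2]))    # name of U+00F6
print(len(decomposed), len(composed))   # code point counts before and after
```

Normalizing in the other direction with "NFD" would turn the precomposed form back into the decomposed one, so comparing filenames only works reliably once both sides use the same normalization form.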
> [...]
Bye,
Walter Dörwald