[issue10952] Don't normalize module names to NFKC?

Alexander Belopolsky report at bugs.python.org
Thu Jan 20 07:40:27 CET 2011


Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Thu, Jan 20, 2011 at 1:19 AM, Martin v. Löwis <report at bugs.python.org> wrote:
..
> I'd like to request that PEP 3131 is followed as it stands: identifier lookup uses NFKC,
> period. This gives two issues: a) how can users make sure that they name the files
> correctly? and b) what if the file system implementation mangles file names.
>

There is also issue c): what if the filesystem encoding can only
represent a compatibility character, say U+00B5, but not its NFKC
equivalent, U+03BC?  Suppose you have a system where both the locale
and FS encodings are Latin-1.  You can write Python code using Latin-1,
and the following is a valid bytestream:

b'# encoding: latin-1\nimport \xB5Torrent\n'

However, this code will always fail because '\xB5Torrent' will be
normalized into '\u03BCTorrent', and a file named '\u03BCTorrent.py'
cannot be created on a filesystem with Latin-1 encoding: U+03BC has
no Latin-1 representation.
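
The normalization step can be checked with the standard unicodedata
module. This is only a sketch of the character-level behaviour; it does
not exercise the import machinery, and the variable names (name,
normalized, can_encode) are mine, chosen for illustration:

```python
import unicodedata

# Module name as written in the Latin-1 source: U+00B5 MICRO SIGN + "Torrent".
name = "\xb5Torrent"

# NFKC folds the compatibility character U+00B5 into U+03BC
# (GREEK SMALL LETTER MU), as PEP 3131 prescribes for identifiers.
normalized = unicodedata.normalize("NFKC", name)

# The original spelling is representable in Latin-1 ...
original_bytes = name.encode("latin-1")

# ... but the normalized spelling is not, so a Latin-1 file name
# for it cannot be produced.
try:
    normalized.encode("latin-1")
except UnicodeEncodeError:
    can_encode = False
else:
    can_encode = True

print(ascii(normalized), can_encode)
```

Here can_encode ends up False, which is the situation described above:
the name the import looks up cannot be spelled in the filesystem
encoding.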

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10952>
_______________________________________

