[docs] [issue11230] "Full unicode import system" not in 3.2
Tom Christiansen
report at bugs.python.org
Fri Aug 12 04:36:31 CEST 2011
Tom Christiansen <tchrist at perl.com> added the comment:
How does this work for modules that have filesystem names different from the one used for import? The issue I'm thinking about is that the Mac HFS+ filesystem keeps its Unicode filenames in (close to) NFD form. That means that a module named "caf\N{LATIN SMALL LETTER E WITH ACUTE}", with 4 graphemes and 4 code points in its name, winds up in the filesystem as "cafe\N{COMBINING ACUTE ACCENT}", still with 4 graphemes but now with 5 code points.
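For illustration, the two spellings can be compared with Python's standard unicodedata module. This is only a minimal sketch: the variable names are made up for the example, and it shows the normalization forms themselves, not what any particular filesystem actually does with a name.

```python
import unicodedata

# Precomposed spelling, as a user would likely type it: 4 code points.
composed = "caf\N{LATIN SMALL LETTER E WITH ACUTE}"

# Canonical decomposition (NFD): the accent becomes a separate combining mark.
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed))                                         # 4
print(len(decomposed))                                       # 5
print(decomposed == "cafe\N{COMBINING ACUTE ACCENT}")        # True
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The two strings render identically but do not compare equal until both are brought to the same normalization form.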
I believe (well, suspect; I have empirical evidence, not proof) Python stores its own identifiers in NFD, so this may not be quite as much of a problem as it might otherwise be. Nonetheless, I have had users complain about what HFS+ does with such filenames, although I am not quite sure why. I think it’s because they access a file with 4 chars but need a 5-char fileglob to wildcard it: touch "caf\N{LATIN SMALL LETTER E WITH ACUTE}" and you then need a wildcard of "?????", with an extra ?, to find it. Kinda weird.
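The wildcard mismatch can be sketched without touching a real filesystem by using Python's fnmatch module, where each ? matches one code point. The names typed_name and stored_name are hypothetical, and the NFD string merely stands in for what HFS+ is assumed to store; an actual shell's globbing may count characters differently.

```python
import fnmatch
import unicodedata

# What the user typed: 4 code points.
typed_name = "caf\N{LATIN SMALL LETTER E WITH ACUTE}"

# Assumption: the filesystem hands the name back in decomposed (NFD) form.
stored_name = unicodedata.normalize("NFD", typed_name)

# In fnmatch, each "?" matches exactly one code point.
matches_four = fnmatch.fnmatchcase(stored_name, "????")
matches_five = fnmatch.fnmatchcase(stored_name, "?????")

print(matches_four)  # False
print(matches_five)  # True
```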
----------
nosy: +tchrist
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11230>
_______________________________________