[Python-Dev] casefolding in pathlib (PEP 428)
Ronald Oussoren
ronaldoussoren at mac.com
Fri Apr 12 15:10:42 CEST 2013
On 12 Apr, 2013, at 15:00, Christian Heimes <christian at python.org> wrote:
> On 12.04.2013 14:43, Ronald Oussoren wrote:
>> At least for OSX the kernel will normalize names for you, at least for HFS+,
>> and therefore two names that don't compare equal with '==' can refer to the
>> same file (for example the NFKD and NFKC forms of Löwe).
>>
>> Isn't unicode fun :-)
>
> Seriously, the OSX kernel normalizes unicode forms? It's a cool feature
> and makes sense from the user's POV but ... WTF?
IIRC only for HFS+ filesystems; it is possible to access files on an NFS share
where the filename encoding isn't UTF-8.
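
To make the comparison problem concrete, here is a minimal sketch using only the
standard unicodedata module. It shows two spellings of the same name that differ
under '==' but match once both are brought to one normalization form. The helper
name same_after_normalization is made up for illustration, and the choice of form
is an assumption; which form a given filesystem applies would need checking.

```python
import unicodedata

name = "Löwe"
composed = unicodedata.normalize("NFKC", name)    # 'ö' as a single code point
decomposed = unicodedata.normalize("NFKD", name)  # 'o' + combining diaeresis


def same_after_normalization(a, b, form="NFKD"):
    """Hypothetical helper: compare two names after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(composed == decomposed)                          # False
print(len(composed), len(decomposed))                  # 4 5
print(same_after_normalization(composed, decomposed))  # True
```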
>
> Perhaps we should use the platform's API for the job. Does OSX offer an
> API function to create a case folded and canonical form of a path?
> Windows has PathCchCanonicalizeEx().
This would have to be done on a per-path-element basis, because every directory
in a file's path could be on a separate filesystem with different conventions
(HFS+, HFS+ case sensitive, NFS mounted unix filesystem).
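
As a rough illustration of why this has to be per element, the sketch below walks
a path from the root down and reports the device ID (st_dev) of every existing
ancestor. A change in st_dev means a mount point was crossed. The function names
are made up for this example, and st_dev only tells you that the filesystem
changed, not which conventions the new one uses.

```python
import os
from pathlib import Path


def devices_along_path(path):
    """Yield (ancestor, st_dev) for each existing ancestor, root first."""
    p = Path(path).resolve()
    for part in list(p.parents)[::-1] + [p]:
        try:
            yield part, os.stat(part).st_dev
        except OSError:
            # Stop at the first element that cannot be stat'ed.
            break


def crosses_filesystems(path):
    """True if the elements of path live on more than one device."""
    return len({dev for _, dev in devices_along_path(path)}) > 1


for part, dev in devices_along_path(os.getcwd()):
    print(dev, part)
```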
I have found sample code that can determine whether a directory is on a case-sensitive
filesystem (attached to <http://lists.apple.com/archives/darwin-dev/2007/Apr/msg00036.html>;
it doesn't work in a 64-bit binary, but I haven't checked yet why it doesn't work there).
I don't know if there is a function to determine the filesystem encoding. I guess
assuming that the special casing is only needed for HFS+ variants could work, but
I'd have to test that.
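
For comparison, a portable way to probe case sensitivity without any platform API
is to create a temporary file and look it up under a differently cased name. This
is only a sketch and not the sample code from the darwin-dev post; the function
name is made up, it needs write access to the directory, and it says nothing
about normalization or encoding.

```python
import os
import tempfile


def is_case_sensitive(directory):
    """Probe a directory by looking up a temp file under a swapped-case name."""
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as f:
        name = os.path.basename(f.name)
        # The mixed-case prefix guarantees that swapcase() changes the name.
        swapped = os.path.join(directory, name.swapcase())
        return not os.path.exists(swapped)


print(is_case_sensitive(tempfile.gettempdir()))
```

The probe has to be repeated for each directory of interest, since the answer can
differ from one path element to the next.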
Ronald