[Python-Dev] casefolding in pathlib (PEP 428)

Antoine Pitrou solipsis at pitrou.net
Thu Apr 11 23:27:06 CEST 2013


On Thu, 11 Apr 2013 14:11:21 -0700
Guido van Rossum <guido at python.org> wrote:
> Hey Antoine,
> 
> Some of my Dropbox colleagues just drew my attention to the occurrence
> of case folding in pathlib.py. Basically, case folding as an approach
> to comparing pathnames is fatally flawed. The issues include:
> 
> - most OSes these days allow the mounting of both case-sensitive and
> case-insensitive filesystems simultaneously
> 
> - the case-folding algorithm on some filesystems is burned into the
> disk when the disk is formatted

The problem is that:
1. if you always make the comparison case-sensitive, you'll get false
   negatives (paths that name the same file compare unequal)
2. if you make the comparison case-insensitive under Windows, you'll get
   false positives (paths that name different files compare equal)

My assumption was that, globally, the number of false positives in case
(2) is much lower than the number of false negatives in case (1).
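
To make the two cases concrete, here is a minimal sketch. It assumes the
pure path classes as they exist in the pathlib module of current Python
releases, which may differ from the draft under discussion; the file names
are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Case-sensitive comparison (POSIX flavour): two spellings that a
# case-insensitive filesystem would resolve to the same file still
# compare unequal. That is the false-negative case.
posix_equal = PurePosixPath("Docs/README.txt") == PurePosixPath("docs/readme.txt")

# Case-insensitive comparison (Windows flavour): the two spellings
# compare equal, even though on a case-sensitive volume they could be
# two distinct files. That is the false-positive case.
windows_equal = PureWindowsPath("Docs/README.txt") == PureWindowsPath("docs/readme.txt")

print(posix_equal)
print(windows_equal)
```

Neither comparison touches the disk, so neither can know which kind of
filesystem the path actually lives on.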

On the other hand, one could argue that all comparisons should be
case-sensitive *and* the proper way to test for "identical" paths is to
access the filesystem. Which makes me think, perhaps concrete paths
should get a "samefile" method as in os.path.samefile().

Hmm, I think I'm tending towards the latter right now.
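
As a rough sketch of what asking the filesystem looks like, the snippet
below uses os.path.samefile() on two differently spelled paths to one file.
The helper name same_file and the temporary file layout are assumptions
made up for this example, not a proposed API.

```python
import os
import tempfile
from pathlib import Path


def same_file(a, b):
    """Hypothetical helper: let the filesystem decide whether a and b
    refer to the same file, instead of comparing their spellings."""
    return os.path.samefile(str(a), str(b))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()

    target = root / "a.txt"
    target.write_text("x")

    # A second spelling of the same file, going through "sub/..".
    other = root / "sub" / ".." / "a.txt"

    # Purely lexical comparison: the path components differ.
    lexically_equal = (target == other)

    # Filesystem comparison: both spellings reach the same file.
    really_same = same_file(target, other)

print(lexically_equal)
print(really_same)
```

The price is that both paths must exist and the call hits the disk, which
is why this would only make sense on concrete paths.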

Regards

Antoine.

