On 10/8/12, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Ronald Oussoren wrote:
Neither statfs, statvfs, nor pathconf seems to be able to tell whether a filesystem is case-insensitive.
Even if they could, you wouldn't be entirely out of the woods, because different parts of the same path can be on different file systems...
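Since the metadata calls don't answer the question, one workaround is to probe a directory directly. The following is a minimal sketch, not something proposed in this thread: the helper name `dir_is_case_insensitive` is hypothetical, it needs write access to the directory, and, as noted above, its answer holds only for that one directory, not for every component of a longer path.

```python
import os
import tempfile


def dir_is_case_insensitive(directory):
    """Guess whether names in `directory` are matched caselessly.

    Hypothetical helper: probes by creating a scratch file, so it
    requires write permission and only describes this one directory.
    """
    # The lowercase prefix guarantees that the upper-cased spelling differs.
    with tempfile.NamedTemporaryFile(prefix="casetest_", dir=directory) as tmp:
        head, name = os.path.split(tmp.name)
        # If the upper-cased name also resolves, lookups ignore case here.
        return os.path.exists(os.path.join(head, name.upper()))


if __name__ == "__main__":
    print(dir_is_case_insensitive(tempfile.gettempdir()))
```

The result depends on the file system backing the directory, so no particular output should be expected.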
But how important is all this anyway? I'm trying to think of occasions when I've wanted to compare two entire paths for equality, and I can't think of *any*.
I can think of several, but when I thought a bit harder, they were mostly bug attractors.
If I want my program (or a dict) to know that "CONFIG" and "config" are the same, then I also want it to know that "My Documents" is the same as "MYDOCU~1".*
Ideally, I would also have a way to find out that a pathname is likely to be problematic for cross-platform uses, or at least whether two specific pathnames are known to be collision-prone on existing platforms other than mine. (But I'm not sure that sort of test can be reliable enough for the stdlib. Would it just check for caseless equality, reserved Windows names, and non-alphanumeric characters in the filename?)
*(Well, assuming it is. The short name depends on the history of the directory.)
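To make the three checks floated above concrete, here is a rough sketch. Everything in it is an assumption for illustration: the names `portability_problems` and `may_collide` are hypothetical, the reserved list covers only the classic DOS device names, and "non-alphanumeric" is loosened to also allow `.`, `_` and `-`. It works on strings alone, so it cannot see short-name aliases such as "MYDOCU~1", which depend on the directory's history.

```python
import re

# Classic DOS device names, which Windows reserves even with an extension.
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{dev}{n}" for dev in ("COM", "LPT") for n in range(1, 10)
}


def portability_problems(name):
    """Return a list of reasons why a single file name may not be portable."""
    problems = []
    # "aux.txt" is as reserved as "AUX", so compare the part before the dot.
    stem = name.split(".")[0].upper()
    if stem in WINDOWS_RESERVED:
        problems.append("reserved Windows name")
    # Assumption: letters, digits, '.', '_' and '-' are treated as safe.
    if re.search(r"[^A-Za-z0-9._-]", name):
        problems.append("character outside the portable set")
    return problems


def may_collide(a, b):
    """True if two distinct names are equal once case is ignored."""
    return a != b and a.casefold() == b.casefold()


if __name__ == "__main__":
    print(portability_problems("aux.txt"))
    print(portability_problems("My Documents"))
    print(may_collide("CONFIG", "config"))
```

A check like this can only flag known hazards; an empty result does not show that a name is safe everywhere.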
-jJ
On Tue, Oct 16, 2012 at 6:21 AM, Jim Jewett jimjjewett@gmail.com wrote:
Ideally, I would also have a way to find out that a pathname is likely to be problematic for cross-platform uses, or at least whether two specific pathnames are known to be collision-prone on existing platforms other than mine. (But I'm not sure that sort of test can be reliable enough for the stdlib. Would it just check for caseless equality, reserved Windows names, and non-alphanumeric characters in the filename?)
I'd forgotten about it until reading this, but I think you can get into trouble with Unicode normalisation as well - so, I think we can safely dismiss this as an irrelevant tangent and just stick with Antoine's basic Windows vs Posix distinction. If need be, the strategies can be exposed at a later date (via keyword-only arguments) if we come up with a more convincing use case.
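For readers unfamiliar with the normalisation problem, a small illustration using the standard `unicodedata` module: the same visible name can be spelled with different code points, and a plain string comparison treats the spellings as different. The helper `same_after_normalisation` is a hypothetical name, and whether a given file system normalises names at all is outside what this sketch can tell you.

```python
import unicodedata

composed = "caf\u00e9"      # 'é' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent


def same_after_normalisation(a, b, form="NFC"):
    """Compare two names after applying the same Unicode normal form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    print(composed == decomposed)                          # raw comparison
    print(same_after_normalisation(composed, decomposed))  # normalised
```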
Cheers, Nick.
On Mon, Oct 15, 2012 at 04:21:50PM -0400, Jim Jewett wrote:
If I want my program (or a dict) to know that "CONFIG" and "config" are the same, then I also want it to know that "My Documents" is the same as "MYDOCU~1".*
Well, perhaps you do, but those not using Windows are unlikely to care about DOS short names.
However, they may care about some other form of short name. E.g. on iso9660 file systems (CDs), long names are simply truncated; if two truncated names clash, the second and subsequent files are each given a three-digit suffix:
    this-is-a-long-file
    THIS-IS-A-LONG-NAME
    My Documents
get renamed to:
    THIS_IS_
    THIS_000
    MY_DOCUM
although my Linux computer displays those names in lower case. The Rock Ridge and Joliet extensions can record the unmangled file names, but not all CDs use them.
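The renaming described above can be mimicked with a few lines of Python. This is a toy model inferred only from the three example names, not the actual ISO 9660 rules: the function name `mangle`, the 8-character width, the underscore substitution, and the suffix numbering starting at 000 are all assumptions chosen to reproduce that example.

```python
import re


def mangle(names, width=8):
    """Toy model of truncating long names and resolving clashes.

    Assumption: rules are guessed from the example in this message.
    """
    used = set()
    result = []
    for name in names:
        # Upper-case, and replace anything but A-Z or 0-9 with '_'.
        clean = re.sub(r"[^A-Z0-9]", "_", name.upper())
        short = clean[:width]
        counter = 0
        # On a clash, keep a shorter prefix and append a 3-digit counter.
        while short in used:
            short = f"{clean[:width - 3]}{counter:03d}"
            counter += 1
        used.add(short)
        result.append(short)
    return result


if __name__ == "__main__":
    print(mangle(["this-is-a-long-file", "THIS-IS-A-LONG-NAME", "My Documents"]))
    # → ['THIS_IS_', 'THIS_000', 'MY_DOCUM']
```

The point of the model is that the mapping depends on the order in which names are processed, so the short name of a file cannot be derived from its long name alone.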
It is not the case that all case-insensitive file systems necessarily support DOS short names. There are file systems that don't support long names at all, there are case-insensitive file systems that preserve case, and those that don't.
It's not even necessarily so that Windows is always case-insensitive:
http://support.microsoft.com/kb/929110