[Python-Dev] Windows/Cygwin/MacOSX import (was RE: python-dev summary, 2001-02-01 - 2001-02-15)
Tim Peters
tim.one@home.com
Fri, 16 Feb 2001 03:05:10 -0500
[Michael Hudson]
> ...
> * Imports on case-insensitive file systems *
>
> There was quite some discussion about how to handle imports on a
> case-insensitive file system (eg. on Windows). I didn't follow the
> details, but Tim Peters is on the case (sorry), so I'm confident it
> will get sorted out.
You can be sure the whitespace will be consistent, anyway <wink>.
OK, this one sucks. It should really have gotten a PEP, but it cropped up
too late in the release cycle and it can't be delayed (see below).
Here's the scoop: file systems vary across platforms in whether or not they
preserve the case of filenames, and in whether or not the platform C library
file-opening functions do or don't insist on case-sensitive matches:
                     case-preserving     case-destroying
                   +-------------------+------------------+
    case-sensitive | most Unix flavors | brrrrrrrrrr      |
                   +-------------------+------------------+
  case-insensitive | Windows           | some unfortunate |
                   | MacOSX HFS+       | network schemes  |
                   | Cygwin            |                  |
                   +-------------------+------------------+
In the upper left box, if you create "fiLe" it's stored as "fiLe", and only
open("fiLe") will open it (open("file") will not, nor will the 14 other
variations on that theme).
In the lower right box, if you create "fiLe", there's no telling what it's
stored as-- but most likely as "FILE" --and any of the 16 obvious variations
on open("FilE") will open it.
The lower left box is a mix: creating "fiLe" stores "fiLe" in the platform
directory, but you don't have to match case when opening it; any of the 16
obvious variations on open("FILe") work.
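If you want to see which box a given directory lands in, something like the following should do. It's only a rough probe, not anything Python itself runs: the name classify_directory is made up for illustration, and it assumes you hand it an empty scratch directory you can write to.

```python
import os
import tempfile

def classify_directory(path):
    """Guess which box of the table `path` falls in (hypothetical helper).

    Assumes `path` is an empty, writable scratch directory.
    """
    probe = "fiLe"  # mixed case on purpose
    with open(os.path.join(path, probe), "w") as f:
        f.write("x")
    try:
        # Case-preserving: the listing shows the name exactly as created.
        preserving = probe in os.listdir(path)
        # Case-insensitive: a differently cased name reaches the same file.
        insensitive = os.path.exists(os.path.join(path, probe.swapcase()))
    finally:
        # Remove the probe under whatever spelling the filesystem kept.
        for stored in os.listdir(path):
            if stored.lower() == probe.lower():
                os.remove(os.path.join(path, stored))
    return (
        "case-insensitive" if insensitive else "case-sensitive",
        "case-preserving" if preserving else "case-destroying",
    )

with tempfile.TemporaryDirectory() as tmp:
    print(classify_directory(tmp))
```

The answer depends on the filesystem under the temp directory, so no expected output is shown.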
NONE OF THAT IS CHANGING! Python will continue to follow platform
conventions wrt whether case is preserved when creating a file, and wrt
whether open() requires a case-sensitive match. In practice, you should
always code as if matches were case-sensitive, else your program won't be
portable. But then you should also always open binary files with the "b"
flag, and you don't do that either <wink>.
What's proposed is to change the semantics of Python "import" statements,
and there *only* in the lower left box.
Support for MacOSX HFS+, and for Cygwin, is new in 2.1, so nothing is
changing there. What's changing is Windows behavior. Here are the current
rules for import on Windows:
1. Despite that the filesystem is case-insensitive, Python insists on
   a case-sensitive match. But not in the way the upper left box works:
   if you have two files, FiLe.py and file.py on sys.path, and do
       import file
   then if Python finds FiLe.py first, it raises a NameError. It does
   *not* go on to find file.py; indeed, it's impossible to import any
   but the first case-insensitive match on sys.path, and then only if
   case matches exactly in the first case-insensitive match.
2. An ugly exception: if the first case-insensitive match on sys.path
   is for a file whose name is entirely in upper case (FILE.PY or
   FILE.PYC or FILE.PYO), then the import silently grabs that, no matter
   what mixture of case was used in the import statement. This is
   apparently to cater to miserable old filesystems that really fit in
   the lower right box. But this exception is unique to Windows, for
   reasons that may or may not exist <frown>.
3. And another exception: if the envar PYTHONCASEOK exists, Python
   silently grabs the first case-insensitive match of any kind.
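To make the three rules concrete, here is a toy model of them. It is a sketch, not the real import code: it works on in-memory lists of filenames (one list per sys.path entry, in path order) rather than a real filesystem, it only looks at .py files, and the function name, the caseok parameter and the error messages are all invented for illustration.

```python
import os

def old_windows_find(name, listings, caseok=None):
    """Toy model of the current Windows rules 1-3 (hypothetical helper).

    `listings` holds one list of filenames per sys.path entry, in order.
    `caseok` stands in for the PYTHONCASEOK envar; None means consult
    the real environment.
    """
    if caseok is None:
        caseok = "PYTHONCASEOK" in os.environ
    wanted = name + ".py"
    for files in listings:
        for found in files:
            if found.lower() != wanted.lower():
                continue
            # First case-insensitive match: the search never goes past it.
            if caseok:                  # rule 3: anything goes
                return found
            if found == wanted:         # rule 1: exact case required
                return found
            if found == found.upper():  # rule 2: ALLCAPS.PY is let through
                return found
            raise NameError(f"case mismatch for module {name!r}: {found}")
    raise ImportError(f"no module named {name!r}")

path = [["FiLe.py"], ["file.py"]]
try:
    old_windows_find("file", path, caseok=False)
except NameError as exc:
    print("NameError:", exc)        # file.py in the second dir is never reached
print(old_windows_find("fiLe", [["FILE.PY"]], caseok=False))   # rule 2
print(old_windows_find("file", path, caseok=True))             # rule 3
```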
So these Windows rules are pretty complicated, and neither match the Unix
rules nor provide semantics natural for the native filesystem. That makes
them hard to explain to Unix *or* Windows users. Nevertheless, they've
worked fine for years, and in isolation there's no compelling reason to
change them.
However, that was before the MacOSX HFS+ and Cygwin ports arrived. They
also have case-preserving case-insensitive filesystems, but the people doing
the ports despised the Windows rules. Indeed, a patch to make HFS+ act like
Unix for imports got past a reviewer and into the code base, which
incidentally made Cygwin also act like Unix (but this met the unbounded
approval of the Cygwin folks, so they sure didn't complain -- they had
patches of their own pending to do this, but the reviewer for those balked).
At a higher level, we want to keep Python consistent, and I in particular
want Python to do the same thing on *all* platforms with case-preserving
case-insensitive filesystems. Guido too, but he's so sick of this argument
don't ask him to confirm that <0.9 wink>.
The proposed new semantics for the lower left box:
A. If the PYTHONCASEOK envar exists, same as before: silently accept
   the first case-insensitive match of any kind; raise ImportError if
   none found.
B. Else search sys.path for the first case-sensitive match; raise
   ImportError if none found.
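Rules A and B are short enough to model in a few lines. Again this is only a sketch under made-up assumptions: directories are in-memory lists of filenames in sys.path order, only .py files are considered, and new_find and its caseok parameter are names invented here, not anything in the implementation.

```python
import os

def new_find(name, listings, caseok=None):
    """Toy model of proposed rules A and B (hypothetical helper).

    `listings` holds one list of filenames per sys.path entry, in order.
    `caseok` stands in for the PYTHONCASEOK envar; None means consult
    the real environment.
    """
    if caseok is None:
        caseok = "PYTHONCASEOK" in os.environ
    wanted = name + ".py"
    for files in listings:
        for found in files:
            if caseok:
                # Rule A: first case-insensitive match of any kind wins.
                if found.lower() == wanted.lower():
                    return found
            elif found == wanted:
                # Rule B: only an exact-case match counts; mismatches are
                # skipped and the search carries on down the path.
                return found
    raise ImportError(f"no module named {name!r}")

path = [["FiLe.py"], ["file.py"]]
print(new_find("file", path, caseok=False))   # rule B: skips FiLe.py
print(new_find("file", path, caseok=True))    # rule A: grabs FiLe.py
```

Note that under rule B an ALLCAPS.PY file no longer matches a lower-case import unless the envar is set.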
#B is the same rule as is used on Unix, so this will improve cross-platform
portability. That's good. #B is also the rule the Mac and Cygwin folks
want (and wanted enough to implement themselves, multiple times, which is a
powerful argument in PythonLand). It can't cause any existing
non-exceptional Windows import to fail, because any existing non-exceptional
Windows import finds a case-sensitive match first in the path -- and it
still will. An exceptional Windows import currently blows up with a
NameError or an ImportError. In the latter case it still will; in the
former case the search will now continue, and either succeed or blow up
with an ImportError.
#A is needed to cater to case-destroying filesystems mounted on Windows, and
*may* also be used by people so enamored of "natural" Windows behavior that
they're willing to set an envar to get it. That's their problem <wink>. I
don't intend to implement #A for Unix too, but that's just because I'm not
clear on how I *could* do so efficiently (I'm not going to slow imports
under Unix just for theoretical purity).
The potential damage is here: #2 (matching on ALLCAPS.PY) is proposed to be
dropped. Case-destroying filesystems are a vanishing breed, and support for
them is ugly. We're already supporting (and will continue to support)
PYTHONCASEOK for their benefit, but they don't deserve multiple hacks in
2001.
Flame at will.
or-flame-at-tim-your-choice-ly y'rs - tim