Compiling Python 1.6 under MacOS X ...

Tim Peters tim_one at
Sun Sep 24 08:16:26 CEST 2000

[Tim said]
> For every pair of distinct files X and Y in the distribution, if X and
> Y are in the same directory, the names of X and Y differ by more than
> just case.  Or, if that's not true, it's a Unix(tm) bug that doesn't
> affect Windows <wink>.

I haven't posted any code in a while, so took this as an excuse to stop
reading licenses and write a little one-shot utility <0.5 wink>.  I hope it
shows some useful tricks, and appropriate use of some new 1.6 and 2.0
features.  A couple subtleties to note:

1. In the one place it made sense, I used "filter" instead of listcomps,
because "filter" was simply clearer in this specific case.

2. I almost always use this kind of scheme for crawling over a directory
tree, rather than a recursive scheme or os.path.walk.  Manipulating an
explicit list of directories yet to be visited is very flexible, easy to
write, and allows (as in the example) major optimizations specific to the
task at hand.  Maybe it's just me, but I can also recreate this scheme from
scratch much faster than I can ever remember how to *use* os.path.walk!
Many "frameworks" in Python suffer a similar fate -- it's so easy to do it
"by hand" they never get used.

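The explicit to-visit list described in point 2 can be sketched in present-day Python 3 as follows. This is a minimal illustration, not the attached script: the function name `find_dirs_named` and its parameters are hypothetical, and it joins full paths instead of using the chdir trick, so it does not reproduce that optimization.

```python
import os

def find_dirs_named(root, target):
    """Return the paths of all directories called `target` under `root`.

    Keeps an explicit list of directories still to visit instead of
    recursing or using a library walker.
    """
    found = []
    pending = [os.path.abspath(root)]  # directories yet to be visited
    while pending:
        current = pending.pop()
        for name in os.listdir(current):
            path = os.path.join(current, name)
            if not os.path.isdir(path):
                continue
            if name == target:
                # Task-specific shortcut: record the match and do not
                # descend into it.
                found.append(path)
            else:
                pending.append(path)
    return sorted(found)
```

Because the loop owns the to-visit list, pruning a subtree is just a matter of not appending it.
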
The attached checked the entire active CVS repository for files X and Y that
would put the lie to the claim above, but didn't find any:

0 hits in 2322 files across 132 CVS/Entries.

Since I was running this on Windows, I looked at the CVS/Entries files
rather than my directory tree, because the latter would have gotten hosed if
there *were* any case-insensitive matches.
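
The check itself boils down to grouping names by their lowercased form. A minimal sketch in present-day Python 3, working on an in-memory list of names rather than on CVS/Entries files; the function name `case_collisions` and the sample names are hypothetical:

```python
def case_collisions(names):
    """Return the groups of names that differ only by case."""
    groups = {}
    for name in names:
        # Names that collide on a case-insensitive file system share
        # the same lowercased key.
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]

print(case_collisions(["README", "Makefile", "readme", "setup.py"]))
# → [['README', 'readme']]
```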

windows-may-lie-but-cvs-doesn't-ly y'rs  - tim

def crawl(root):
    from os.path import join, exists, isdir
    from os import listdir, chdir
    # root should be an absolute path, since we chdir below.
    dirs = [root]
    while dirs:
        dir = dirs.pop()
        # This seems a little-known trick in Python:  changing to the
        # current directory allows the subsequent isdir-in-a-loop to
        # work on just the file name, instead of making the OS reparse
        # the whole darn path from the root again each time.
        # Depending on the OS, can save gobs of time.
        chdir(dir)
        for f in filter(isdir, listdir(dir)):
            path = join(dir, f)
            if f == "CVS":
                # Check the Entries file; don't descend into CVS dirs.
                entries = join(path, "Entries")
                if exists(entries):
                    process(entries)
                else:
                    raise TypeError("CVS dir w/o Entries!! " + path)
            else:
                dirs.append(path)

def process(path):
    global nhits, ndirs, nfiles
    print "Looking at", path
    ndirs += 1
    names = {}    # lowercased names seen so far in this Entries file
    f = open(path)
    for line in f.readlines():
        if line.startswith("/"):
            offset = 1
        elif line.startswith("D/"):
            offset = 2
        elif line == "D\n":
            # not sure what's going on with this one!
            continue
        else:
            raise TypeError(("unexpected line in", path, line))
        nfiles += 1
        # The name runs from offset up to the next "/".
        name = line[offset:line.find("/", offset)].lower()
        if names.has_key(name):
            print "**********", "HIT ON", line
            nhits += 1
        else:
            names[name] = 1
    f.close()

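nhits = ndirs = nfiles = 0
# The root directory that was crawled is not shown here; starting
# from the current directory is an assumption.
import os
crawl(os.getcwd())
print nhits, "hits in", nfiles, "files across", ndirs, "CVS/Entries."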
