M.-A. Lemburg writes:
Well, I was referring to the additional lookup needed to find the next package dir of the same name. Say you put the Python package into site-packages and the binaries into plat-<platform>.
I didn't say it was free; just that the cost was insignificant compared to the current cost. My sys.path in an interactive interpreter contains 11 entries. If I want to add a package with both $prefix and $exec_prefix components, the worst case is that the directory holding the __init__.py* is the last path entry, and the other directory is in the immediately preceding path entry. After the current mechanism locates the __init__.py* file, it needs to build the __path__ for the package. It takes 10 stat() calls to locate the additional directory.
Consider what the initial search that caused the package module to be created took: 11 stats to see if the entries contained the appropriate directory, plus 2 stats to determine that the first directory of the package (the one that doesn't have __init__.py*) wasn't it, plus 36 to determine that the first 9 directories didn't contain a matching .so|module.so|.py|.py[co], plus at least one to actually find the __init__.pyc; two if only the .py is available. (I think I followed the code right. ;) That's 59 system calls (either stat() or open(), the latter hidden inside fdopen()).
I don't think the added 10 to get the right __path__ is worth worrying about. It's the .py[co] files that are expensive to load! Once you've created the package, sub-modules are very cheap: you will typically have no more than two path entries to check even once all this is in place.
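To make the 10-stat figure for building __path__ concrete, here is a rough sketch of the worst case described above. It is only a model: build_package_path() and demo() are hypothetical names, not the interpreter's import code, and it assumes one directory probe per remaining path entry.

```python
import os
import tempfile


def build_package_path(name, search_path, init_dir):
    """Collect every directory called `name` on search_path.

    Hypothetical helper: returns the resulting __path__ list and the
    number of directory probes (one stat() each) it needed.
    """
    package_path = [init_dir]
    probes = 0
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if candidate == init_dir:
            continue  # already located via __init__.py*; no extra probe
        probes += 1  # os.path.isdir() costs one stat() call
        if os.path.isdir(candidate):
            package_path.append(candidate)
    return package_path, probes


def demo():
    """Build an 11-entry path with the package split over the last two."""
    with tempfile.TemporaryDirectory() as root:
        entries = [os.path.join(root, "entry%02d" % i) for i in range(11)]
        for entry in entries:
            os.mkdir(entry)
        # Worst case: __init__.py lives in the last entry, the other
        # package directory in the entry just before it.
        init_dir = os.path.join(entries[-1], "pkg")
        extra_dir = os.path.join(entries[-2], "pkg")
        os.mkdir(init_dir)
        os.mkdir(extra_dir)
        open(os.path.join(init_dir, "__init__.py"), "w").close()
        path, probes = build_package_path("pkg", entries, init_dir)
        return [os.path.relpath(p, root) for p in path], probes
```

Calling demo() performs 10 probes and yields a two-entry __path__, which is the extra cost being weighed against the initial search.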
caching; loading Grail is still dog slow, and I've no doubt that the 600+ stat() calls contribute to that! 1-)
Oops, after following through with the math, I'd have to adjust this to 6000 stat()/open() calls for Grail. Sorry!
And back to Marc-Andre:
I would very much like to see some sort of caching in the interpreter. The fastpath hook I implemented uses a marshalled dict stored in the user's home dir for the lookup. Once created,
I don't think I'd store the cache; if a user's home directory is mounted via NFS (common), then it may often be wrong if the user actively works with a variety of hosts with different versions or installations of Python. The benefits of a cache are greatest for applications that import a lot of modules (like Grail!); the cache can be built using a directory scan as each directory is searched. (I think one of the guys from CWI did this at one point and had really good results; Jack?)
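A minimal sketch of the in-memory, scan-as-you-search idea might look like the following. DirectoryScanFinder and demo_cache() are hypothetical names for illustration only; this is not the fastpath hook or the CWI work mentioned above, and the suffix list is simply the one from the stat-counting discussion.

```python
import os
import tempfile


class DirectoryScanFinder:
    """Hypothetical finder: one listdir() per path entry, kept in memory."""

    # Suffixes probed per directory, in the order discussed above.
    SUFFIXES = (".so", "module.so", ".py", ".pyc", ".pyo")

    def __init__(self, search_path):
        self.search_path = list(search_path)
        self._listings = {}  # directory -> set of file names
        self.scans = 0       # number of real directory scans performed

    def _listing(self, directory):
        # Scan a directory the first time it is searched, then reuse it.
        if directory not in self._listings:
            self.scans += 1
            try:
                names = set(os.listdir(directory))
            except OSError:
                names = set()  # missing or unreadable path entry
            self._listings[directory] = names
        return self._listings[directory]

    def find(self, module_name):
        """Return the first matching file on the path, or None."""
        for directory in self.search_path:
            names = self._listing(directory)
            for suffix in self.SUFFIXES:
                filename = module_name + suffix
                if filename in names:
                    return os.path.join(directory, filename)
        return None


def demo_cache():
    with tempfile.TemporaryDirectory() as root:
        dirs = [os.path.join(root, "d%d" % i) for i in range(3)]
        for d in dirs:
            os.mkdir(d)
        open(os.path.join(dirs[1], "eggs.py"), "w").close()
        open(os.path.join(dirs[2], "spam.py"), "w").close()
        finder = DirectoryScanFinder(dirs)
        spam = finder.find("spam")       # scans all three directories
        scans_after_first = finder.scans
        eggs = finder.find("eggs")       # answered from the cached listings
        missing = finder.find("missing")
        return (os.path.basename(spam), os.path.basename(eggs),
                missing, scans_after_first, finder.scans)
```

Because the listings live only in the running process, nothing goes stale on an NFS-mounted home directory; the trade-off in this sketch is that files added to an already-scanned directory are not noticed until the process restarts.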
-- Fred L. Drake, Jr. firstname.lastname@example.org Corporation for National Research Initiatives