[Distutils] extensions in packages
Fred L. Drake, Jr. <fdrake@acm.org>
Wed, 26 May 1999 17:54:11 -0400 (EDT)
M.-A. Lemburg writes:
> Well, I was referring to the additional lookup needed to find
> the next package dir of the same name. Say you put the Python
> package into site-packages and the binaries into plat-<platform>.
I didn't say it was free; just that the cost was insignificant
compared to the current cost.
My sys.path in an interactive interpreter contains 11 entries. If I
want to add a package with both $prefix and $exec_prefix components,
the worst case is that the directory holding the __init__.py* is the
last path entry, and the other directory is in the immediately
preceding path entry. After the current mechanism locates the
__init__.py* file, it needs to build the __path__ for the package. It
takes 10 stat() calls to locate the additional directory. Compare that
with the initial search that caused the package module to be created, which
took: 11 stats to see if the entries contained the appropriate
directory + 2 stats to determine that the first directory of the
package (the one that doesn't have __init__.py*) wasn't it + 36 to
determine that the first 9 directories didn't contain a matching
.so|module.so|.py|.py[co]. Plus at least one to actually find the
__init__.pyc; two if only the .py is available. (I think I followed
the code right. ;) That's 59 system calls (either stat() or open(),
the latter hidden inside fdopen()). I don't think the added 10 to get the
right __path__ is worth worrying about. It's the .py[co] files that
are expensive to load! Once you've created the package, sub-modules
are very cheap: you will typically have no more than two path entries
to check even once all this is in place.
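The extra lookup described above can be sketched roughly as follows. This
is only an illustration, not the interpreter's import code: the function
name extra_package_dirs and its parameters are hypothetical, and it assumes
one stat() per path entry via os.path.isdir().

```python
import os


def extra_package_dirs(name, search_path, found_in):
    """Look for further directories called `name` on `search_path`.

    `found_in` is the path entry already known to hold the package's
    __init__.py*; it is skipped.  Returns (dirs, stat_calls), where
    stat_calls counts the directory checks made (one stat() each).
    """
    dirs = []
    stat_calls = 0
    for entry in search_path:
        if entry == found_in:
            continue  # this entry was located by the initial search
        candidate = os.path.join(entry, name)
        stat_calls += 1  # os.path.isdir() costs one stat() call
        if os.path.isdir(candidate):
            dirs.append(candidate)
    return dirs, stat_calls
```

With an 11-entry path and found_in set to the last entry, the counter ends
at 10, which is the added cost discussed above.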
I said:
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! :-)
Oops, after following through with the math, I'd have to adjust this
to 6000 stat()/open() calls for Grail. Sorry!
And back to Marc-Andre:
> I would very much like to see some sort of caching in the
> interpreter. The fastpath hook I implemented uses a marshalled
> dict stored in the user's home dir for the lookup. Once created,
I don't think I'd store the cache; if a user's home directory is
mounted via NFS (common), then it may often be wrong if the user
actively works with a variety of hosts with different versions or
installations of Python. The benefits of a cache are greatest for
applications that import a lot of modules (like Grail!); the cache can
be built using a directory scan as each directory is searched. (I
think one of the guys from CWI did this at one point and had really
good results; Jack?)
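A minimal sketch of that kind of in-process cache, assuming one directory
scan per path entry the first time it is searched. The class name
DirectoryCache and its methods are hypothetical, and nothing is written to
disk, so there is no stored cache to go stale across hosts.

```python
import os


class DirectoryCache:
    """In-memory cache of directory listings, filled as directories are searched."""

    # Suffixes taken from the search described earlier in this message.
    SUFFIXES = (".so", "module.so", ".py", ".pyc", ".pyo")

    def __init__(self):
        self._listings = {}  # directory -> frozenset of file names
        self.scans = 0       # number of directory scans actually performed

    def _names(self, directory):
        # Scan each directory at most once per process.
        if directory not in self._listings:
            self.scans += 1
            try:
                names = frozenset(os.listdir(directory))
            except OSError:
                names = frozenset()  # missing or unreadable path entry
            self._listings[directory] = names
        return self._listings[directory]

    def find(self, module, search_path):
        """Return the first matching file for `module`, or None."""
        for entry in search_path:
            names = self._names(entry)
            for suffix in self.SUFFIXES:
                if module + suffix in names:
                    return os.path.join(entry, module + suffix)
        return None
```

After the first lookup has walked the path, later lookups are answered from
the cached listings without touching the filesystem, which is where an
application that imports many modules would gain the most.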
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives