Fred L. Drake wrote:
M.-A. Lemburg writes:
Note that this kind of search will be very costly due to the amount of I/O needed to search the path. Some sort of fastpath hook
Marc-Andre, why does this need to be so costly? Compared to the current scheme, there's little to add. Once a package has been identified (and *only* then!), search the path for all the appropriate subdirectories (one stat() for each path entry). The current approach requires about a half dozen stats for each path entry: foo.py, foo.py[co], foomodule.so, foo.so, foo/ + foo/__init__.py + foo/__init__.py[co]. It will typically be even cheaper for sub-packages, because the original path will usually be much shorter than sys.path.
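To put rough numbers on that comparison, here is a small sketch. It only counts the names listed above and does not touch the file system. The function names are made up for illustration, and the candidate list is a simplification of what the interpreter actually probes.

```python
import os

def candidates(name, entry):
    """Names probed for module `name` in one path entry (simplified)."""
    return [
        os.path.join(entry, name + ".py"),
        os.path.join(entry, name + ".pyc"),        # or .pyo under -O
        os.path.join(entry, name + "module.so"),
        os.path.join(entry, name + ".so"),
        os.path.join(entry, name),                 # package directory
        os.path.join(entry, name, "__init__.py"),  # package marker
    ]

def worst_case_stats(name, path):
    """Upper bound on stat() calls when `name` is not found anywhere."""
    return sum(len(candidates(name, entry)) for entry in path)

def extra_package_stats(path):
    """Added cost of the proposal: one stat() per path entry, paid
    only once a package has been identified."""
    return len(path)
```

With a three-entry path, the existing scheme costs up to 18 stats for a missing module, while collecting the same-named package directories adds 3.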
Well, I was referring to the additional lookup needed to find the next package dir of the same name. Say you put the Python package into site-packages and the binaries into plat-<platform>. Since the platform subdirs come first on the standard sys.path, all imports of the form
will first look in the binary package, fail and then continue to look (and hopefully find) the MyModule submodule in the Python package installed under site-packages. Since these imports are more common than importing binaries, imports would get even slower on average.
Ok, you could change the sys.path so that the binaries come *after* the source packages... but it's currently not the default.
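A toy model of the ordering effect, for what it's worth. The directory names, the package name MyPackage, and the binary module name _binary are all hypothetical, and the dict stands in for the real file system.

```python
def probes_until_found(submodule, package_dirs, contents):
    """Count how many package directories are searched, in order,
    before `submodule` is found. Returns None if it is never found."""
    for count, directory in enumerate(package_dirs, start=1):
        if submodule in contents.get(directory, ()):
            return count
    return None

# Hypothetical layout: binaries under the platform dir, Python
# modules under site-packages.
contents = {
    "plat-PLATFORM/MyPackage": {"_binary"},
    "site-packages/MyPackage": {"MyModule"},
}

default_order = ["plat-PLATFORM/MyPackage", "site-packages/MyPackage"]
reordered = list(reversed(default_order))
```

In this model, finding MyModule takes two probes with the default order and one with the reordered path. The binary module pays the extra probe instead, which is the cheaper trade if Python modules are imported more often.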
Note that I'm not saying there shouldn't be some sort of directory caching; loading Grail is still dog slow, and I've no doubt that the 600+ stat() calls contribute to that! :-)
I would very much like to see some sort of caching in the interpreter. The fastpath hook I implemented uses a marshalled dict stored in the user's home dir for the lookup. Once created, it reduces startup time noticeably (cutting down stat() calls from around 200 for a typical utility script to around 20).
The nice thing about the hack is that you can experiment with the cache logic using Python functions before possibly coding it in C.
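For anyone who wants to play with the idea, a minimal sketch of a marshalled lookup dict might look like the following. This is not the actual fastpath hook code. The function names, the cache layout (file name mapped to the first path entry containing it), and the suffix list are assumptions made for illustration, and the sketch ignores cache invalidation entirely.

```python
import marshal
import os

def build_cache(path):
    """Map each file or directory name to the first path entry holding it."""
    cache = {}
    for entry in path:
        try:
            names = os.listdir(entry)
        except OSError:
            continue  # skip missing or unreadable path entries
        for name in names:
            cache.setdefault(name, entry)  # first entry on the path wins
    return cache

def save_cache(cache, filename):
    """Write the dict to disk in marshal format."""
    with open(filename, "wb") as f:
        marshal.dump(cache, f)

def load_cache(filename):
    """Read a dict previously written by save_cache()."""
    with open(filename, "rb") as f:
        return marshal.load(f)

def find_module_file(name, cache, suffixes=(".py", ".pyc", ".so")):
    """Look up a module file via the cache, without any stat() calls."""
    for suffix in suffixes:
        entry = cache.get(name + suffix)
        if entry is not None:
            return os.path.join(entry, name + suffix)
    return None
```

One would build the cache once over sys.path, save it somewhere such as a file in the home directory, and load it at startup. Lookups then cost a dict access rather than a string of failed stat() calls.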