Fred L. Drake wrote:
M.-A. Lemburg writes:
This seems like something to worry about and probably also enough to try really hard to find a good solution, IMHO.
This is where a good caching system makes a lot of sense.
True, that's why the hook allows you to code the strategy in Python. Note that my current version uses the sys.path as key into a table of name:file mappings, so even when using different setups (which will certainly have some differences in sys.path), the cache should work. Maybe one should add some more information to the key... like the platform specifica or the even the mtimes of the directories on the path.
I'm not sure that keying on sys.path is sufficient. Around here, a Solaris/SPARC and Solaris/x86 box are likely to share the same sys.path. That doesn't mean the directories are the same; the differences are taken care of via NFS. Using the mtimes as part of the key means you don't have any way to clear the cache: an older mtime may just mean the version of the path for a different platform, which still wants to use the cache! Perhaps it could be keyed on (platform, dir), and the mtimes could be used to determine the need to refresh that directory. Doing this right is hard, and can be substantially affected by a site's filesystem layout. Avoiding problems due to issues like these is a good reason to use a runtime-only cache. A site for which this isn't sufficient can the use the "hook" mechanism to install something that can do better within the context of specific filesystem management policies.
Right and that's the key point in optionally moving (at least) the lookup machinery into Python. Admins could then use site.py to add optimized lookup cache implementations for their site.
The default implementation should probably be some sort of dynamic cache like the one you sketched below.
Yep, remember that too. The problem with these scans is that directories may contain huge amounts of files and you would need to check all of them against the module extensions Python
They probably won't contain much other than Python modules in a reasonable installation. There's no need to filter the list; just include every file, and then test for the appropriate entries when attempting a specific import. This limits the up-front cost substantially.
Ok, point taken.
If we don't assume a reasonable installation (non-module files in the module dirs), it just gets slower and people have an incentive to clean up their installation. This is acceptable.
Anyway, the dynamic and static versions are both implementable using the hook, so I'd opt for going into that direction rather than hard-wiring some logic into the interpreters core.
I have no problems with using a "hook" to implement a more efficient mechanism. I just want the "standard" mechanism to be efficient, because that's the one I'll use.
The hook idea makes the implementation a little more open. Still, I think that even the "standard" lookup/caching scheme should be implemented in Python.