pickle and module package
Michael P. Reilly
arcege at shore.net
Wed May 19 12:16:52 EDT 1999
Fred L. Drake <fdrake at cnri.reston.va.us> wrote:
: M.-A. Lemburg writes:
: > I'd say, there's no way for the import logic to tell whether
: > you are about to import the same module a second time... unless
: > maybe, if it scans the sys.modules dict for filenames of the modules
: > and then checks for identical files. But that would reduce import
: > performance dramatically and not be worth it.
: A dictionary could be used that maps filenames to modules; this can
: simply be checked and updated during the slow path through import.
: This shouldn't be much slower than it already is. ;-)
: The filenames would have to be absolute for it to work; there's
: currently nothing that does this for pathnames in the core
: interpreter. I'd be quite happy if __file__ could be relied on to be
: absolute as well!
I pretty much agree but, at the risk of being pedantic, I'll bring up
a potential problem.
Attempting to figure out absolute pathnames can be difficult. Here is
just one commonly known example from the UNIX world dealing with
Automounter V1 (I try not to know much about M$ platforms, but I can
think of some examples there too):
Automounter (version 1) is a very useful tool that mounts NFS drives
on demand without the need for root access. As part of the
implementation, drives are mounted under one directory structure
(usually /tmp_mnt) and accessed through another (the map) via
symbolic links (but the mechanism only works through this map). For
example, a user's home directory is /home/luser which is mounted on
/tmp_mnt/home/luser. Performing a chdir("../..") will not bring the
user to "/", but will bring him to "/tmp_mnt".
On top of this, after a period of time, the system will attempt to
unmount the drive. If a process accesses "/home/luser", the drive
will be remounted; if a process instead accesses
"/tmp_mnt/home/luser", the drive is not remounted and the system
reports a failure.
The reason I bring this up is because the os.getcwd() call (on UNIX)
performs the basic 'traverse ".." to figure out where I am' algorithm.
This could lead to problems:
    sys.path contains "/home/luser"
    sys.module_cache contains "/tmp_mnt/home/luser"
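
To make the mismatch concrete, here is a minimal sketch that imitates the
automounter layout with an ordinary symbolic link in a temporary directory.
The directory names and the demo_alias_paths() helper are made up for
illustration, and it assumes a platform where os.symlink() works and where
os.getcwd() reports the physical directory, as it does on typical UNIX
systems.

```python
import os
import tempfile


def demo_alias_paths():
    """Show that the path a user configured and the path os.getcwd()
    reports can differ as strings while naming the same directory."""
    with tempfile.TemporaryDirectory() as tmp:
        # Resolve the temp dir itself so only our own symlink is in play.
        base = os.path.realpath(tmp)

        # Stand-in for the real mount point, e.g. /tmp_mnt/home/luser.
        real_dir = os.path.join(base, "tmp_mnt", "home", "luser")
        os.makedirs(real_dir)

        # Stand-in for the map entry, e.g. /home/luser, a symbolic link.
        os.mkdir(os.path.join(base, "home"))
        link_dir = os.path.join(base, "home", "luser")
        os.symlink(real_dir, link_dir)

        old_cwd = os.getcwd()
        try:
            os.chdir(link_dir)       # enter through the "map" path
            cwd_seen = os.getcwd()   # what an absolute-path lookup would see
        finally:
            os.chdir(old_cwd)

        return {
            "configured": link_dir,  # what sys.path would contain
            "real": real_dir,
            "cwd_seen": cwd_seen,    # what a filename cache might record
            "same_directory": os.path.samefile(link_dir, real_dir),
        }
```

A cache keyed on the string in "cwd_seen" would never match a lookup made
with the string in "configured", even though both name one directory.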
I'm not saying that this is not a good idea, just that it must be handled
carefully. Different systems will have different problems with this
("aliases" on Macs, mounted drives on Windows, odd remote filesystem
drivers on UNIX).
Overall, I don't see that a module filename cache would be a
performance hit. I think that it would be better to use the cache at
the C level from within the import.c module (for thread locking
purposes at the very least), perhaps with access through the imp module?
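
For illustration only, here is a rough Python-level sketch of such a cache.
It is not how import.c works; the ModuleFileCache class and its methods are
hypothetical names. To sidestep the pathname problem above, it keys on the
(device, inode) pair from os.stat() instead of on the filename string, which
is my own substitution and assumes a filesystem that reports stable inode
numbers. A lock guards the dictionary, standing in for the thread locking
that the C level would provide.

```python
import os
import tempfile
import threading
import types


class ModuleFileCache:
    """Hypothetical cache mapping a file's identity to a module object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_file = {}

    @staticmethod
    def _key(path):
        # Identify the file by device and inode, not by its spelling.
        st = os.stat(path)
        return (st.st_dev, st.st_ino)

    def get_or_add(self, path, module):
        """Return the module already cached for this file, or cache
        and return the given one if the file has not been seen."""
        key = self._key(path)
        with self._lock:
            return self._by_file.setdefault(key, module)


def demo_cache():
    """Register one file under two names; the second lookup should
    hand back the module registered under the first name."""
    with tempfile.TemporaryDirectory() as tmp:
        real_file = os.path.join(tmp, "spam.py")
        with open(real_file, "w") as f:
            f.write("x = 1\n")
        alias = os.path.join(tmp, "alias.py")
        os.symlink(real_file, alias)

        cache = ModuleFileCache()
        first = cache.get_or_add(real_file, types.ModuleType("spam"))
        second = cache.get_or_add(alias, types.ModuleType("alias"))
        return first, second
```

Keying on file identity avoids guessing which spelling of a path is the
"right" one, at the cost of one stat() call on the slow path through import.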
-Arcege