[Python-Dev] PEP 273: Import Modules from Zip Archives
Gordon McMillan
gmcm@hypernet.com
Mon, 29 Oct 2001 09:30:30 -0500
Jim Ahlstrom wrote:
[From PEP 273]
> Currently, sys.path is a list of directory names as strings.
A nit: that's true in practice, but not by definition (a couple std
modules have been patched to allow other things on
sys.path). After extensive use of imputil (which puts objects
on sys.path), I think we might as well make it official that
sys.path is a list of strings.
[Subdirectory Equivalence]
This is a bit misleading - it can be read as implying that the
importable modules are in a flat namespace. They're not. To
get R.Q.modfoo imported, R.__init__ and R.Q.__init__ must
be imported. The __init__ modules have the opportunity to
play with __path__ to break the equivalence of R.Q.modfoo to
R/Q/modfoo.py. Or (more likely), play games with attributes
so that Q is, say, an instance, or maybe a module imported
from someplace else.
Question: if Archive.zip/Q/__init__.pyc does
"__path__.append('some/real/directory')", will it work?
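
One way to try the question out is the rough sketch below. It assumes a present-day CPython with the standard zipimport machinery, not the patch under discussion, and it stores __init__.py as source rather than a .pyc to keep things short. The directory name and the WHERE attribute are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

work = tempfile.mkdtemp()

# A real directory outside the archive, holding the module we want to reach.
real_dir = os.path.join(work, "some_real_directory")
os.makedirs(real_dir)
with open(os.path.join(real_dir, "modfoo.py"), "w") as f:
    f.write("WHERE = 'real directory'\n")

# An archive whose package Q extends its own __path__ at import time.
archive = os.path.join(work, "Archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("Q/__init__.py", "__path__.append(%r)\n" % real_dir)

sys.path.insert(0, archive)
importlib.invalidate_caches()

# Q comes from the archive; Q.modfoo is not in the archive at all,
# so it can only be found through the appended __path__ entry.
modfoo = importlib.import_module("Q.modfoo")
print(modfoo.WHERE, modfoo.__file__)
```

If the import succeeds, Q.modfoo no longer corresponds to Q/modfoo inside the archive, which is exactly the broken equivalence described above.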
[dynamic libs]
> It might be possible to extract the dynamic module from the
> zip file, write it to a plain file and load it.
I think you can nail the door shut on this one. On many OSes,
making dynamic libs available to a process requires settings
that can only (sanely) be made before the process starts.
OTOH, it's common practice these days to put dynamic libs
inside packages. That needs to be dealt with at runtime, and
at build time (since it breaks the expectation that you can just
"zip up" the package).
[no *.py files]
> You also can't import source files *.py from a zip archive.
Apparently some Linuxes / RPM distributions don't deliver
.pyc's or .pyo's. Since they're installed as root and run as
some-poor-user, I'm afraid there are quite a few installations
running off .py's all the time. So while it's definitely sub-
optimal, I'm not sure it should be outlawed.
[Guido]
> But it would still be good if the .py files were also in the
> zip file, so the source can be used in tracebacks etc.
As you probably remember me saying before, I think this is a
very minor nice-to-have. For one thing, you've probably done
enough testing before stuffing your code into an archive that
you're past the point of looking at a traceback and going
"doh!". Most of the time you'll need to look at the code
anyway.
OTOH, this [more from Guido]:
> A C API to get a source line from a filename might be a good
> idea (plus a Python API).
points the way towards something I'm very much in favor of:
deferring things to that-which-did-the-importing.
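
A sketch of what "ask whatever did the importing" could look like. It assumes a present-day CPython, where a module records its loader on __spec__ and zip loaders provide get_source(); none of that is part of the PEP text quoted here. The module name tbdemo is hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile

work = tempfile.mkdtemp()
archive = os.path.join(work, "Archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("tbdemo.py", "def boom():\n    return 1 / 0\n")

sys.path.insert(0, archive)
importlib.invalidate_caches()
tbdemo = importlib.import_module("tbdemo")

# No filesystem path exists for the source, so the importer is the only
# party that knows how to fetch it.
loader = tbdemo.__spec__.loader
source = loader.get_source("tbdemo")
second_line = source.splitlines()[1]
print(second_line.strip())
```

The point of the design is that traceback code never needs to know about zip files; it only needs a loader that can answer the question.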
[Efficiency]
> The key is the archive name from sys.path joined with the
> file name (including any subdirectories) within the archive.
Different spellings of the same path are possible in a
filesystem, but not in a dictionary. A bit of "harmless"
tweaking of sys.path could render an archive unreachable.
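
A small illustration of the problem, as a sketch only. The dictionary layout and the add_entry helper are assumptions made for the example, not the PEP's actual data structure.

```python
import os.path

archive_index = {}

def add_entry(index, archive, member, payload):
    # Key is the archive name joined with the member name, as in the PEP text.
    index[os.path.join(archive, member)] = payload

add_entry(archive_index, "/C/D/E/Archive.zip", "Q/R/modfoo.pyc", b"<code>")

# The same archive, spelled differently after some "harmless" tweaking.
other_spelling = "/C/D/../D/E/Archive.zip"
lookup = os.path.join(other_spelling, "Q/R/modfoo.pyc")

# A plain dictionary lookup compares strings, not files.
found_raw = lookup in archive_index

# Normalizing both sides recovers the match in this case.
normalized_keys = {os.path.normpath(k) for k in archive_index}
found_normalized = os.path.normpath(lookup) in normalized_keys

print(found_raw, found_normalized)
```

Normalizing helps with ".." and doubled separators, but it does nothing for symlinks or case-insensitive filesystems, so it narrows the problem rather than removing it.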
[zlib must be available at start]
I'll agree, and agree with Guido that the coolest thing would be
to make zlib standard.
[Booting - python.zip should be part of the generated sys.path]
Agree. Nice and straightforward.
Now, from the discussion:
[restate Jim]
sys.path contains /C/D/E/Archive.zip
and Archive.zip contains "Q/R/modfoo.pyc"
so "import Q.R.modfoo" is satisfied by
/C/D/E/Archive.zip/Q/R/modfoo.pyc
[restate Finn]
Jython has sys.path /C/D/E/Archive.zip!Lib
and Archive.zip has "Lib/Q/R/modfoo.pyc"
so "import Q.R.modfoo" is satisfied by
/C/D/E/Archive.zip/Lib/Q/R/modfoo.pyc
[restate Guido]
Why not use /C/D/E/Archive.zip/Lib on sys.path?
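
For comparison, a sketch of the spelling Guido suggests, tried against a present-day CPython, where zipimport accepts a directory inside an archive as a sys.path entry. That is an assumption about current behaviour, not about the PEP 273 patch, and source files are used instead of .pyc's for brevity.

```python
import importlib
import os
import sys
import tempfile
import zipfile

work = tempfile.mkdtemp()
archive = os.path.join(work, "Archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    # Everything lives under a Lib/ prefix inside the archive.
    zf.writestr("Lib/Q/__init__.py", "")
    zf.writestr("Lib/Q/R/__init__.py", "")
    zf.writestr("Lib/Q/R/modfoo.py", "NAME = 'modfoo'\n")

# The sys.path entry names the archive plus the subdirectory within it.
sys.path.insert(0, os.path.join(archive, "Lib"))
importlib.invalidate_caches()

modfoo = importlib.import_module("Q.R.modfoo")
print(modfoo.NAME, modfoo.__file__)
```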
I use embedded archives. So sys.path will have an entry like:
"/path/to/executable?84758" which says that the archive
starts at position 84758 in the file "executable".
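
Roughly, splitting such an entry could look like the sketch below. The split_embedded_entry function is hypothetical, and treating a trailing "?digits" as the offset is an assumption based only on the example entry above.

```python
def split_embedded_entry(entry):
    """Split 'file?offset' into (file, offset); offset is None if absent."""
    path, sep, offset = entry.rpartition("?")
    if not sep or not offset.isdigit():
        # No '?' or a non-numeric tail: treat the whole thing as a plain path.
        return entry, None
    return path, int(offset)

print(split_embedded_entry("/path/to/executable?84758"))
print(split_embedded_entry("/C/D/E/Archive.zip"))
```

An importer would then open the file, seek to the offset, and read the archive from there.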
Anything beyond the simple approach Jim takes gets into
some very URL-ish territory. That's fine by me :-).
I don't really like the idea of hacking special knowledge of zip
files into import.c (which is already a specialist in
filesystems). Like Finn said, if this is a deployment issue (we
want zip files now, and are willing to live with strict limitations /
rules to get it), then OK (as long as it supports __path__ and
some way of dealing with dynamic libs in packages).
Personally, I think package support stretched import.c to its
monolithic limits and it's high time the code was refactored to
make it sanely extensible.
- Gordon