
Jim Ahlstrom wrote: [From PEP 273]
Currently, sys.path is a list of directory names as strings.
[GMcM]
A nit: that's in practice, but not by definition (a couple std modules have been patched to allow other things on sys.path). After extensive use of imputil (which puts objects on sys.path), I think we might as well make it official that sys.path is a list of strings.
Interesting. So you think imputil is wrong to put objects there? Why? (Not arguing, just interested in your experience.)
[Subdirectory Equivalence]
This is a bit misleading - it can be read as implying that the importable modules are in a flat namespace. They're not. To get R.Q.modfoo imported, R.__init__ and R.Q.__init__ must be imported. The __init__ modules have the opportunity to play with __path__ to break the equivalence of R.Q.modfoo to R/Q/modfoo.py. Or (more likely), play games with attributes so that Q is, say, an instance, or maybe a module imported from someplace else.
Question: if Archive.zip/Q/__init__.pyc does "__path__.append('some/real/directory')", will it work?
It should.
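A minimal sketch of the question above, assuming a Python whose zip import can load source files (the current zipimport module does; the PEP as quoted here does not). The package name qpkg_demo and the temporary paths are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Hypothetical layout: a real directory holding modfoo.py, plus an archive
# whose package __init__ appends that directory to the package __path__.
workdir = tempfile.mkdtemp()
real_dir = os.path.join(workdir, "real")
os.mkdir(real_dir)
with open(os.path.join(real_dir, "modfoo.py"), "w") as f:
    f.write("WHERE = 'real directory'\n")

archive = os.path.join(workdir, "Archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("qpkg_demo/__init__.py", "__path__.append(%r)\n" % real_dir)

# Put the archive itself on sys.path, then import a submodule that lives
# only in the real directory.
sys.path.insert(0, archive)
importlib.invalidate_caches()
modfoo = importlib.import_module("qpkg_demo.modfoo")
print(modfoo.WHERE)
```

The submodule is found only because the package's __path__ is consulted entry by entry, so an archive entry and a plain directory can sit side by side.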
[dynamic libs]
It might be possible to extract the dynamic module from the zip file, write it to a plain file and load it.
I think you can nail the door shut on this one. On many OSes, making dynamic libs available to a process requires settings that can only (sanely) be made before the process starts.
And I believe some systems require shared libraries to be owned by root.
OTOH, it's common practice these days to put dynamic libs inside packages. That needs to be dealt with at runtime, and at build time (since it breaks the expectation that you can just "zip up" the package).
Yes. This is important.
[no *.py files]
You also can't import source files *.py from a zip archive.
Apparently some Linuxes / RPM distributions don't deliver .pyc's or .pyo's. Since they're installed as root and run as some-poor-user, I'm afraid there are quite a few installations running off .py's all the time. So while it's definitely sub-optimal, I'm not sure it should be outlawed.
Heh, this is an argument for Jim's position -- it would have come out during testing that way, since the imports would fail. ;-)
[Guido]
But it would still be good if the .py files were also in the zip file, so the source can be used in tracebacks etc.
As you probably remember me saying before, I think this is a very minor nice-to-have. For one thing, you've probably done enough testing before stuffing your code into an archive that you're past the point of looking at a traceback and going "doh!". Most of the time you'll need to look at the code anyway.
Testing is for wimps. :-)
OTOH, this [more from Guido]:
A C API to get a source line from a filename might be a good idea (plus a Python API).
points the way towards something I'm very much in favor of: deferring things to that-which-did-the-importing.
Yup.
[Efficiency]
The key is the archive name from sys.path joined with the file name (including any subdirectories) within the archive.
Different spellings of the same path are possible in a filesystem, but not in a dictionary. A bit of "harmless" tweaking of sys.path could render an archive unreachable.
Hm, wouldn't the archive just be opened a second time? Or do I misunderstand you?
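To make the spelling concern concrete, here is a minimal sketch, assuming the lookup table is a plain dict keyed on the archive name joined with the member name. The table contents and the archive_key helper are hypothetical, and normalizing the path is only one possible mitigation, not something the PEP text specifies.

```python
import posixpath

# Hypothetical table built when the archive was first seen on sys.path.
table = {"/C/D/E/Archive.zip/Q/R/modfoo.pyc": "<code for modfoo>"}

def archive_key(archive, member):
    """Join a normalized archive path with a member name to form a key."""
    return posixpath.normpath(archive) + "/" + member

# The same archive, spelled differently after some sys.path tweaking.
respelled = "/C/D/../D/E/Archive.zip"

raw_key = respelled + "/" + "Q/R/modfoo.pyc"
print(raw_key in table)                                    # exact-string miss
print(archive_key(respelled, "Q/R/modfoo.pyc") in table)   # normalized hit
```

Whether a miss makes the archive unreachable or merely causes it to be opened a second time depends on what the importer does on a miss, which this sketch does not model.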
[zlib must be available at start]
I'll agree, and agree with Guido that the coolest thing would be to make zlib standard.
But we'd have to make sure it's statically linked. (Fortunately, we already link it statically on Windows.)
[Booting - python.zip should be part of the generated sys.path]
Agree. Nice and straightforward.
Now, from the discussion:
[restate Jim] sys.path contains /C/D/E/Archive.zip and Archive.zip contains "Q/R/modfoo.pyc" so "import Q.R.modfoo" is satisfied by /C/D/E/Archive.zip/Q/R/modfoo.pyc
[restate Finn] Jython has sys.path /C/D/E/Archive.zip!Lib and Archive.zip has "Lib/Q/R/modfoo.pyc" so "import Q.R.modfoo" is satisfied by /C/D/E/Archive.zip/Lib/Q/R/modfoo.pyc
[restate Guido] Why not use /C/D/E/Archive.zip/Lib on sys.path?
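The first two restatements can be sketched as follows. This is an illustration only, assuming the archive's member names are available as a set; split_entry and satisfied_by are hypothetical names, and the sketch ignores the package __init__ imports discussed earlier.

```python
def split_entry(entry):
    """Split a sys.path entry into (archive, prefix).

    A '!' separates archive and prefix in the Jython-style spelling;
    a plain entry has an empty prefix.
    """
    archive, _sep, prefix = entry.partition("!")
    return archive, prefix

def satisfied_by(entry, module_name, members, suffix=".pyc"):
    """Return the archive path that would satisfy the import, or None."""
    archive, prefix = split_entry(entry)
    member = module_name.replace(".", "/") + suffix
    if prefix:
        member = prefix + "/" + member
    return archive + "/" + member if member in members else None

# Jim's spelling: the archive itself is the sys.path entry.
jim = satisfied_by("/C/D/E/Archive.zip", "Q.R.modfoo", {"Q/R/modfoo.pyc"})

# Finn's spelling: archive plus a '!'-separated prefix inside it.
finn = satisfied_by("/C/D/E/Archive.zip!Lib", "Q.R.modfoo",
                    {"Lib/Q/R/modfoo.pyc"})
print(jim)
print(finn)
```

Supporting the third spelling (/C/D/E/Archive.zip/Lib) would additionally require deciding where the archive name ends inside the path, which is left out here.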
I use embedded archives. So sys.path will have an entry like: "/path/to/executable?84758" which says that the archive starts at position 84758 in the file "executable".
Anything beyond the simple approach Jim takes gets into some very URL-ish territory. That's fine by me :-).
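For the embedded-archive spelling, a minimal sketch of pulling the file name and offset apart. The parse_embedded helper is hypothetical and is not taken from the actual implementation; it only assumes the "path?offset" form quoted above.

```python
def parse_embedded(entry):
    """Split 'path?offset' into (path, offset).

    An entry without a numeric '?offset' suffix is returned unchanged
    with an offset of 0.
    """
    path, sep, offset = entry.rpartition("?")
    if not sep or not offset.isdigit():
        return entry, 0
    return path, int(offset)

print(parse_embedded("/path/to/executable?84758"))
print(parse_embedded("/C/D/E/Archive.zip"))
```

Splitting on the last "?" keeps any earlier "?" characters as part of the file name, which is one reason this starts to look like URL parsing.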
I don't really like the idea of hacking special knowledge of zip files into import.c (which is already a specialist in filesystems). Like Finn said, if this is a deployment issue (we want zip files now, and are willing to live with strict limitations / rules to get it), then OK (as long as it supports __path__ and some way of dealing with dynamic libs in packages).
Personally, I think package support stretched import.c to its monolithic limits and it's high time the code was refactored to make it sanely extensible.
Yes -- this has been on my TODO list for ages. But who's gonna DO it?
--Guido van Rossum (home page: http://www.python.org/~guido/)