
Guido van Rossum wrote:
Let's first complete the requirements gathering.
Yes.
Are these requirements reasonable? Will they make an implementation too complex?
I think you can get 90% of where you want to be with something much simpler. And the simpler implementation will be useful in the 100% solution, so it is not wasted time. How about if we just design a Python archive file format; provide code in the core (in Python or C) to import from it; provide a Python program to create archive files; and provide a Standard Directory to put archives in so they can be found quickly. For extensibility and control, we add functions to the imp module. Detailed comments follow:
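To make the proposal above concrete, here is a minimal sketch of what an uncompressed archive format, a creation program, and the import code might look like. The layout is purely an assumption for illustration: the `PYAR` tag, the trailer layout, and the function names `write_archive`, `read_toc`, and `import_from_archive` are hypothetical and do not come from the original message.

```python
import marshal
import struct
import sys
import types

MAGIC = b"PYAR"  # hypothetical 4-byte tag; not an existing format


def write_archive(path, sources):
    """Write an uncompressed archive. `sources` maps module name -> source text."""
    toc = {}
    with open(path, "wb") as f:
        for name, text in sources.items():
            code = compile(text, "<archive:%s>" % name, "exec")
            data = marshal.dumps(code)
            toc[name] = (f.tell(), len(data))  # offset and length of this entry
            f.write(data)
        toc_offset = f.tell()
        toc_data = marshal.dumps(toc)
        f.write(toc_data)
        # Fixed-size trailer: tag, table-of-contents offset, table length.
        f.write(MAGIC + struct.pack("<II", toc_offset, len(toc_data)))


def read_toc(path):
    """Return the table of contents by reading the 12-byte trailer first."""
    with open(path, "rb") as f:
        f.seek(-12, 2)
        trailer = f.read(12)
        if trailer[:4] != MAGIC:
            raise ValueError("not an archive: %r" % path)
        toc_offset, toc_len = struct.unpack("<II", trailer[4:])
        f.seek(toc_offset)
        return marshal.loads(f.read(toc_len))


def import_from_archive(path, name):
    """Load one module from the archive and register it in sys.modules."""
    offset, length = read_toc(path)[name]
    with open(path, "rb") as f:
        f.seek(offset)
        code = marshal.loads(f.read(length))
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    sys.modules[name] = module
    return module
```

Putting the table of contents at the end, located through a fixed-size trailer, is one possible design choice: the reader needs only one seek to find every module, which is part of why an archive could start up faster than a directory search.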
Compatibility issues:
---------------------
[list of current features...]
Easily met by keeping the current C code.
New features:
-------------
- Integrated support for Greg Ward's distribution utilities (i.e. a module prepared by the distutil tools should install painlessly)
- Good support for prospective authors of "all-in-one" packaging tools like Gordon McMillan's win32 installer or /F's squish. (But I *don't* require backwards compatibility for existing tools.)
These tools go well beyond just an archive file format, but hopefully a file format will help. Greg and Gordon should be able to control the format so it meets their needs. We need a standard format.
- Standard import from zip or jar files, in two ways:
(1) an entry on sys.path can be a zip/jar file instead of a directory; its contents will be searched for modules or packages
(2) a file in a directory that's on sys.path can be a zip/jar file; its contents will be considered as a package (note that this is different from (1)!)
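Option (1) above can be illustrated with a short sketch. It relies on the fact that current CPython releases accept a zip file as a `sys.path` entry; the file and module names (`bundle.zip`, `zipped_demo`) and the helper `make_zip` are made up for the example, and the archive is written uncompressed.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def make_zip(path, modules):
    """Write `modules` (name -> source text) into an uncompressed zip file."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_STORED) as zf:
        for name, text in modules.items():
            zf.writestr(name + ".py", text)


workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")
make_zip(archive, {"zipped_demo": "ANSWER = 42\n"})

# The zip file itself, not a directory, becomes the sys.path entry.
sys.path.insert(0, archive)
importlib.invalidate_caches()
zipped_demo = importlib.import_module("zipped_demo")
```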
I don't like sys.path at all. It is currently part of the problem. I suggest that archive files MUST be put into a known directory. On Windows this is the directory of the executable, sys.executable. On Unix this is $PREFIX plus version, namely "%s/lib/python%s/" % (sys.prefix, sys.version[0:3]). Other platforms can have different rules. We should also have the ability to append archive files to the executable or a shared library, assuming the OS allows this (Windows and Linux do allow it). This is the first location searched, nails the archive to the interpreter, insulates us from an erroneous sys.path, and enables single-file Python programs.
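The directory rule just described could be sketched as a single function. The name `standard_archive_dir` and its parameters are hypothetical, and the sketch builds the version from `sys.version_info` rather than slicing `sys.version`, since a three-character slice would truncate a two-digit minor version.

```python
import os
import sys


def standard_archive_dir(platform=sys.platform, executable=sys.executable,
                         prefix=sys.prefix, version_info=sys.version_info):
    """Return the one known directory that would be searched for archives."""
    if platform == "win32":
        # Windows rule: the directory holding the interpreter executable.
        return os.path.dirname(executable)
    # Unix rule: $PREFIX plus major.minor version.
    version = "%d.%d" % (version_info[0], version_info[1])
    return "%s/lib/python%s/" % (prefix, version)
```

For example, `standard_archive_dir("linux", "/usr/bin/python", "/usr", (1, 5, 2))` returns `"/usr/lib/python1.5/"`. Other platforms would need their own branches, as the message says.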
I don't particularly care about supporting all zip compression schemes; if Java gets away with only supporting gzip compression in jar files, so can we.
We don't need compression. The whole ./Lib is 1.2 Meg, and if we compress it to zero we save a Meg. Irrelevant. Installers provide compression anyway, so Python programs will be compressed when they are shipped. The problems are that Python does not ship with compression, so we would have to add it, we would have to support it and its current method of compression forever, and it adds complexity.
- Easy ways to subclass or augment the import mechanism along different dimensions. For example, while none of the following features should be part of the core implementation, it should be easy to add any or all:
[ List of new features including hooks...]
Sigh, this proposal does not provide for this. It seems like a job for imputil. But if the file format and import code are available from the imp module, they can be used as part of the solution.
- support for a new compression scheme to the zip importer
I guess compression should be easy to add if Python ships with a compression module.
- a cache for file locations in directories/archives, to improve startup time
If the Python library is available as an archive, I think startup will be greatly improved anyway.
Implementation:
---------------
- There must clearly be some code in C that can import certain essential modules (to solve the chicken-or-egg problem), but I don't mind if the majority of the implementation is written in Python. Using Python makes it easy to subclass.
Yes.
- In order to support importing from zip/jar files using compression, we'd at least need the zlib extension module and hence libz itself, which may not be available everywhere.
That's a good reason to omit compression. At least for now.
- I suppose that the bootstrap is solved using a mechanism very similar to what freeze currently uses (other solutions seem to be platform dependent).
Yes, except that we need to be careful to preserve the freeze feature for users. We don't want to take it over.
- I also want to still support importing *everything* from the filesystem, if only for development. (It's hard enough to deal with the fact that exceptions.py is needed during Py_Initialize(); I want to be able to hack on the import code written in Python without having to rebuild the executable all the time.)
Yes, we need a function in imp to turn archives off:

    import imp
    imp.archiveEnable(0)
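One way such a switch might behave is sketched below using today's importlib hooks rather than the imp module. Everything here is an assumption for illustration: `archive_enable`, `ArchiveFinder`, and `_SourceLoader` are hypothetical names, and a plain dict of source text stands in for a real archive file.

```python
import importlib.abc
import importlib.util
import sys

_archives_enabled = True  # module-level switch consulted on every import


def archive_enable(flag):
    """Turn archive imports on or off; return the previous setting."""
    global _archives_enabled
    previous = _archives_enabled
    _archives_enabled = bool(flag)
    return previous


class _SourceLoader(importlib.abc.Loader):
    """Executes stored source text in the new module's namespace."""

    def __init__(self, text):
        self.text = text

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        exec(compile(self.text, "<archive>", "exec"), module.__dict__)


class ArchiveFinder(importlib.abc.MetaPathFinder):
    """Finder backed by a dict (name -> source), standing in for an archive."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        if not _archives_enabled or fullname not in self.sources:
            return None  # fall through to the ordinary filesystem import
        loader = _SourceLoader(self.sources[fullname])
        return importlib.util.spec_from_loader(fullname, loader)
```

With an `ArchiveFinder` placed at the front of `sys.meta_path`, calling `archive_enable(0)` makes every lookup fall through to the filesystem, which is the development behaviour asked for above.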
Finally, to what extent does this impact the desire for dealing differently with the Python bytecode compiler (e.g. supporting optimizers written in Python)? And does it affect the desire to implement the read-eval-print loop (the >>> prompt) in Python?
I don't think it impacts these at all.

Jim Ahlstrom