
[Guido]
Compatibility issues:
---------------------
- the core API may be incompatible, as long as compatibility layers can be provided in pure Python
Good idea. Question: we have keyword import, __import__, imp and PyImport_*. Which of those (if any) define the "core API"? [rexec, freeze: yes]
- load .py/.pyc/.pyo files and shared libraries from files
Shared libraries? Might that not involve some rather shady platform-specific magic? If it can be kept kosher, I'm all for it; but I'd say no if it involved, um, undocumented features.
- support for packages
Absolutely. I'll just comment that the concept of package.__path__ is also affected by the next point.
- sys.path and sys.modules should still exist; sys.path might have a slightly different meaning
- $PYTHONPATH and $PYTHONHOME should still be supported
If sys.path changes meaning, should not $PYTHONPATH also?
New features:
-------------
- Integrated support for Greg Ward's distribution utilities (i.e. a module prepared by the distutil tools should install painlessly)
I assume that this is mostly a matter of $PYTHONPATH and other path manipulation mechanisms?
- Good support for prospective authors of "all-in-one" packaging tools like Gordon McMillan's win32 installer or /F's squish. (But I *don't* require backwards compatibility for existing tools.)
I guess you've forgotten: I'm that *really* tall guy <wink>.
- Standard import from zip or jar files, in two ways:
(1) an entry on sys.path can be a zip/jar file instead of a directory; its contents will be searched for modules or packages
I don't mind this, but it depends on whether sys.path changes meaning.
(2) a file in a directory that's on sys.path can be a zip/jar file; its contents will be considered as a package (note that this is different from (1)!)
But it's affected by the same considerations (eg, do we start with filesystem names and wrap them in importers, or do we just start with importer instances / specifications for importer instances).
I don't particularly care about supporting all zip compression schemes; if Java gets away with only supporting gzip compression in jar files, so can we.
I think this is a matter of what zip compression is officially blessed. I don't mind if it's none; providing / creating zipped versions for platforms that support it is nearly trivial.
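For concreteness, here is a minimal sketch of case (1), an archive sitting directly on sys.path. It relies on the zip import support that later shipped with CPython, which is not necessarily the design under discussion here. The names bundle.zip, spam_mod and spam_pkg are made up for the illustration. Case (2), a zip file inside a directory being treated as a package, is not shown; as far as I know, stock CPython does not do that.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one module and one package.
# All file and module names here are hypothetical.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("spam_mod.py", "VALUE = 42\n")
    zf.writestr("spam_pkg/__init__.py", "NAME = 'pkg'\n")
    zf.writestr(
        "spam_pkg/eggs.py",
        "def hello():\n    return 'hello from the archive'\n",
    )

# Case (1): the archive itself is an entry on sys.path.
sys.path.insert(0, archive)
importlib.invalidate_caches()

import spam_mod
from spam_pkg import eggs

print(spam_mod.VALUE)
print(eggs.hello())
print(spam_mod.__file__)  # a path that starts with the archive's own path
```

Note that sys.path still holds a plain filesystem name here; something else has to decide that the name refers to an archive rather than a directory, which is exactly the "what does sys.path mean" question.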
- Easy ways to subclass or augment the import mechanism along different dimensions. For example, while none of the following features should be part of the core implementation, it should be easy to add any or all:
- support for a new compression scheme to the zip importer
- support for a new archive format, e.g. tar
- a hook to import from URLs or other data sources (e.g. a "module server" imported in CORBA) (this needn't be supported through $PYTHONPATH though)
Which begs the question of the meaning of sys.path; and if it's still filesystem names, how do you get one of these in there?
- a hook that imports from compressed .py or .pyc/.pyo files
- a hook to auto-generate .py files from other filename extensions (as currently implemented by ILU)
- a cache for file locations in directories/archives, to improve startup time
- a completely different source of imported modules, e.g. for an embedded system or PalmOS (which has no traditional filesystem)
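A rough sketch of the "completely different source" idea, assuming the finder/loader protocol that modern Python exposes through sys.meta_path and importlib (which postdates this discussion). The dictionary REMOTE_SOURCES stands in for a module server or an embedded store; it, DictImporter and served_mod are hypothetical names.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for a "module server": logical name -> source text.
REMOTE_SOURCES = {
    "served_mod": "def answer():\n    return 6 * 7\n",
}


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads modules from a mapping instead of the filesystem."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None  # not ours; let the next finder try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        source = self.sources[module.__name__]
        code = compile(source, "<served:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


# The hook is registered as an object, not as a string on sys.path.
sys.meta_path.append(DictImporter(REMOTE_SOURCES))

import served_mod

print(served_mod.answer())
```

The point of the sketch is the last step: the importer instance goes into a list of its own, so nothing has to be encoded as a filesystem name.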
- Note that different kinds of hooks should (ideally, and within reason) properly combine, as follows: if I write a hook to recognize .spam files and automatically translate them into .py files, and you write a hook to support a new archive format, then if both hooks are installed together, it should be possible to find a .spam file in an archive and do the right thing, without any extra action. Right?
A bit of discussion: I've got 2 kinds of archives. One can contain anything and is much like a zip (and probably should be a zip). The other contains only compressed .pyc or .pyo. The latter keys its contents by logical name, not filesystem name. There are no extensions, and when a package is imported, the code object returned is the __init__ code object (vs returning None and letting the import mechanism come back and ask for package.__init__). When you're building an archive, you have to go through the .py / .pyc / .pyo / is it a package / maybe compile logic anyway. Why not get it all over with, so that at runtime there are no choices to be made? Which means (for this kind of archive) that including somebody's .spam in your archive isn't a matter of a hook, but a matter of adding to the archive's build smarts.
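The second kind of archive can be sketched roughly as below. This is an illustration only, not the actual installer code: an in-memory dict stands in for the archive file, and build_archive, load_from_archive and the toolkit names are all hypothetical. All the "is it a package / compile it" decisions happen in the build step, and the runtime side does a single lookup by logical name.

```python
import marshal
import sys
import types
import zlib


def build_archive(sources):
    """Build step: compile everything and key entries by dotted name.

    `sources` maps dotted name -> (is_package, source_text). For a
    package, the source text is that of its __init__.
    """
    archive = {}
    for name, (is_package, text) in sources.items():
        code = compile(text, "<archive:%s>" % name, "exec")
        archive[name] = (is_package, zlib.compress(marshal.dumps(code)))
    return archive


def load_from_archive(archive, name):
    """Runtime step: one lookup, no extension or package probing."""
    is_package, blob = archive[name]
    code = marshal.loads(zlib.decompress(blob))
    module = types.ModuleType(name)
    if is_package:
        module.__path__ = []  # having __path__ marks it as a package
    sys.modules[name] = module
    exec(code, module.__dict__)
    return module


ARCHIVE = build_archive({
    "toolkit": (True, "KIND = 'package init'\n"),
    "toolkit.util": (False, "def double(x):\n    return 2 * x\n"),
})

pkg = load_from_archive(ARCHIVE, "toolkit")
util = load_from_archive(ARCHIVE, "toolkit.util")
print(pkg.KIND, util.double(21))
```

Supporting a .spam source in this scheme would mean teaching build_archive to translate it before calling compile; the runtime loader would not change.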
- It should be possible to write hooks in C/C++ as well as Python
- Applications embedding Python may supply their own implementations, default search path, etc., but don't have to if they want to piggyback on an existing Python installation (even though the latter is fraught with risk, it's cheaper and easier to understand).
A way of tweaking that which will become sys.path before Py_Initialize would be *most* welcome.
Implementation:
---------------
- There must clearly be some code in C that can import certain essential modules (to solve the chicken-or-egg problem), but I don't mind if the majority of the implementation is written in Python. Using Python makes it easy to subclass.
- In order to support importing from zip/jar files using compression, we'd at least need the zlib extension module and hence libz itself, which may not be available everywhere.
- I suppose that the bootstrap is solved using a mechanism very similar to what freeze currently uses (other solutions seem to be platform dependent).
There are other possibilities here, but I have only half-formulated ideas at the moment. The critical part for embedding is to be able to *completely* control all path related logic.
- I also want to still support importing *everything* from the filesystem, if only for development. (It's hard enough to deal with the fact that exceptions.py is needed during Py_Initialize(); I want to be able to hack on the import code written in Python without having to rebuild the executable all the time.)
Let's first complete the requirements gathering. Are these requirements reasonable? Will they make an implementation too complex? Am I missing anything?
I'll summarize as follows: 1) What "sys.path" means (and how its construction can be manipulated) is critical. 2) See 1.
Finally, to what extent does this impact the desire for dealing differently with the Python bytecode compiler (e.g. supporting optimizers written in Python)? And does it affect the desire to implement the read-eval-print loop (the >>> prompt) in Python?
I can assure you that code.py runs fine out of an archive :-).

- Gordon