
2009/7/3 Paul Moore <p.f.moore@gmail.com>:
This is a good point. Distutils only installs files in the filesystem - it has no facilities for installing packages based on any other sort of PEP 302 based importers. Hence, PEP 376 in principle should only relate to filesystem-based distributions. But it also mentions zipfile-based distributions: "Notice that the API is organized in five classes that work with directories and Zip files (so it works with files included in Zip files, see PEP 273 for more details [8])."
This is wrong. The PEP should either (1) restrict itself to filesystem based implementations (leaving the problem of other PEP 302 loaders to systems that manage these) or (2) be defined in a sufficiently general way that it can be implemented for *any* PEP 302 based loader - which probably means extending the PEP 302 protocols - and supplying zipfile functions as an example of how this is used.
I believe that (1) is unlikely to be sufficient for real world use. Zip files (eggs, py2exe embedded modules, etc) are far too important a real world use case to ignore. The problem with (2) is that it requires significant extra work. But special-casing zip files (as the present implementation appears to do) will break as soon as any other PEP 302 compliant format becomes popular.
Moreover, the proposed ``egginfo_dirname()`` routine is a step back from the ``pkg_resources`` approach, which does not require resources to reside on a traditional filesystem.
On the other hand, pkgutil.get_data is the standard library means of reading resources from a package. This is PEP 302 compliant now. This new PEP doesn't affect that.
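To make the point about pkgutil.get_data concrete, here is a minimal sketch that reads a resource from a package stored inside a zip file. The package name "demo_pkg" and the file "data.txt" are made up for the illustration; only pkgutil.get_data itself comes from the discussion above.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zip archive containing a package with one data file.
# "demo_pkg" and "data.txt" are hypothetical names used only for this sketch.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", "")
    zf.writestr("demo_pkg/data.txt", "hello from the zip")

# Put the archive itself on sys.path; zipimport picks it up as a PEP 302 importer.
sys.path.insert(0, archive)

# get_data() asks the package's loader for the bytes, so the caller never
# needs a real filesystem path to the resource.
payload = pkgutil.get_data("demo_pkg", "data.txt")
print(payload)
```

The same call should work unchanged if the package sits in an ordinary directory, which is the sense in which resource reading is already independent of the storage format.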
Right. While it would be feasible to make pkgutil work with PEP 302 loaders, we would still need to define, in a generic way, the content of the RECORD files. Right now it works for directories and zipped files, since the content is expressed as '/'-separated paths. And if I understand PEP 302 right, any backend would be able to handle these paths no matter how they are stored, as long as they implement APIs like get_data().
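As a rough illustration of why '/'-separated paths work for both directories and zip files, the sketch below resolves the same relative path in each kind of container. The helper read_recorded_file() and the sample file names are hypothetical; this is not the PEP 376 API, and it special-cases zip files rather than going through a PEP 302 loader.

```python
import os
import tempfile
import zipfile


def read_recorded_file(container, record_path):
    """Read a '/'-separated RECORD-style path from a directory or a zip file.

    Hypothetical helper for illustration; not part of PEP 376 or pkgutil.
    """
    if zipfile.is_zipfile(container):
        # Zip member names already use '/' as the separator.
        with zipfile.ZipFile(container) as zf:
            return zf.read(record_path)
    # On a real filesystem, translate '/' into the platform separator.
    native = os.path.join(container, *record_path.split("/"))
    with open(native, "rb") as fh:
        return fh.read()


# Demo: the same relative path resolves in both kinds of container.
tmpdir = tempfile.mkdtemp()
os.makedirs(os.path.join(tmpdir, "demo_pkg"))
with open(os.path.join(tmpdir, "demo_pkg", "mod.py"), "wb") as fh:
    fh.write(b"x = 1\n")

archive = os.path.join(tmpdir, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/mod.py", b"x = 1\n")

from_dir = read_recorded_file(tmpdir, "demo_pkg/mod.py")
from_zip = read_recorded_file(archive, "demo_pkg/mod.py")
```

A loader-based variant would replace the two branches with a single get_data() call on whatever importer owns the container.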
What PEP 302 doesn't provide is package management. But Python itself doesn't provide package management, except in the form of distutils setup.py install - which is solely filesystem based.
Maybe there's a case for extending PEP 302 and distutils to allow integrated management of other forms of importer format, but that's a huge other project, which no-one to my knowledge is even looking at.
Sounds like a fully-featured package management system, which is, imho, out of scope. And I don't see PEP 376 making it impossible for someone to build such a packaging system on top of distutils. I've started one myself for the sake of experimentation, with built-in multiversion support, as a full replacement for site-packages.
Eggs are fundamentally a PEP 302 zip file format. There are some extra bits of metadata for setuptools/easy_install in there (as I understand things) but essentially they are zip files. When you say "decoupling the egg format", I assume you mean "decoupling the egg metadata" - which is fine, but to properly decouple, you need API level access to the metadata. PEP 376 offers read-only access, but as you rightly point out, it is only for filesystem data (and some form of zip file, which appears to be limited in some way, as it isn't PEP 302 based, and the actual format isn't defined anywhere).
Also, PEP 376's goal is to define a single standard location of egg-info files for filesystem data. The zip form was built so it could work with zipped site-packages directories, like the ones the py2app project uses.
The basic point here is that PEP 376 needs to define precisely how pkgutil.get_distributions() scans sys.path looking for ".egg-info directories". What does it do for sys.path entries that don't correspond to filesystem directories? (Note - these may or may not be zip files. Even if they are zip files, an earlier entry on sys.path_hooks could have taken precedence. At the very least, you should only process path entries as zip files if their importer - in sys.path_importer_cache or via an explicit path hook scan - is a zipimporter object.)
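The check suggested in that note could look roughly like the sketch below. The function name handled_by_zipimporter() and the demo paths are made up for the illustration; pkgutil.get_importer() is used because it looks in sys.path_importer_cache first and otherwise scans sys.path_hooks.

```python
import os
import pkgutil
import tempfile
import zipfile
import zipimport


def handled_by_zipimporter(path_entry):
    """Return True only if the importer chosen for path_entry is a zipimporter.

    pkgutil.get_importer() consults sys.path_importer_cache and falls back
    to scanning sys.path_hooks, so an earlier hook that claims the entry
    takes precedence over zipimport.
    """
    importer = pkgutil.get_importer(path_entry)
    return isinstance(importer, zipimport.zipimporter)


# Demo with throwaway paths (hypothetical names).
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "site-packages.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("placeholder.txt", "x")

zip_result = handled_by_zipimporter(archive)
dir_result = handled_by_zipimporter(tmpdir)
print(zip_result, dir_result)
```

A scan built on such a check would treat a path entry as a zip file only when the import system itself does, rather than guessing from the file extension or contents.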
I'll add more details on that part. Right now it visits directories and zip files.

Tarek