New subject: PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

July 3, 2009

This post basically follows on from the previous thread about PEP 376
(Changing the .egg-info structure) where it was pointed out that the
PEP doesn't cover PEP 302 import hooks (except in the explicit special
case of zip files). This is likely to be a fairly long posting, but
I'd like to try and cover the subtleties a little, so that people can
comment without having to refer back to too many documents.

Comments are most definitely welcome! While I know PEP 302 reasonably
well, and the zip importer implementation, I'm not an expert on the
egg-info structure, or the practical drivers behind it, so if I miss
key issues because of a too-theoretical approach, I'd be grateful for
corrections.

The basic structure of PEP 302 imports is as follows:

  - scan sys.meta_path checking each finder in turn
  - if nothing found, scan sys.path and for each entry pass it to each
element of sys.path_hooks, to get a finder to check
  - do the default filesystem processing on path items for which no
path hook applies

The first finder (if any) to recognise the module name wins (and
returns a loader responsible for creating the actual module object).

So it's the finders that are responsible for scanning the "filesystem"
(more accurately, the namespace that finder manages).
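
As a rough illustration of the lookup order described above, here is a minimal sketch that yields candidate finders in that order. The names iter_finders and path_entry_finder are made up for this example, and it follows the description in this post rather than the internals of any particular interpreter version.

```python
import sys


def path_entry_finder(entry, hooks=None):
    """Return the first finder a path hook produces for `entry`, or None.

    Hypothetical helper: a path hook signals "not mine" by raising ImportError.
    """
    for hook in (sys.path_hooks if hooks is None else hooks):
        try:
            return hook(entry)
        except ImportError:
            continue  # this hook does not handle the entry, try the next one
    return None


def iter_finders(path=None, meta_path=None, hooks=None):
    """Yield finders in the order sketched above (hypothetical helper)."""
    # 1. Everything on sys.meta_path is consulted first.
    for finder in (sys.meta_path if meta_path is None else meta_path):
        yield finder
    # 2. Then each sys.path entry is offered to the path hooks.
    for entry in (sys.path if path is None else path):
        finder = path_entry_finder(entry, hooks)
        if finder is not None:
            yield finder
        # 3. Entries no hook claims fall back to default filesystem handling,
        #    which this sketch does not model.
```

The optional arguments exist only so the function can be exercised without touching the real sys.path.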

PEP 302 is entirely couched in terms of modules, packages and module
names. There is no concept of a "distribution" in PEP 376 terms. This
is entirely understandable, as imports don't know about distributions
(see the docutils example in PEP 376 - the docutils *distribution*
contains the *module* roman, but there's nothing about importing roman
that requires you to know that it came from the docutils
distribution).

So we need to introduce a new concept, that of a distribution, into
PEP 302 loaders. And the place it should be located is in the finder
(which handles the "filesystem" aspects of the protocol). So we have
the methods:

Finder.list_distributions()
Returns a list of all distribution names in the "filesystem" managed
by the finder (usually one zip file, path directory, sqlite database,
or whatever).

Finder.get_metadata(distribution_name)
Returns a metadata object for the given distribution (this is the PEP
376 Distribution object). One note here - get_egginfo_file should be
specified as returning a *file-like object* rather than a file
instance. Another note - get_egginfo_file and get_egginfo_files could
just as easily be named get_metadata_file and get_metadata_files -
just as meaningful, and less tied to the egg format (as well as
avoiding the "egg" name, which I personally dislike :-))

Finder.uninstall(distribution_name, filter=callable, installer=name)
Uninstall the given distribution. It's likely that many finders will
be read-only. In that case, this function should return None.
Otherwise, return a list of the "files" removed. (This may need some
clarification, as many finders won't have a concept of a "file name").
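
To show how the three methods might fit together, here is a minimal sketch of a finder whose "filesystem" is just a dictionary held in memory. Everything in it is an assumption made for illustration: InMemoryFinder and the stripped-down Distribution class are invented names, the real PEP 376 Distribution object has more to it, and the filter and installer arguments are accepted but ignored.

```python
import io


class Distribution:
    """Heavily simplified stand-in for the PEP 376 Distribution object."""

    def __init__(self, name, files):
        self.name = name
        self._files = files  # maps metadata file name -> text content

    def get_egginfo_files(self):
        return sorted(self._files)

    def get_egginfo_file(self, path):
        # Returns a file-like object, not a real file instance.
        return io.StringIO(self._files[path])


class InMemoryFinder:
    """Hypothetical finder implementing only the proposed metadata methods."""

    def __init__(self, distributions, read_only=True):
        # distributions maps distribution name -> {file name: text content}
        self._distributions = distributions
        self.read_only = read_only

    def list_distributions(self):
        return sorted(self._distributions)

    def get_metadata(self, distribution_name):
        files = self._distributions[distribution_name]
        return Distribution(distribution_name, files)

    def uninstall(self, distribution_name, filter=None, installer=None):
        # filter and installer are ignored in this sketch.
        if self.read_only:
            return None  # read-only finders cannot remove anything
        files = self._distributions.pop(distribution_name)
        return sorted(files)  # the "files" that were removed
```

For example, InMemoryFinder({"docutils": {"PKG-INFO": "Name: docutils\n"}}) lists one distribution, and its uninstall method returns None because the finder is read-only by default.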

I don't think anything else is needed to support PEP 376.

The prototype implementation of PEP 376 could be extended to work with
finders (doing the relevant sys.meta_path and sys.path_hooks
searches). For the final implementation, the special casing of zip
files would be replaced by an implementation of the extended finder
protocol in the zipimporter object.
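
A possible shape for that extension, as a sketch only: collect the finders from sys.meta_path and from the path hooks, then ask each one that implements the proposed methods. The function names iter_finders and find_distribution are invented here, and finders that lack list_distributions are simply skipped. pkgutil.get_importer is the standard library helper that runs a path entry through sys.path_hooks.

```python
import pkgutil
import sys


def iter_finders(path=None):
    """Yield meta path finders, then one finder per usable path entry."""
    yield from sys.meta_path
    for entry in (sys.path if path is None else path):
        finder = pkgutil.get_importer(entry)  # None if no path hook applies
        if finder is not None:
            yield finder


def find_distribution(name, finders=None):
    """Return the metadata object from the first finder that knows `name`.

    Hypothetical helper: relies on the list_distributions/get_metadata
    methods proposed in this post, which existing finders do not provide.
    """
    for finder in (iter_finders() if finders is None else finders):
        lister = getattr(finder, "list_distributions", None)
        if lister is None:
            continue  # finder does not implement the extended protocol
        if name in lister():
            return finder.get_metadata(name)
    return None
```

As with module lookup, the first finder to recognise the name wins.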

There are almost certainly aspects missing from the above proposal. But
it does have some definite advantages, above and beyond simply
allowing PEP 302 importers to participate in the PEP 376 protocol.
Setuptools-style eggs could be handled simply by creating a
specialised finder (IIUC, they currently just use the standard
zipimporter - the specialised version could subclass this) to override
the metadata methods so as to cater for their specialised egg-info
format. Other formats could be handled similarly.
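
A sketch of what such a specialised finder could look like, assuming a zipped egg that keeps its metadata in EGG-INFO/PKG-INFO. EggFinder is an invented name, the two methods are the proposed extension rather than anything zipimporter provides today, and the parsed PKG-INFO headers are used as a stand-in for a real Distribution object.

```python
import zipfile
import zipimport
from email.parser import Parser

PKG_INFO = "EGG-INFO/PKG-INFO"  # where a zipped egg keeps its metadata


class EggFinder(zipimport.zipimporter):
    """Hypothetical zipimporter subclass adding the metadata methods."""

    def _read_pkg_info(self):
        # self.archive is the path of the zip file this importer manages.
        with zipfile.ZipFile(self.archive) as zf:
            text = zf.read(PKG_INFO).decode("utf-8")  # KeyError if absent
        return Parser().parsestr(text)  # PKG-INFO uses RFC 822 style headers

    def list_distributions(self):
        try:
            name = self._read_pkg_info()["Name"]
        except KeyError:
            return []  # not an egg: no EGG-INFO/PKG-INFO in the archive
        return [name] if name else []

    def get_metadata(self, distribution_name):
        if distribution_name not in self.list_distributions():
            raise KeyError(distribution_name)
        # Stand-in for the PEP 376 Distribution object: the parsed headers.
        return self._read_pkg_info()
```

Module imports are untouched, since everything else is inherited from zipimporter.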

Does this sound sensible? Tarek, would you be OK with me attempting to
modify your prototype to support this protocol? Are there any tests
for PEP 376, so that I can confirm I haven't completely broken
something? If I can, I'll knock up some simple prototype importers for
non-standard examples, and see how they work with all this.

Paul.
