[Distutils] Questions about Python Eggs

Phillip J. Eby pje at telecommunity.com
Mon May 23 04:32:39 CEST 2005


At 11:17 PM 5/18/2005 -0500, Ian Bicking wrote:
>I notice the comment on pkg_resources.require aren't very confident ;)
>It actually doesn't look functional to me, though I haven't tried
>running it.  Is it just meant to raise an ImportError when a requirement
>isn't met, or can it search some directories for appropriate Egg files?
>   This is something I'm very interested in, so if I have the intention
>correct I'd like to help move this function along.

Just a quick followup on this; I've just checked in a version of 
pkg_resources whose 'require()' API is only missing a working 
'find_distributions(path_item)' function.  If you'd like to experiment with 
this, you can try implementing your own 'find_distributions' and 
monkeypatching it into 'pkg_resources'.

The routine should take a sys.path entry (i.e. a string) and yield zero or 
more pkg_resources.Distribution instances representing distributions found 
in the supplied directory, zipfile, or whatever.

The easiest way to create these Distribution instances is using the 
Distribution.from_filename constructor, which takes care of figuring out 
the distribution's platform, name, version, Python version, etc. from the 
full path to the file.
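
To give a feel for what a filename-based constructor has to pull apart, here 
is a minimal sketch of parsing an egg filename.  It assumes a 
'name-version-pyX.Y-platform.egg' naming convention, which this message does 
not spell out; 'parse_egg_filename' and the dict it returns are hypothetical 
names for illustration, not part of pkg_resources.

```python
import os
import re

# Assumed convention: name-version[-pyX.Y[-platform]].egg
# (hypothetical helper, not the pkg_resources implementation)
_EGG_NAME = re.compile(
    r"(?P<name>[^-]+)"
    r"(?:-(?P<version>[^-]+)"
    r"(?:-py(?P<py_version>[^-]+)"
    r"(?:-(?P<platform>.+))?)?)?$"
)


def parse_egg_filename(path):
    """Split the basename of an .egg path into its naming components."""
    base, ext = os.path.splitext(os.path.basename(path))
    if ext.lower() != ".egg":
        raise ValueError("not an .egg filename: %r" % (path,))
    match = _EGG_NAME.match(base)
    if match is None:
        raise ValueError("unrecognized egg name: %r" % (path,))
    # Components that are absent from the filename come back as None.
    return match.groupdict()
```

Components missing from the filename are returned as None, so a caller can 
tell "no platform specified" apart from an empty string.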

You'll also need to supply a 'metadata' object to each Distribution, so it 
can find its dependency list.  In the setuptools.tests.test_resources 
module there's a mock Metadata implementation you can use; it expects to 
receive filenames (like 'depends.txt') and return the contents of the 
corresponding metadata file (from either the .egg file's EGG-INFO 
directory, or from an unpacked PackageName.egg-info directory).
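
As a rough idea of what such a metadata object can look like, here is a 
minimal in-memory sketch.  The class name 'MockMetadata' and its method names 
are assumptions made for illustration; check the mock in 
setuptools.tests.test_resources for the interface that Distribution actually 
expects.

```python
class MockMetadata:
    """Hypothetical stand-in that maps metadata filenames to their text."""

    def __init__(self, files):
        # files: a dict such as {"depends.txt": "Foo\nBar>=1.0\n"}
        self._files = dict(files)

    def has_metadata(self, name):
        return name in self._files

    def get_metadata(self, name):
        # Assumption: a missing metadata file reads as empty text
        # rather than raising an error.
        return self._files.get(name, "")

    def get_metadata_lines(self, name):
        # Yield stripped lines, skipping blanks and '#' comments.
        for line in self.get_metadata(name).splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                yield line
```

A real implementation would read the same names out of the egg's EGG-INFO 
directory or an unpacked .egg-info directory instead of a dict.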

So anyway, the dependency resolution subsystem is now basically working; 
it just lacks the ability to actually scan for .egg files and .egg-info 
directories and get their metadata.  Until that scanning is in place, it'll 
raise DistributionNotFound if you try to 'require()' anything.

The full implementation of 'find_distributions(path_item)' will tie into 
the PEP 302 import framework, so that sys.path entries representing zip 
files will also be usable.  This is important because if a '.egg' file is 
manually placed on sys.path (e.g. via PYTHONPATH), the dependency system 
still needs to know about it.  Thus, calling 
'find_distributions("/path/to/an.egg")' should yield a Distribution object 
for the egg, whose metadata comes from the egg file's "EGG-INFO" 
directory.  Calling 'find_distributions("/some/dir")' should yield a 
Distribution for each .egg file in the directory, and for each .egg-info 
subdirectory.  The main difference between the two is that 'path_item' is 
the Distribution.path for an .egg-info, whereas the .egg file's absolute 
path is its Distribution.path.  In other words, if you find an .egg-info 
directory, it's because the egg in question is *already* on sys.path, but 
if you find a .egg file in the path_item directory, it's an egg that's not 
(necessarily) on sys.path.  (If path_item is itself the path to an .egg 
file, then it is of course already on sys.path.)
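
The path rules above can be sketched with a simplified, filesystem-only 
version.  This is not the PEP 302-aware implementation: it never opens zip 
files, and 'FoundDist' is a hypothetical stand-in for 
pkg_resources.Distribution that only records the path and where the metadata 
would be looked up.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for pkg_resources.Distribution: it records only
# the value the text calls Distribution.path and the metadata location.
FoundDist = namedtuple("FoundDist", "path metadata_dir")


def find_distributions(path_item):
    """Yield FoundDist records for one sys.path entry (filesystem only)."""
    if not os.path.exists(path_item):
        return  # a missing entry yields nothing
    if path_item.lower().endswith(".egg"):
        # The entry is itself an egg, so it is already on sys.path.
        # For a zipped egg, EGG-INFO is a location inside the archive.
        yield FoundDist(path_item, os.path.join(path_item, "EGG-INFO"))
        return
    if not os.path.isdir(path_item):
        return
    for name in sorted(os.listdir(path_item)):
        full = os.path.join(path_item, name)
        if name.lower().endswith(".egg"):
            # An egg found inside the directory: its own absolute path is
            # its path, and it is not necessarily on sys.path.
            yield FoundDist(os.path.abspath(full),
                            os.path.join(full, "EGG-INFO"))
        elif name.lower().endswith(".egg-info") and os.path.isdir(full):
            # Unpacked metadata: the distribution is already importable
            # from path_item, so path_item is its path.
            yield FoundDist(path_item, full)
```

A version meant for real use would build Distribution instances (for example 
via Distribution.from_filename) and attach a metadata object to each, rather 
than returning these placeholder records.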

Whew.  Anyway, I probably won't get around to writing the full, correct, 
PEP 302-integrating 'find_distributions()' until next weekend, so if you 
want to experiment with 'require()' in the meantime, you might take a whack 
at implementing a simpler version of 'find_distributions()'.


