[Distutils] Questions about Python Eggs
Phillip J. Eby
pje at telecommunity.com
Thu May 19 14:39:37 CEST 2005
At 11:17 PM 5/18/2005 -0500, Ian Bicking wrote:
>So, I'm looking at Python Eggs, and I have some questions...
>Why does it create a Package.egg-info/ directory? It seems odd; is this
>meant to replace all the metadata arguments to setup.py? That would be
>fine, I'm happy to get rid of those arguments, but if it's just a copy
>of that data it seems odd to install it alongside the distribution files.
The Package.egg-info directory serves two main purposes:
1. It's a staging area for files to be copied to the .egg file's EGG-INFO directory.
2. It's used to tell the dependency resolution system that an unpacked
distribution is available in the given directory.
Let's go into #1 first. .egg files contain an EGG-INFO directory, whose
purpose is to hold metadata for the distribution. This metadata always
includes a standard PKG-INFO file, currently generated from the setup.py
metadata. This metadata file is how the runtime can recognize that a
zipfile is in fact a Python Egg.
But the metadata directory also contains other files that drive the runtime
or other systems. For example, there's a 'native_libs.txt' that lists all
.pyd/.so files in the distribution, and this is used by the runtime to
ensure that all extensions are extracted at the same time.
The native_libs file is automatically generated, but you can also create an
'eager_resources.txt' file by hand, if there are other resources that
should be extracted together.
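To make the idea of the generated list concrete, here is a minimal sketch of how the contents of a 'native_libs.txt' could be derived from the names in a distribution. It assumes the rule is purely extension-based (.pyd/.so), as described above; the real bdist_egg logic may differ, and `native_libs` is a name invented for this example.

```python
def native_libs(names):
    """Return the member names that look like compiled extensions.

    Assumption: an extension is anything ending in .pyd or .so.
    """
    return sorted(n for n in names if n.lower().endswith((".pyd", ".so")))


# One name per line, the way a plain-text list file would hold them.
listing = "\n".join(native_libs([
    "pkg/__init__.py",
    "pkg/_speed.so",
    "pkg/win/_fast.pyd",
]))
```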
In the future, you'll also be able to include a (hand-created) dependencies
file that specifies what versions of what other distributions you depend
on. You'll create this file in the .egg-info directory, and it will be
added to the .egg file's EGG-INFO directory, and the runtime system will
Last, but not least, EGG-INFO is open to expansion for specific application
or framework metadata. For example, a future WSGI "deployment descriptor"
file might be placed in EGG-INFO, to allow WSGI servers to support
single-file application deployment.
Now, the second purpose of the PackageName.egg-info directory is to support
development work on a distribution that is *not* contained in a .egg file,
when the overall project uses the pkg_resources dependency resolution
system. Let's say that you are working on an application that depends on
'Twisted>=2.0', but you are also developing Twisted itself, and therefore
do not want to use your Twisted-2.0-py2.4-win32.egg file. You place the
directory containing Twisted on your PYTHONPATH -- with Twisted.egg-info in
the same directory. That is, if you add 'twisted_src' to PYTHONPATH, you
have a layout like this:
Of course, you might have other packages in that same directory, and
perhaps other PackageName.egg-info directories. Anyway, when your
application does a 'require("Twisted>=2.0")', the dependency resolution
system will see Twisted.egg-info and read the contained PKG-INFO to check
for version data. Similarly, any other requests to the pkg_resources API
that would ordinarily look for EGG-INFO files, will look for them in
Twisted.egg-info (assuming you asked for information about that particular
version of Twisted, of course).
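The lookup described above can be sketched as follows. The directory layout built by `make_demo_layout` is a hypothetical one (a 'twisted_src' directory holding a 'twisted' package next to 'Twisted.egg-info'), and `egg_info_version` is an invented helper, not part of the pkg_resources API. It relies only on PKG-INFO being a file of RFC 822 style headers.

```python
import os
import tempfile
from email.parser import Parser


def egg_info_version(search_dir, project_name):
    """Read Version from <project_name>.egg-info/PKG-INFO, or return None."""
    pkg_info = os.path.join(search_dir, project_name + ".egg-info", "PKG-INFO")
    if not os.path.isfile(pkg_info):
        return None
    with open(pkg_info, encoding="utf-8") as f:
        # PKG-INFO uses "Header: value" lines, so the email parser can read it.
        return Parser().parse(f)["Version"]


def make_demo_layout(root):
    """Create an assumed development layout under root and return its path:

    twisted_src/
        twisted/__init__.py
        Twisted.egg-info/PKG-INFO
    """
    src = os.path.join(root, "twisted_src")
    os.makedirs(os.path.join(src, "twisted"))
    os.makedirs(os.path.join(src, "Twisted.egg-info"))
    open(os.path.join(src, "twisted", "__init__.py"), "w").close()
    pkg_info = os.path.join(src, "Twisted.egg-info", "PKG-INFO")
    with open(pkg_info, "w", encoding="utf-8") as f:
        f.write("Metadata-Version: 1.0\nName: Twisted\nVersion: 2.0\n")
    return src
```

With 'twisted_src' on PYTHONPATH, a resolver working this way would find version 2.0 without any .egg file being involved.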
Anyway, by default the .egg-info directory created by bdist_egg is placed
in the correct directory so that if you are developing with the source code
being distributed by your setup.py (i.e. your package source is on
PYTHONPATH), then the dependency resolution system will correctly see that
you already have MyPackage-1.5 (or whatever) available.
>I think I'm generally going to prefer non-zip installations of Eggs,
>this way I can apply it to projects that weren't written to use
>pkg_resource, and people can see the source more easily. By any chance
>does anyone have code on hand to unpack an egg appropriately? I'm sure
>it's not hard, but it's not a one-liner I'm trying to cut down on the
If you want to unpack an .egg, you need to rename EGG-INFO to
PackageName.egg-info. Other than this, you can just unpack the whole thing
to a suitable PYTHONPATH directory. Note that this means that you can
unpack to site-packages if you like; just make sure the EGG-INFO directory
is unpacked to PackageName.egg-info in the same directory. In this way,
any packages or applications using the dependency resolution API to find
the package will still work correctly.
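A minimal sketch of such an unpacker, assuming the caller supplies the package name to use for the .egg-info directory. `unpack_egg` is an invented helper for a current Python 3, not something shipped with setuptools.

```python
import io
import os
import tempfile
import zipfile


def unpack_egg(egg_file, package_name, target_dir):
    """Extract an egg into target_dir, storing EGG-INFO/ as
    <package_name>.egg-info/ and everything else unchanged."""
    target_root = os.path.realpath(target_dir)
    with zipfile.ZipFile(egg_file) as zf:
        for info in zf.infolist():
            name = info.filename
            if name == "EGG-INFO" or name.startswith("EGG-INFO/"):
                name = package_name + ".egg-info" + name[len("EGG-INFO"):]
            dest = os.path.realpath(os.path.join(target_root, name))
            # Refuse members that would land outside target_dir.
            if os.path.commonpath([target_root, dest]) != target_root:
                raise ValueError("unsafe path in archive: %r" % info.filename)
            if info.is_dir():
                os.makedirs(dest, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with zf.open(info) as src, open(dest, "wb") as out:
                out.write(src.read())
```

Usage would be along the lines of `unpack_egg("Twisted-2.0-py2.4-win32.egg", "Twisted", some_pythonpath_dir)`, where `some_pythonpath_dir` stands for whichever PYTHONPATH directory you choose.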
(Of course, if you're using the dependency API, it suffices to just drop
the .egg file in site-packages without unpacking it at all, but you asked
>What's the state of depends? What is depends anyway?
It's not yet implemented. There's a parser that can parse "requirement"
specifications, whose syntax looks like:
    PackageName[feature1, feature2] >=2.0, <=3.1, ==3.7, >=4.0
That is, it's a package name (and optional feature list) followed by a
comma-separated list of zero or more version comparison operators. You can
pass a requirement spec like this to the 'require()' API in order to find
the specified package.
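To illustrate the syntax, here is a small regex-based sketch that splits such a spec into its three parts. It is not the parser mentioned above; the operator set (which adds `<`, `>` and `!=` to the operators shown in the example) and the name `parse_requirement` are assumptions, and it makes no attempt to interpret the version strings.

```python
import re

# name, optional [feature, ...] list, then the remaining version specs
_REQ = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*(?:\[([^\]]*)\])?\s*(.*)$")
# one comparison: an operator followed by a version string
_SPEC = re.compile(r"^(<=|>=|==|!=|<|>)\s*(\S+)$")


def parse_requirement(text):
    """Return (name, [features], [(operator, version), ...])."""
    match = _REQ.match(text)
    if not match:
        raise ValueError("invalid requirement: %r" % text)
    name, features, rest = match.groups()
    feature_list = [f.strip() for f in (features or "").split(",") if f.strip()]
    specs = []
    for part in rest.split(","):
        part = part.strip()
        if not part:
            continue
        spec = _SPEC.match(part)
        if not spec:
            raise ValueError("invalid version spec: %r" % part)
        specs.append((spec.group(1), spec.group(2)))
    return name, feature_list, specs
```

For instance, `parse_requirement("Twisted>=2.0")` yields the name, an empty feature list, and a single `(">=", "2.0")` pair.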
"Features" control optional dependencies. For example, to support FastCGI,
PEAK requires the 'fcgiapp' package, but not all applications using PEAK
need FastCGI. So, in PEAK's "depends" file, I could do this:
# other absolute dependencies go here
Now, if you do 'require("PEAK[FastCGI]>=0.5a4")', then the dependency
system will look for 'fcgiapp>=0.1' as well as 'PyProtocols>=1.0a0', once
it finds PEAK-0.5a4-py2.4-win32.egg.
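The effect of features on the set of requirements can be sketched like this. The dictionary layout is purely hypothetical (it is not the format of the "depends" file, which isn't settled), with `None` standing for the unconditional dependencies; `requirements_for` and `PEAK_DEPENDS` are names invented for the example.

```python
def requirements_for(depends, features=()):
    """Collect requirement strings for a distribution.

    depends maps None (always needed) or a feature name to a list of
    requirement strings. Unknown feature names raise KeyError.
    """
    needed = list(depends.get(None, []))
    for feature in features:
        if feature not in depends:
            raise KeyError("unknown feature: %r" % feature)
        needed.extend(depends[feature])
    return needed


# Hypothetical data mirroring the PEAK example in the text.
PEAK_DEPENDS = {
    None: ["PyProtocols>=1.0a0"],
    "FastCGI": ["fcgiapp>=0.1"],
}
```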
Anyway, the actual implementation of this isn't done yet. The current
parser draft can handle dependency specs split across lines (using \
continuation) and comments. There's also a version parser that's roughly a
cross between distutils' LooseVersion and StrictVersion, that can correctly
handle Python's own versioning system (StrictVersion doesn't support
"release candidate" versions) and a wide variety of others.
>I notice the comment on pkg_resources.require aren't very confident ;)
>It actually doesn't look functional to me, though I haven't tried
It isn't functional; I don't even know what will happen if you run
it. It's purely a placeholder sketch for me to remember the algorithms
that Bob Ippolito and I discussed, until I get a chance to actually
> Is it just meant to raise an ImportError when a requirement
>isn't met, or can it search some directories for appropriate Egg files?
It's intended to search for .egg files, as well as recognize that .egg
files (or directories containing PackageName.egg-info directories) are
already on sys.path. By default, the directories it searches are the ones
on sys.path. So, although dropping an .egg in site-packages does not make
it importable, it does make it accessible to 'require()' and the like, and
the dependency resolution system will add the desired .egg files to
sys.path upon request.
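A rough sketch of that search, written for a current Python 3. It only locates candidates; matching them against a requirement is left out. `find_distributions` and `activate` are invented names for this example, not the eventual API.

```python
import os
import sys
import tempfile


def find_distributions(search_path=None):
    """Yield (kind, path) pairs for eggs visible from the search path.

    kind is "egg" for a *.egg entry inside a directory, "egg-info" for a
    PackageName.egg-info directory, and "egg-on-path" for a .egg file
    that is itself listed on the path. Defaults to sys.path.
    """
    for entry in (sys.path if search_path is None else search_path):
        if entry.endswith(".egg") and os.path.isfile(entry):
            yield "egg-on-path", entry
            continue
        if not os.path.isdir(entry):
            continue
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if name.endswith(".egg"):
                yield "egg", full
            elif name.endswith(".egg-info") and os.path.isdir(full):
                yield "egg-info", full


def activate(egg_path, path=sys.path):
    """Make a found .egg importable by adding it to the given path list."""
    if egg_path not in path:
        path.append(egg_path)
```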
>Anyway ideas about how to apply setuptools to other people's packages?
Currently, my suggestion is to just edit their setup.py. If you need to do
it on an automated basis, perhaps you could use a patch file.
>Is it safe to do nasty monkeypatching wherein I replace distutils.setup
>with setuptools.setup? Or, more directly, anyone have code around to
>build eggs programmatically from other people's packages?
If distutils' run_setup() worked correctly, you could probably do this. Of
course, if you control your environment, you could also possibly just copy
bdist_egg.py into your Python installation's distutils/command
directory. I don't think bdist_egg has any dependencies on the rest of
setuptools that would prevent it from being used from regular distutils
So, try copying bdist_egg.py into distutils/command, and then run "setup.py
bdist_egg" and see if it works.