[Distutils] Installing dependencies

Sun Jun 26 00:04:54 CEST 2005

I don't quite know yet how EasyInstall should handle dependencies.  I do 
know that I want it to use the metadata from a built egg to determine those 
dependencies, and that it should also include dependencies for any "extras" 
(see my last terminology post) that were requested on the EasyInstall 
command line.
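
As a rough sketch of that idea, here is one way the requirement list could be assembled once the egg's metadata has been read. The function name, the argument names, and the shape of the metadata (a list of base requirements plus a mapping from each extra to its own requirements) are assumptions made for illustration, not EasyInstall's actual interface:

```python
def requirements_for(base_requires, extras_map, requested_extras=()):
    """Return the base requirements plus those of each requested extra.

    Hypothetical helper: 'base_requires' is a list of requirement strings,
    'extras_map' maps an extra's name to its list of requirement strings.
    """
    needed = list(base_requires)
    for extra in requested_extras:
        if extra not in extras_map:
            raise KeyError("unknown extra: %r" % (extra,))
        for req in extras_map[extra]:
            # Avoid listing the same requirement twice.
            if req not in needed:
                needed.append(req)
    return needed


# Illustrative metadata only.
metadata_requires = ["SomePackage>=1.2", "OtherPackage"]
metadata_extras = {
    "reST": ["docutils>=0.3.5", "restEdit>=0.3"],
    "XML": ["xmlplus"],
}
print(requirements_for(metadata_requires, metadata_extras, ["reST"]))
```

Asking for an extra the distribution doesn't define raises an error here; whether that should be fatal or merely a warning is an open choice.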

I also know that these dependencies can't really be handled by pretending 
they were included on the original command line, because that implies e.g. 
copying the package to the installation directory, and possibly munging 
.pth files.  For example, if you already have one of the dependencies 
installed in --multi-version mode, but are installing a new package in 
single-version, you don't necessarily want to switch the existing package 
to single-version, do you?

Then again, maybe you do.  After all, if you're installing in 
single-version, you probably want stuff to "just work", and resetting the 
current version of the dependency would be the right thing in that 
case.  EasyInstall should probably just be louder about the fact that it's 
changing the currently-active version.  Maybe a post-installation report 
about what versions of relevant packages are active (with "before and after" 
version information) would address this issue.
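
A minimal sketch of such a report, assuming the active versions can be captured as plain name-to-version dictionaries before and after the installation (the function name and the data shape are invented for illustration):

```python
def activation_report(before, after):
    """List packages whose active version changed during an install.

    'before' and 'after' are hypothetical snapshots mapping a package
    name to its active version string (a missing key means "none active").
    """
    lines = []
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            lines.append("%s: %s -> %s" % (name, old or "(none)", new or "(none)"))
    return lines


for line in activation_report({"docutils": "0.3.5"}, {"docutils": "0.3.9"}):
    print(line)
```

Packages whose active version did not change are left out, so the report stays short enough to be read.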

Okay, what about the reverse scenario?  You're installing --multi-version, 
but one of the dependencies is already single-version?  That really needs 
to be left alone.  And what if there's a conflicting requirement with an 
existing single-version install?  Changing the active version might break 
something else, so we'd need to spit out a warning, explaining how to fix 
it (by changing the active version of the dependencies).
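
The cases above can be summarized as a small decision table. This is only one reading of the scenarios discussed so far; the function name, the mode strings, and the action names are all made up for the sketch:

```python
def plan_dependency(install_mode, existing_mode, satisfied):
    """Decide what to do about one dependency (hypothetical policy).

    install_mode:  "single" or "multi", the mode of the current install
    existing_mode: "single", "multi", or None if nothing is installed
    satisfied:     True if the installed copy meets the requirement
    """
    if existing_mode is None:
        return "install"
    if install_mode == "single":
        # "Just work": make a suitable version current, and say so loudly.
        return "activate-and-report" if satisfied else "install-and-report"
    # Installing in multi-version mode from here on.
    if existing_mode == "single":
        # Never touch an active single-version install; warn on conflict.
        return "leave-alone" if satisfied else "warn-conflict"
    return "leave-alone" if satisfied else "install"


print(plan_dependency("single", "multi", True))
print(plan_dependency("multi", "single", False))
```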

Finally, EasyInstall currently always ensures there is a copy of the 
desired distribution in the installation directory.  So, if you are 
installing somewhere other than site-packages, but your dependencies can be 
met using distributions in site-packages, should we still copy the 
distributions?  I think it's always a safe default to do the copying, 
because disk space is cheap, and it makes the installation more independent 
of the Python installation and environment used to create it.

However, I can also see that sometimes you might want to avoid this, but 
I'm not really sure what the option should be called. --no-copy, 
perhaps?  (meaning, don't copy distributions found on sys.path to the 
--install-dir.)
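
In code, the copy decision might look something like the following. It assumes the distribution has already been located on sys.path, and that a hypothetical 'no_copy' flag carries the proposed option; neither name exists in EasyInstall today:

```python
import os


def should_copy(dist_location, install_dir, no_copy=False):
    """Return True if a distribution found on sys.path should be copied.

    'dist_location' is the path of the egg that was found; 'install_dir'
    is the target directory. Both names are illustrative.
    """
    dist_dir = os.path.dirname(os.path.abspath(dist_location))
    target = os.path.abspath(install_dir)
    if os.path.normcase(dist_dir) == os.path.normcase(target):
        return False  # already in the installation directory
    # Copying is the safe default; the flag opts out of it.
    return not no_copy


print(should_copy("/usr/lib/site-packages/Foo-1.0.egg", "/home/me/lib"))
```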

On a related note, it seems to me that setuptools can and maybe should 
change its 'install' command to actually use EasyInstall, so that running 
'setup.py install' for a package using setuptools does the same thing as 
using EasyInstall to install it.

This might force some people to use 'require()' who otherwise would not 
have, but I'm not sure how much of a problem that is in practice.  Really, 
if Python 2.5 ends up supporting .pth files in any sys.path directory, the 
problem will go away because EasyInstall could avoid forcing 
--multi-version when a custom installation directory is used.

Anyway, aside from the easy package management and other features of using 
easy_install as the 'install' command, it would also enable the traditional 
'setup.py install' to include the distributed package's dependencies as 
well.  This means that we could finally make it easy for people to 
distribute packages with external dependencies, because the only thing 
they'll have to include is the 'ez_setup.py' bootstrap script alongside 
their setup.py.

The principal problem I see with this is that some distutils commands 
internally access the 'install' command in order to get information or to 
perform a pseudo-installation for creating binary distributions.  So, there 
might be some interesting technical hurdles in making 'install' do one 
thing when run from the command line, but something else when called 
internally from a 'bdist' command.  I'll have to investigate that a bit.

It would also be nice if you could specify your dependencies as part of the 
'setup()' script, but it looks like some people are trying to use the 
'requires' keyword for PEP 314's non-semantic metadata.  Personally I think 
PEP 314 is going in the wrong direction at the moment, because it doesn't 
specify any semantics for the metadata.  Having more fields for people to 
fill out with data we can't use isn't very helpful.  And, at the same time, 
the PEP 314 spec is too strict about how versions are defined, and doesn't 
support "extra" (optional) requirements.  I should probably make a concrete 
proposal for revising PEP 314 to:

1. Define semantics for "Requires" strings (PyPI project names)
2. Suggest removing "Provides", in favor of distributing only a single 
"provided" thing per distribution at the present time.
3. Suggest renaming "Obsoletes" to "Conflicts"
4. Define a package version syntax based on setuptools' rules
5. Allow Download-URL to be a link to an HTML page containing direct links 
to downloadable distributions, so that PyPI and SourceForge download pages 
are acceptable (note that PEP 314 doesn't currently let you publish links 
to binary distributions, only source).

What I'd like to do with setuptools is allow you to use keywords like the 
following, in order to define your distribution's required dependencies and 
any optional extra features that incur additional dependencies:

setup(
     ...
     requires = ["SomePackage>=1.2", "OtherPackage"],
     extras = dict(
         reST = ["docutils>=0.3.5", "restEdit>=0.3"],
         XML = ["xmlplus"],
         ...
     ),
     ...
)

But that won't work too well if the 'requires' keyword conflicts with what 
PEP 314 is after, and with what the current Python CVS trunk implements.
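
To make the intended semantics of strings like "docutils>=0.3.5" concrete, here is a deliberately simplified sketch of parsing and checking one. It handles only a single comparison and purely numeric dotted versions, which is an assumption of the sketch and much narrower than setuptools' own version rules; the function names are invented:

```python
import operator
import re

# name, then an optional comparison operator and numeric dotted version
_REQ = re.compile(
    r"^\s*([A-Za-z0-9_.-]+)\s*(?:(>=|<=|==|!=|>|<)\s*([0-9]+(?:\.[0-9]+)*))?\s*$"
)
_OPS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}


def parse_requirement(text):
    """Split 'Name>=1.2' into (name, op, version); op/version may be None."""
    match = _REQ.match(text)
    if match is None:
        raise ValueError("can't parse requirement: %r" % (text,))
    return match.groups()


def version_key(version):
    """Turn '0.3.5' into (0, 3, 5) for comparison (numeric parts only)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed_version, requirement):
    """True if 'installed_version' meets the parsed requirement."""
    _name, op, wanted = parse_requirement(requirement)
    if op is None:
        return True  # any version will do
    return _OPS[op](version_key(installed_version), version_key(wanted))


print(parse_requirement("SomePackage>=1.2"))
print(satisfies("0.3.10", "docutils>=0.3.5"))
```

Note that the tuple comparison treats "1.2" and "1.2.0" as different versions, which is one of the details a real version syntax would have to pin down.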