On Tue, Nov 13, 2012 at 4:21 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Daniel Holth <dholth <at> gmail.com> writes:

> David, did you mention a paper about advanced dependency resolution
> algorithms? I don't know how dependency resolution should work. I only claim
> that the very popular distribute needs to provide setuptools to work; right now
> it does that with a hack by including a setuptools .egg-info directory.

Do you mean pip? distribute is an incarnation of setuptools, after all.

I meant distribute.

If distribute did not "Provides-Dist" setuptools (it does so in its own way, because no one has actually implemented Provides-Dist), then every package that depends on setuptools would try to install setuptools in addition to distribute.
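
To make the point above concrete, here is a minimal sketch of the check an installer could run before fetching anything. The function name and the shape of the `installed` mapping are hypothetical; no existing tool is claimed to work this way.

```python
def missing_requirements(requires, installed):
    """Return the requirements not satisfied by what is already installed.

    requires:  list of required distribution names
    installed: dict mapping an installed distribution name to the list of
               extra names it declares via Provides-Dist (hypothetical shape)
    """
    available = set()
    for name, provides in installed.items():
        available.add(name)          # a distribution always provides itself
        available.update(provides)   # plus anything it claims to provide
    return [req for req in requires if req not in available]


# With the provides declaration, nothing extra needs installing.
with_provides = missing_requirements(["setuptools"], {"distribute": ["setuptools"]})

# Without it, the installer would pull in setuptools next to distribute.
without_provides = missing_requirements(["setuptools"], {"distribute": []})
```

Version constraints are ignored here; a real check would have to compare the provided version as well.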
 
> I don't expect provides-dist to be a very widely used feature at all. As for
> Provides-Dist you should just index that field locally and the remote package
> index should let you search by provides instead of by the package name (in
> that index the package name is one of the provides values). You are searching
> the entire metadata, it's just already indexed so it's efficient.
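
The local index described in the quoted paragraph could look roughly like the sketch below. The input format (pairs of a name and its Provides-Dist values) and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def build_provides_index(distributions):
    """Map every provided name to the distributions that provide it.

    distributions: iterable of (name, provides) pairs, where provides is the
                   list of Provides-Dist values (hypothetical input shape)
    """
    index = defaultdict(list)
    for name, provides in distributions:
        # The distribution's own name is treated as one of its provides values.
        for provided in [name] + list(provides):
            if name not in index[provided]:
                index[provided].append(name)
    return dict(index)


index = build_provides_index([
    ("setuptools", []),
    ("distribute", ["setuptools"]),
    ("Pillow", ["PIL", "Imaging"]),
])
```

Looking up `index["setuptools"]` then returns every candidate in one step, which is the "already indexed" search the paragraph refers to.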

So if several different distributions on PyPI say that they "provide"
"Foo (1.0)", which one are you supposed to pick? The question of efficiency
isn't the main concern here - it's "what do you do with the (no doubt
efficiently-returned) results?"

You would ask the user or prefer the package that is actually named Foo by default.
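
A minimal sketch of that selection rule, assuming the candidates have already been looked up by provided name. `choose_provider` and the `ask` callback are hypothetical names, not part of any existing installer.

```python
def choose_provider(requirement, candidates, ask=None):
    """Pick one distribution out of those that provide `requirement`.

    Prefers the distribution actually named like the requirement; otherwise
    takes the only candidate, or defers to the `ask` callback (hypothetical:
    called as ask(requirement, candidates) and expected to return one name).
    """
    if not candidates:
        raise LookupError("nothing provides %r" % requirement)
    for candidate in candidates:
        if candidate.lower() == requirement.lower():
            return candidate          # the package that is really named Foo
    if len(candidates) == 1:
        return candidates[0]          # no ambiguity to resolve
    if ask is None:
        raise LookupError("ambiguous providers for %r: %r"
                          % (requirement, candidates))
    return ask(requirement, candidates)


by_name = choose_provider("setuptools", ["distribute", "setuptools"])
only_one = choose_provider("PIL", ["Pillow"])
asked = choose_provider("Foo", ["Bar", "Baz"], ask=lambda req, cands: cands[-1])
```

Raising on an unresolved ambiguity instead of guessing is a design choice of this sketch; an interactive tool would pass an `ask` callback that prompts the user.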
 
> In Python/pypi, which is mostly libraries and not applications like in Debian,
> a fork would be the normal use case for provides-dist. If the plugin systems
> were more widely used then you might have more non-fork provides-dist lines,
> for example if trac required at least one revision control backend.

Linux distros, Debian included, ship lots of libraries too. I get most of the
commonly-used libraries (like setuptools, PIL) through the distro package
manager.

Forks don't need a *multi-valued* Provides field. And ISTM in the comment about
Trac, you are referring to what pip calls bundles - which, IIUC, is a
deprecated feature of pip. With a good repository infrastructure and packaging
tools, I'm not sure why bundles would be needed.

Often, packages which bundle other packages don't advertise those other
packages as being provided (and rightly so, in my view). For example, Django
includes simplejson, six etc. but that doesn't need to be exposed at the
PyPI level. It can be considered as implementation detail. The bundled packages
could even diverge from their non-bundled counterparts to better serve the
needs of the "main" package.

That's usually called vendorizing. The vendorized packages are imported under a different name, e.g. "from django.utils import six". The package index and installer don't need to know.

With trac, a Python app with a good plugin system, I meant it might need at least one package that provides a virtual package "trac-revision-control-backend". Of course trac doesn't actually do this, it supports SVN out of the box and supports other RCS with plugins.

Pillow is a great example. It might provide both PIL and Imaging, and of course it provides its own name, Pillow. If you allow Provides-Dist then the package already has two names. Once you have 2, all the complexity is already there, so you might as well allow n, based on the 0, 1 or infinity rule: http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.html

a cavallo says:

Maybe a sat solver is what you're looking for... libzypp (suse) does implement
that for this purpose. A Python module is under the ROX project.