[Distutils] name of the dependency problem

Justin Cappos jcappos at nyu.edu
Wed Apr 15 17:52:40 CEST 2015


I guess I should provide the code for what I've done in this space also.
Yes, the problem is NP-hard, so in both cases the system may run really
slowly.  In practice, though, you don't tend to have many conflicts to
resolve during a package install.  So I'd argue that optimizing for the
case where you have a huge number of conflicts doesn't really matter.

I've put the code for Stork up at: https://github.com/JustinCappos/stork
 It almost certainly still runs, but hasn't been used in production for
about 4 years.

The code which does the backtracking dependency resolution is in the
satisfy function here:
https://github.com/JustinCappos/stork/blob/master/python/storkdependency.py#L1004

(FYI: Stork was the very first Python program I wrote after completing the
Python tutorial, so apologies for the style and mess.  I can clean up the
code if needed, but I don't imagine Stork's code would be useful as more
than a reference.)

However, I'd just like to reiterate that it would be good to check that pip
really needs a solution to automatically resolve conflicts, whether with a
SAT solver or backtracking.  (Stork did something atypical with the way
trust and conflicts were specified, so I think it made more sense for our
user base.)  Perhaps it's worth measuring the scope and severity of the
problem first.

In the meantime, is someone planning to work on a patch to fix the conflict
detection issue in pip?

Thanks,
Justin


On Wed, Apr 15, 2015 at 11:15 AM, David Cournapeau <cournape at gmail.com>
wrote:

>
>
> On Wed, Apr 15, 2015 at 9:34 AM, Trishank Karthik Kuppusamy <tk47 at nyu.edu>
> wrote:
>
>> On 4/15/15 9:28 AM, Justin Cappos wrote:
>>
>>> Yes, it's another way to solve the problem.  Both backtracking
>>> dependency resolution and ZYpp will always find a solution.  The tradeoff
>>> is really in how they function.  ZYpp is faster if there are a lot of
>>> dependencies that conflict.  With the backtracking dependency resolution
>>> used in Stork, it is much easier for the user to understand why it chose
>>> what it did.
>>>
>>> An aside: I'm not necessarily convinced that you need to solve this
>>> problem automatically, instead of just raising an error when it occurs.  It
>>> should be quite rare in practice and as such may not be worth the
>>> complexity to have an automatic solution for the problem.
>>>
>>>
>> ZYpp seems to assume that dependency resolution is an NP-complete problem
>> (http://www.watzmann.net/blog/2005/11/package-installation-is-np-complete.html).
>>
>> I agree that we need not solve the problem just yet. It may be worthwhile
>> to inspect packages on PyPI to see which packages have unsatisfiable
>> dependencies, but I am led to understand that this is difficult to do
>> because most package metadata lives in setup.py (i.e., it is only
>> available at runtime).
>>
>
> This is indeed the case. If you want to solve dependencies in a way that
> works well, you want an index that describes all your available package
> versions.
>
> While solving dependencies is indeed NP-complete, it can be fairly fast
> in practice because of various particularities: each rule generally
> involves only a few variables, and the rules have a specific form that
> allows for a more efficient representation (e.g. "at most one version of
> each package"). In my experience, it is no more difficult than using
> graph-based algorithms.
>
> FWIW, at Enthought, we are working on a pure-Python SAT solver for
> resolving dependencies, to solve some of those issues. I am actually
> hacking on it right now at PyCon. We hope to have a first working version
> by the end of Q2, at which point it will be OSS and reintegrated into my
> older project depsolver (https://github.com/enthought/depsolver).
>
> David
>
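The rule shapes David describes can be sketched as CNF clauses over Boolean
per-version variables.  This is a minimal illustration only -- the package
names, versions, and the tiny brute-force check are all invented for the
example; it is not depsolver's actual encoding, and a real solver would use
proper SAT techniques rather than enumeration:

```python
# Sketch of how dependency rules map onto SAT clauses: implications
# ("installing X requires Y") and at-most-one groups ("only one version
# of each package").  A brute-force model search stands in for a real
# SAT solver -- fine for 5 variables, hopeless for PyPI.

from itertools import combinations, product

# One Boolean variable per (package, version).
variables = ["a-1.0", "a-2.0", "b-1.0", "c-1.0", "c-2.0"]

clauses = []  # CNF: each clause is a list of (variable, polarity) literals

# At most one version of each package: for every pair of versions,
# at least one must be false.
for group in (["a-1.0", "a-2.0"], ["c-1.0", "c-2.0"]):
    for x, y in combinations(group, 2):
        clauses.append([(x, False), (y, False)])

# Dependencies as implications (p -> q  is  NOT p OR q).
clauses.append([("a-2.0", False), ("c-1.0", True)])   # a-2.0 needs c-1.0
clauses.append([("a-1.0", False), ("c-2.0", True)])   # a-1.0 needs c-2.0
clauses.append([("b-1.0", False), ("c-2.0", True)])   # b-1.0 needs c-2.0

# The user asked to install some version of a and some version of b.
clauses.append([("a-1.0", True), ("a-2.0", True)])
clauses.append([("b-1.0", True)])

def satisfiable(clauses, variables):
    """Return a model satisfying every clause, or None."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[v] == want for v, want in clause)
               for clause in clauses):
            return model
    return None

model = satisfiable(clauses, variables)
print(sorted(v for v, on in model.items() if on))
# ['a-1.0', 'b-1.0', 'c-2.0']  -- the solver backs off a-2.0 because
# its c-1.0 requirement conflicts with b-1.0's need for c-2.0.
```

The small clause sizes and the regular at-most-one structure are exactly
what lets real solvers stay fast on inputs that are NP-complete in theory.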


More information about the Distutils-SIG mailing list