Unexpected VersionConflict

During easy_install of an egg where two versions of pyparsing were available (1.5.2 and 1.5.6), a VersionConflict was raised:

pkg_resources.VersionConflict: (pyparsing 1.5.6 (/usr/lib/python2.7/dist-packages), Requirement.parse('pyparsing==1.5.2'))

This was unexpected since sys.path (via virtualenv) has version 1.5.2 before 1.5.6. And the system gets 1.5.2 from 'import pyparsing', not 1.5.6.

I've traced this to the line calling _sort_dists(dists), line 801 in my copy of pkg_resources.py:

    def __getitem__(self,project_name):
        """Return a newest-to-oldest list of distributions for `project_name`
        """
        try:
            return self._cache[project_name]
        except KeyError:
            project_name = project_name.lower()
            if project_name not in self._distmap:
                return []

        if project_name not in self._cache:
            dists = self._cache[project_name] = self._distmap[project_name]
            _sort_dists(dists)

        return self._cache[project_name]

The problem is that one dependent package of the egg has a requirement of 'pyparsing' while a subsequent dependent package has a requirement of 'pyparsing==1.5.2'. The intent was that by using virtualenv with a correct sys.path, version 1.5.2 would be used for both requirements.

Unfortunately, because of the call to _sort_dists(), the 'pyparsing' requirement is resolved to 1.5.6 by env.best_match() in WorkingSet.resolve(). Once that resolution is made, the more explicit requirement fails.

Note that without the _sort_dists() call the egg loads and runs correctly, using pyparsing 1.5.2.

It's not clear to me that removing the _sort_dists() call is correct in general, but it appears to be a bug that an egg which would load and run correctly reports a VersionConflict.
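The effect of the newest-to-oldest sort can be illustrated with a small toy model. This is only a sketch and not the pkg_resources implementation: the tuple layout, the helper names (candidates, best_match here is a hypothetical stand-in, not the real Environment.best_match()), and the "virtualenv site-packages" location label are all assumptions made for illustration.

```python
# Toy model only: versions are dotted strings, "path order" is list order.
def parse_version(text):
    """Turn '1.5.6' into (1, 5, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

# Distributions in the order they were discovered on sys.path:
# the virtualenv copy first, the system copy second.
# The first location is a hypothetical label, not a real path.
found_in_path_order = [
    ("pyparsing", "1.5.2", "virtualenv site-packages"),
    ("pyparsing", "1.5.6", "/usr/lib/python2.7/dist-packages"),
]

def candidates(dists, newest_first):
    """Return candidates, optionally sorted newest-to-oldest."""
    if newest_first:
        return sorted(dists, key=lambda d: parse_version(d[1]), reverse=True)
    return list(dists)

def best_match(dists, pinned=None, newest_first=True):
    """Return the first candidate that satisfies an optional exact pin."""
    for dist in candidates(dists, newest_first):
        if pinned is None or dist[1] == pinned:
            return dist
    return None

# An unpinned 'pyparsing' requirement picks a different distribution
# depending on whether the candidate list is sorted first.
with_sort = best_match(found_in_path_order, newest_first=True)
without_sort = best_match(found_in_path_order, newest_first=False)
print(with_sort[1], without_sort[1])
```

In this model the sorted lookup selects 1.5.6 for the unpinned requirement even though 1.5.2 comes first in path order, which matches the behaviour described above.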

On Thu, Aug 8, 2013 at 3:19 PM, Townsend, Scott E. (GRC-RTM0)[Vantage Partners, LLC] <scott.e.townsend@nasa.gov> wrote:
During easy_install of an egg where two versions of pyparsing were available (1.5.2 and 1.5.6), a VersionConflict was raised:
pkg_resources.VersionConflict: (pyparsing 1.5.6 (/usr/lib/python2.7/dist-packages), Requirement.parse('pyparsing==1.5.2'))
This was unexpected since sys.path (via virtualenv) has version 1.5.2 before 1.5.6. And the system gets 1.5.2 from 'import pyparsing', not 1.5.6.
Have you tried declaring the 1.5.2 dependency from your main project? IIRC, that should make it take precedence over either of the indirect dependencies.

That does indeed fix this problem, but requiring an egg writer to interrogate all dependent packages (and their dependent packages...) and then hoist the dependencies up won't be robust if those dependent packages change their requirements between the time the egg is written and the time it's loaded.

It seems to me that if a requirement has no version specified, then it shouldn't have a way to cause a VersionConflict. One possible way of implementing this would be to have resolve() only check that a distribution exists if no version is specified, do not update 'best'. 'to_activate' would need to be updated with 'generic' distributions only if a requirement with a version specifier hadn't been seen.

On 8/8/13 7:44 PM, "PJ Eby" <pje@telecommunity.com> wrote:
On Thu, Aug 8, 2013 at 3:19 PM, Townsend, Scott E. (GRC-RTM0)[Vantage Partners, LLC] <scott.e.townsend@nasa.gov> wrote:
During easy_install of an egg where two versions of pyparsing were available (1.5.2 and 1.5.6), a VersionConflict was raised:
pkg_resources.VersionConflict: (pyparsing 1.5.6 (/usr/lib/python2.7/dist-packages), Requirement.parse('pyparsing==1.5.2'))
This was unexpected since sys.path (via virtualenv) has version 1.5.2 before 1.5.6. And the system gets 1.5.2 from 'import pyparsing', not 1.5.6.
Have you tried declaring the 1.5.2 dependency from your main project? IIRC, that should make it take precedence over either of the indirect dependencies.

On Fri, Aug 9, 2013 at 9:04 AM, Townsend, Scott E. (GRC-RTM0)[Vantage Partners, LLC] <scott.e.townsend@nasa.gov> wrote:
That does indeed fix this problem, but requiring an egg writer to interrogate all dependent packages (and their dependent packages...) and then hoist the dependencies up won't be robust if those dependent packages change their requirements between the time the egg is written and the time it's loaded.
That's why it's generally left up to the application installer/integrator to address these sorts of conflicts, and why it's usually a bad idea for anybody to be requiring exact versions. (I'd suggest asking your dependency to not specify exact point releases, too.)

There is one other possibility, though: have you tried reversing the list of your project's dependencies so that the more-specific project's dependencies are processed first? (i.e., so that 1.5.2 will be selected as "best" before the non-version-specific one is used) That might fix it without requiring you to pin a version yourself.
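Why processing order matters for a greedy matcher can be shown with a minimal sketch. This is a simplified model, not the actual WorkingSet.resolve() code: the VersionConflict class below is a local stand-in, and AVAILABLE, satisfies, and greedy_resolve are hypothetical names introduced only for this example.

```python
class VersionConflict(Exception):
    """Local stand-in for pkg_resources.VersionConflict in this toy model."""

# Candidate versions per project, already sorted newest first.
AVAILABLE = {"pyparsing": ["1.5.6", "1.5.2"]}

def satisfies(version, pin):
    """An unpinned requirement (pin is None) accepts any version."""
    return pin is None or version == pin

def greedy_resolve(requirements):
    """Resolve (project, pin-or-None) pairs strictly in the order given.

    The first requirement seen for a project fixes its version; later
    requirements are only checked against that choice.
    """
    chosen = {}
    for project, pin in requirements:
        if project in chosen:
            if not satisfies(chosen[project], pin):
                raise VersionConflict((project, chosen[project], pin))
            continue
        for version in AVAILABLE.get(project, []):
            if satisfies(version, pin):
                chosen[project] = version
                break
        else:
            raise LookupError(f"no distribution found for {project}")
    return chosen

# Pinned requirement first: 1.5.2 is chosen and the unpinned one accepts it.
pinned_first = greedy_resolve([("pyparsing", "1.5.2"), ("pyparsing", None)])

# Unpinned requirement first: 1.5.6 is chosen, then the pin cannot be met.
try:
    greedy_resolve([("pyparsing", None), ("pyparsing", "1.5.2")])
    unpinned_first_conflicts = False
except VersionConflict:
    unpinned_first_conflicts = True

print(pinned_first, unpinned_first_conflicts)
```

The same two requirements succeed or fail purely based on which one is seen first, which is the order dependence the suggestion above relies on.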
It seems to me that if a requirement has no version specified, then it shouldn't have a way to cause a VersionConflict. One possible way of implementing this would be to have resolve() only check that a distribution exists if no version is specified, do not update 'best'. 'to_activate' would need to be updated with 'generic' distributions only if a requirement with a version specifier hadn't been seen.
Thing is, the complete lack of a version requirement is pretty rare, AFAIK, and so is the exact version match that's causing your problem. The combination existing on the same library is therefore that much rarer, so such a change would just be something of a complex kludge that wouldn't improve any other use cases.

Probably a better way would be to change the version resolution algorithm to be less "greedy", and simply rule out unacceptable versions as the process goes along, then picking the most recent versions left when everything necessary has been eliminated. (Ideally, such an algorithm would still track which distributions had the conflicting requirements, though.)

That would be a pretty significant change, but potentially worth someone investigating. There are some big downsides, however:

* It's not really a suitable algorithm for installation tools that don't have access to a universal dependency graph, because they can't tell what the next level of dependencies will be

* Recursion causes a combinatorial explosion, because what if you select a different version and it has different dependencies (recursively)? Now you need backtracking, and there's a possibility that the algorithm will take a ridiculous amount of time to still conclude that there's nothing you can do about the conflict.

These drawbacks are basically why I just wrote a simple greedy match algorithm in the first place, figuring it could always be improved later if it turned out to be needed in practice. There have been occasional comments over the last decade or so by people with ideas for better algorithms, but no actual code yet, as far as I know.
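A rough sketch of the "rule out, then pick newest" idea might look like the following. It deliberately sidesteps both downsides listed above by assuming a flat, fully known list of requirements whose dependencies do not vary by version, so there is no recursion or backtracking. The function name resolve_by_elimination and the requirer names pkg_a, pkg_b, and pkg_c are hypothetical.

```python
def parse_version(text):
    """Turn '1.5.6' into (1, 5, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def resolve_by_elimination(available, requirements):
    """Eliminate unacceptable versions first, then pick the newest survivor.

    available:    {project: [version, ...]}
    requirements: list of (requirer, project, accepts), where accepts is a
                  predicate taking a version string.
    """
    remaining = {project: set(vs) for project, vs in available.items()}
    requirers = {}  # project -> who constrained it, for error reporting
    for requirer, project, accepts in requirements:
        requirers.setdefault(project, []).append(requirer)
        survivors = {v for v in remaining.get(project, set()) if accepts(v)}
        if not survivors:
            raise ValueError(
                f"conflict on {project}; requirements from: {requirers[project]}"
            )
        remaining[project] = survivors
    # Only now choose: the most recent version that nobody ruled out.
    return {p: max(remaining[p], key=parse_version) for p in requirers}

available = {"pyparsing": ["1.5.2", "1.5.6"]}
requirements = [
    ("pkg_a", "pyparsing", lambda v: True),          # plain 'pyparsing'
    ("pkg_b", "pyparsing", lambda v: v == "1.5.2"),  # 'pyparsing==1.5.2'
]
result = resolve_by_elimination(available, requirements)
print(result)
```

Because no version is committed to until every requirement has been applied, the unpinned requirement cannot cause a conflict in this model, and the order of the requirements no longer matters. A real resolver would still have to deal with the recursion and missing-dependency-graph problems described above.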

On 10 August 2013 04:06, PJ Eby <pje@telecommunity.com> wrote:
Probably a better way would be to change the version resolution algorithm to be less "greedy", and simply rule out unacceptable versions as the process goes along, then picking the most recent versions left when everything necessary has been eliminated. (Ideally, such an algorithm would still track which distributions had the conflicting requirements, though.)
The part I find most surprising is the fact that pkg_resources ignores sys.path order entirely when choosing between multiple acceptable versions.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Aug 9, 2013 at 5:41 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 10 August 2013 04:06, PJ Eby <pje@telecommunity.com> wrote:
Probably a better way would be to change the version resolution algorithm to be less "greedy", and simply rule out unacceptable versions as the process goes along, then picking the most recent versions left when everything necessary has been eliminated. (Ideally, such an algorithm would still track which distributions had the conflicting requirements, though.)
The part I find most surprising is the fact that pkg_resources ignores sys.path order entirely when choosing between multiple acceptable versions.
Technically, it doesn't ignore it: if a distribution is listed in sys.path, it takes precedence over any distribution listed later, or that has to be found *in* a directory on sys.path, and will in fact cause a VersionConflict if you ask for a version spec that it doesn't match.

However, where the distributions aren't listed in sys.path, but merely *found in a directory on sys.path*, then sys.path has no bearing. It would make things a lot more complicated, and not just in an "implementation is hard to explain" kind of way.

(In principle, you could write an Environment subclass that had a different precedence, but I'm not sure what benefit you would gain from the added complexity. The core version resolution algorithm wouldn't be affected, though, since it delegates the "find me something I haven't already got on sys.path" operation to an Environment instance's best_match() method.)
participants (3)
- Nick Coghlan
- PJ Eby
- Townsend, Scott E. (GRC-RTM0)[Vantage Partners, LLC]