[Catalog-sig] setuptools/distribute/easy_install/pkg_resource sorting algorithm

M.-A. Lemburg mal at egenix.com
Thu Mar 14 19:11:59 CET 2013

On 14.03.2013 17:39, PJ Eby wrote:
> On Thu, Mar 14, 2013 at 6:07 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 12.03.2013 22:26, PJ Eby wrote:
>>> On Tue, Mar 12, 2013 at 3:59 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>> On 12.03.2013 19:15, M.-A. Lemburg wrote:
>>>>> I've run into a weird issue with easy_install, that I'm trying to solve:
>>>>> If I place two files named
>>>>> egenix_mxodbc_connect_client-2.0.2-py2.6.egg
>>>>> egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip
>>>>> into the same directory and let easy_install running on Linux
>>>>> scan this, it considers the second file for Windows as best
>>>>> match.
>>>>> Is the algorithm used for determining the best match documented
>>>>> somewhere ?
>>>>> I've had a look at the implementation, but this left me rather
>>>>> clueless.
>>>>> I thought that setuptools would prefer the .egg file over
>>>>> the prebuilt .zip file - binary files being easier to install
>>>>> than "source" files.
>>>> After some experiments, I found that the following change
>>>> in filename (swapping platform and python version, in addition
>>>> to using '-' instead of '.') works:
>>>> egenix-mxodbc-connect-client-2.0.2-py2.6-win32.prebuilt.zip
>>>> OTOH, this one doesn't (notice the difference ?):
>>>> egenix-mxodbc-connect-client-2.0.2.py2.6-win32.prebuilt.zip
>>>> The logic behind all this looks rather fragile to me.
>>> easy_install only guarantees sane version parsing for distribution
>>> files built using setuptools' naming algorithms.  If you use
>>> distutils, it can only make guesses, because the distutils does not
>>> have a completely unambiguous file naming scheme.  And if you are
>>> naming the files by hand, God help you.  ;-)
>> The problem appears to be a bug in setuptools' package_index.py.
>> The function interpret_distro_name() creates a set of possible
>> separations of the found name into project name and version.
>> It does find the right separation, but for some reason, the
>> code using that function does not check the found project
>> names against the project name the user is trying to install,
>> but simply takes the last entry of the list returned by the
>> above function.
>> As a result, easy_install downloads and tries to install
>> project files that don't match the project name in some
>> cases.
>> Here's another example where it fails (say you're on an x64 Linux box):
>> # easy_install egenix-pyopenssl
>> As an example, say it finds these distribution files:
>>     'egenix-pyopenssl-',
>>     'egenix_pyopenssl-',
>>     'egenix-pyopenssl-',
>>     'egenix-pyopenssl-',
>> It then creates different interpretations of those names, puts
>> them in a list and sorts them. Here's the end of that list:
>> egenix-pyopenssl; <<-- this would be the correct .egg file
>> egenix-pyopenssl;
>> egenix-pyopenssl;
>> egenix-pyopenssl;
>> egenix-pyopenssl-; 10.5-x86-64-prebuilt
>> egenix-pyopenssl-; 10.5-x86-64-prebuilt
>> It picks the last entry, which would be for a project called
>> "egenix-pyopenssl-" - not the one
>> the user searched.
> Actually, that's not quite true.  It's picking:
> egenix-pyopenssl;
> Because it thinks that
> '' is a higher
> version than
> It does also record the possibility you mentioned, but it doesn't pick
> that one.  The project names actually *do* have to match.

Ah, ok, that makes sense then.
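
To make the splitting step concrete, here is a minimal sketch of how a
file name can be cut into (project, version) candidates at every '-'.
It only assumes the behaviour described above; candidate_splits() and
matching_candidates() are hypothetical helpers, not the setuptools code,
and no name normalization ('_' vs. '-') is attempted.

```python
def candidate_splits(basename):
    """Yield every (project, version) split of a '-'-separated basename."""
    parts = basename.split('-')
    for i in range(1, len(parts) + 1):
        yield '-'.join(parts[:i]), '-'.join(parts[i:])


def matching_candidates(basename, wanted):
    """Keep only the splits whose project name equals the requested one."""
    return [(project, version)
            for project, version in candidate_splits(basename)
            if project == wanted]


# The Windows file name from the start of this thread, minus ".zip".
name = 'egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt'
for project, version in candidate_splits(name):
    print(repr(project), repr(version))
```

With the project name fixed to 'egenix-mxodbc-connect-client', the only
remaining candidate carries '2.0.2.win32-py2.6.prebuilt' as its version,
which is why everything after the real version number ends up in the
version comparison.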

Is there any way to have "<something>" sort before
"" ? (e.g. like is done for release candidates)

Ideally, I'd like to get this to work without any changes
to setuptools, even though it would of course be better
not to take stuff after a Python version marker into account
when looking for a package version (since the Python marker
is actually a new component in the file name).
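
As a rough illustration of that idea, the following sketch cuts a
version string at the first component that looks like a Python or
platform marker. The marker pattern and the trim_version() helper are
assumptions made up for this example (only 'pyX.Y' and 'win32' are
covered); nothing like this exists in setuptools as far as this thread
shows.

```python
import re

# Hypothetical terminator pattern: a '.' or '-' followed by a Python
# version marker such as 'py2.6', or by the 'win32' platform tag.
_TERMINATOR = re.compile(r'[.-](?:py\d\.\d|win32)', re.IGNORECASE)


def trim_version(version):
    """Return the version with the first marker and everything after it removed."""
    match = _TERMINATOR.search(version)
    return version[:match.start()] if match else version


for candidate in ('2.0.2.win32-py2.6.prebuilt',
                  '2.0.2-py2.6-win32.prebuilt',
                  '2.0.2.py2.6-win32.prebuilt',
                  '2.0.2'):
    print(candidate, '->', trim_version(candidate))
```

All four inputs reduce to '2.0.2' under these assumptions, so the three
file name variants tried earlier in the thread would compare as the same
version.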

> If you open a ticket on the setuptools tracker, I'll try to see if I
> can get it to recognize that strings like py2.7, macosx, ucs, and the
> like are terminators for a version number.  I don't know how
> successful I'll be, though.  Basically, those zip files are (I assume)
> bdist_dumb distributions being taken for source distributions, and
> easy_install doesn't actually support bdist_dumb files at the moment.

If you could point me to that tracker, I'll open a ticket :-)

Marc-Andre Lemburg

