[Distutils] Fwd: use of '_' in package name causing version parsing issue?

Baiju M mbaiju at zeomega.com
Wed Mar 10 18:50:32 CET 2010


I was not subscribed to this list when I replied. So forwarding my own mail.

Baiju M

---------- Forwarded message ----------
From: Baiju M <mbaiju at zeomega.com>
Date: Wed, Mar 10, 2010 at 9:50 AM
Subject: Re: [Distutils] use of '_' in package name causing version
parsing issue?
To: "P.J. Eby" <pje at telecommunity.com>
Cc: Brad Allen <bradallen137 at gmail.com>, distutils-sig at python.org,
dparmar at zeomega.com, Ponnusamy A <aponnusamy at zeomega.com>,
akader at zeomega.com, Jim Fulton <jim at zope.com>

On Wed, Mar 10, 2010 at 3:54 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 03:03 PM 3/9/2010 -0600, Brad Allen wrote:
>> Today I was informed of an issue in which buildout (with the latest
>> setuptools) is not resolving version numbers properly, causing the
>> wrong package to be selected in some cases. The cause identified was
>> having '_' in the package name.
> I suspect there is a miscommunication or misunderstanding somewhere.  It is
> perfectly acceptable to have a '_' in a package name or project name.  This:
>> | >>> a="jiva_interface-2.3.6-py2.6.egg"
>> | >>> b="jiva_interface-2.3.8-py2.6.egg"
>> | >>> pkg_resources.parse_version(a)
> Is the wrong API to use to parse an egg filename, as parse_version() is for
> parsing a version that's already extracted from a filename.  This is the
> right API for extracting a version from a filename:
> >>> pkg_resources.Distribution.from_filename(a).version
> '2.3.6'
> >>> pkg_resources.Distribution.from_filename(b).version
> '2.3.8'
> >>> pkg_resources.Distribution.from_filename(c).version
> '0.1.1'
> >>> pkg_resources.Distribution.from_filename(d).version
> '0.1.2'
> And here's the correct one for extracting the parsed version from a
> filename:
> >>> pkg_resources.Distribution.from_filename(a).parsed_version
> ('00000002', '00000003', '00000006', '*final')
> >>> pkg_resources.Distribution.from_filename(b).parsed_version
> ('00000002', '00000003', '00000008', '*final')
> >>> pkg_resources.Distribution.from_filename(c).parsed_version
> ('00000000', '00000001', '00000001', '*final')
> >>> pkg_resources.Distribution.from_filename(d).parsed_version
> ('00000000', '00000001', '00000002', '*final')
> As you can see, these APIs work just fine, so the example given is a red
> herring, unless Buildout is using the APIs incorrectly (which I really doubt
> it is).
> Usually, the situation where people run into trouble with unusual package
> names or filenames is when they produce a source distribution manually, or
> by using something other than distutils/setuptools (that has different
> filename escaping rules), or when they manually rename a file before
> uploading, and expect it to still work the same.
> It would be a good idea for you to check which of these things (if any) is
> taking place, and provide details of the specific problem, with steps to
> reproduce it, since the example given probably has nothing to do with it.

I spent some time with the Buildout and setuptools code to identify the issue.
I will try to explain my findings.

1. Buildout relies on the pkg_resources.Requirement.parse function to
   get the "project_name" like this:


   I can see from the code of the `Requirement` class that the `__init__`
   method must not be called directly and that the `parse` function is
   recommended instead. Does this mean that we should not use the
   attributes of an instance of the `Requirement` class?  This is very
   important, as the `parse` function returns an instance of the
   `Requirement` class.

   So, if it is acceptable to use the "project_name" attribute, then
   Buildout can rely on it, right?

   Here is the beginning of the `Requirement` class:

   class Requirement:
       def __init__(self, project_name, specs, extras):
           """DO NOT CALL THIS UNDOCUMENTED METHOD; use Requirement.parse()!"""

2. This is the code which sets the "project_name" in the same `__init__` method:

       self.unsafe_name, project_name = project_name, safe_name(project_name)
       self.project_name, self.key = project_name, project_name.lower()

   I looked at the "safe_name" function:

   def safe_name(name):
       """Convert an arbitrary string to a standard distribution name

       Any runs of non-alphanumeric/. characters are replaced with a single '-'.
       """
       return re.sub('[^A-Za-z0-9.]+', '-', name)

  According to this code, this will be the result:
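
For illustration, here is a minimal sketch that copies the body of
"safe_name" quoted above into a standalone function, so that it runs
without setuptools installed. The inputs other than "jiva_interface"
are made-up examples, not names taken from the Buildout code.

```python
import re

def safe_name(name):
    """Copy of the setuptools helper quoted above: each run of characters
    other than letters, digits and '.' is replaced with a single '-'."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)

# '_' is not in the allowed character set, so it is rewritten to '-'.
print(safe_name("jiva_interface"))    # jiva-interface

# A run of several disallowed characters collapses into one '-'.
print(safe_name("jiva__interface"))   # jiva-interface

# Dots are in the allowed set and pass through unchanged.
print(safe_name("zc.buildout"))       # zc.buildout
```

So a requirement written with an underscore ends up with a hyphenated
"project_name", which no longer matches the raw string character for
character.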




   Is this behavior correct?

   If you think what setuptools is doing is fine, we will make changes
   in the Buildout code to use the "safe_name" function wherever it
   directly gets the "project_name".
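
A rough sketch of what such a change could look like, assuming the fix
is simply to normalise both names before comparing them. The helpers
"project_key" and "same_project" are hypothetical names used only for
this illustration; they are not part of Buildout or setuptools. The
normalisation mirrors how `Requirement.__init__` derives `key` in the
lines quoted above ("safe_name", then "lower()").

```python
import re

def safe_name(name):
    """Copy of the setuptools helper quoted earlier in this mail."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)

def project_key(name):
    """Hypothetical helper: normalise a raw name the way
    Requirement.__init__ derives `key` (safe_name, then lower())."""
    return safe_name(name).lower()

def same_project(a, b):
    """Hypothetical helper: compare two names after normalisation
    instead of comparing the raw strings."""
    return project_key(a) == project_key(b)

print(same_project("jiva_interface", "jiva-interface"))  # True
print(same_project("Jiva_Interface", "jiva-interface"))  # True
print(same_project("jiva_interface", "jiva.interface"))  # False
```

The last case stays distinct because '.' is in the allowed character
set and is therefore never rewritten.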

Baiju M
