[Distutils] use of '_' in package name causing version parsing issue?

P.J. Eby pje at telecommunity.com
Thu Mar 11 16:43:02 CET 2010


At 12:38 PM 3/11/2010 +0530, Baiju M wrote:
>On Thu, Mar 11, 2010 at 11:05 AM, Baiju M <mbaiju at zeomega.com> wrote:
> > If "_" is a valid project_name identifier, why is it replaced with "-" ?

In order to have a canonicalized name form which can be escaped in 
filenames for unambiguous identification of an egg's project and version.

Egg filenames use '-' as a separator between name, version, python 
version, and platform.  A '-' in any of these components is escaped 
as '_', so that the '-' remains a viable and unambiguous 
separator.  This means that '_' gets turned back into a '-' when 
unescaped, so the mapping between '_' and '-' is part of the 
safe_name canonical form.
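
As a rough sketch of that escaping scheme (a simplification, not the 
actual pkg_resources code; the helper names below are made up for 
illustration, and the platform component is left out):

```python
# Simplified sketch; helper names are illustrative, not the pkg_resources API.

def escape(component):
    """Escape '-' as '_' so that '-' is free to act as the separator."""
    return component.replace('-', '_')

def unescape(component):
    """Reverse the escaping: every '_' comes back as '-'."""
    return component.replace('_', '-')

def egg_basename(name, version, py_version):
    """Join the escaped components with '-' (platform omitted for brevity)."""
    return '-'.join([escape(name), escape(version), 'py' + py_version])

def split_egg_basename(basename):
    """Split on '-' and unescape; safe because no escaped component has a '-'."""
    name, version, py_tag = basename.split('-')
    return unescape(name), unescape(version), py_tag[2:]

print(egg_basename('my-package', '1.0', '2.6'))
print(split_egg_basename('my_package-1.0-py2.6'))
```

Note that a name written as 'my_package' survives escaping unchanged but 
comes back from unescaping as 'my-package', which is why the two 
spellings have to be treated as the same project.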


>There are nearly 300 packages in PyPI with "_" in the package name.
>For all the packages built using Setuptools, the "_" in the "Name" field
>of the PKG-INFO file is replaced with "-".
>
>I checked some of the packages built with "distutils.core" [1].
>Distutils does not replace "_" in the "Name" field of the PKG-INFO file
>with "-".
>
>Why is Setuptools behaving differently from Distutils ?

Because distutils wasn't built in a world where package names needed 
to be uniquely and unambiguously machine-parseable from 
filenames.  The code that easy_install has for dealing with 
distutils-named source distributions has to guess at possible 
interpretations of those filenames, because distutils filenames don't 
distinguish between a '-' in a name or version, and a '-' *between* 
names and versions.
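
To illustrate the ambiguity (this is only a sketch of the problem, not 
easy_install's actual guessing code; candidate_splits is a hypothetical 
helper), every '-' in a distutils-style basename is a possible boundary 
between name and version:

```python
def candidate_splits(basename):
    """Return every (name, version) reading of a '-'-joined basename."""
    parts = basename.split('-')
    return [('-'.join(parts[:i]), '-'.join(parts[i:]))
            for i in range(1, len(parts))]

# 'foo-bar-1.0' could be project 'foo' at version 'bar-1.0',
# or project 'foo-bar' at version '1.0'.
for name, version in candidate_splits('foo-bar-1.0'):
    print(name, version)
```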

Ultimately, the simplest way to deal with this was to treat any run of 
'_' (or of any other non-alphanumeric, non-dot characters) as being 
identical to a single '-'.
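
A minimal sketch of that rule, assuming a plain regular-expression 
substitution (canonical_name is an illustrative name; in real code you 
would call the pkg_resources API rather than reimplement this):

```python
import re

def canonical_name(name):
    """Collapse each run of non-alphanumeric, non-dot characters to one '-'."""
    return re.sub(r'[^A-Za-z0-9.]+', '-', name)

print(canonical_name('my_package'))      # single '_' becomes '-'
print(canonical_name('my__package'))     # a run of '_' becomes one '-'
print(canonical_name('zope.interface'))  # dots are left alone
```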


>Buildout has a functionality to "pin down" ("lock down"/"nail down") versions
>of eggs (distributions?).  There is another functionality to enforce
>"pinning down" versions of all eggs used in a particular Buildout
>configuration.  If we use "_" in the package name (distribution name?),
>this functionality does not work.

Your comparisons should be based on the 'key' attribute of 
Distribution and Requirement objects, rather than relying on direct 
string operations of your own.  The 'key' attribute contains a form 
of project name suitable for equality/inequality comparisons.
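
For instance, a lookup of pinned versions could go through a key on both 
sides.  The sketch below assumes the key is the canonical name 
lower-cased; project_key and the pinned dictionary are stand-ins for 
illustration, and real code should read the 'key' attribute from 
pkg_resources objects instead:

```python
import re

def project_key(name):
    """Illustrative stand-in for 'key': canonical name, lower-cased."""
    return re.sub(r'[^A-Za-z0-9.]+', '-', name).lower()

# Hypothetical pins, spelled the way a user might write them in a config.
pinned = {'My_Package': '1.0'}
pinned_by_key = {project_key(n): v for n, v in pinned.items()}

# The installed distribution reports its name with '-', but the keys match.
installed_name = 'my-package'
print(pinned_by_key.get(project_key(installed_name)))
```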

In other words, you should not take unparsed data from your 
configuration and compare it against pkg_resources attributes.  Use 
constructors like Requirement.parse() and 
Distribution.from_filename() to create objects with 'key' attributes, 
then compare keys, or just use Requirement.__contains__.  For example:

     from pkg_resources import Requirement
     if someDistribution in Requirement.parse(projname+'=='+exactversion):
          # someDistribution is exactly version exactversion of projname
          pass

The pkg_resources API is there precisely so that you don't have to 
know all the low-level details like syntax rules and escaping.




