[Distutils] use of '_' in package name causing version parsing issue?
pje at telecommunity.com
Thu Mar 11 16:43:02 CET 2010
At 12:38 PM 3/11/2010 +0530, Baiju M wrote:
>On Thu, Mar 11, 2010 at 11:05 AM, Baiju M <mbaiju at zeomega.com> wrote:
> > If "_" is a valid project_name identifier, why it is replaces with "-" ?
In order to have a canonicalized name form which can be escaped in
filenames for unambiguous identification of an egg's project and version.
Egg filenames use '-' as a separator between name, version, python
version, and platform. A '-' in any of these components is escaped
as '_', so that the '-' remains a viable and unambiguous
separator. This means that '_' gets turned back into a '-' when
unescaped, so the mapping between '_' and '-' is part of the
safe_name canonical form.
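To make the round trip concrete, here is a minimal sketch of the escaping rule described above. The helper names (escape_component, egg_basename, split_egg_basename) are hypothetical and are not part of pkg_resources; the sketch assumes only that '-' is the separator and that a '-' inside a component is written as '_'.

```python
def escape_component(value):
    """Write any '-' inside a component as '_' so '-' stays free as the separator."""
    return value.replace('-', '_')


def egg_basename(name, version, pyver, platform=None):
    """Join the escaped components with '-' (hypothetical helper, no '.egg' suffix)."""
    parts = [name, version, 'py' + pyver]
    if platform:
        parts.append(platform)
    return '-'.join(escape_component(part) for part in parts)


def split_egg_basename(basename):
    """Split on the separator, then turn '_' back into '-' in each component."""
    return [part.replace('_', '-') for part in basename.split('-')]


# 'my-project' and 'my_project' produce the same filename, which is why the
# two spellings have to be treated as the same project.
with_dash = egg_basename('my-project', '1.0', '2.6')
with_underscore = egg_basename('my_project', '1.0', '2.6')
components = split_egg_basename(with_dash)
```

Because both spellings collapse to one filename, the original choice of '_' or '-' cannot be recovered from the filename alone.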
>There nearly 300 packages in PyPI with "_" in the package name.
>For all the packages built using Setuptools, the "Name" field in
>the PKG-INFO file is replaced with "-".
>I checked some of the packages built with "distutils.core" 
>Distutils is not replacing "Name" field in PKG-INFO file
>Why Setuptools is behaving different from Distutils ?
Because distutils wasn't built in a world where package names needed
to be uniquely and unambiguously machine-parseable from
filenames. The code that easy_install has for dealing with
distutils-named source distributions has to guess at possible
interpretations of those filenames, because distutils filenames don't
distinguish between a '-' in a name or version, and a '-' *between*
names and versions.
Ultimately, the simplest way to deal with this was to treat any run of
'_' (or of any other non-alphanumeric, non-dot characters) as being
identical to a single '-'.
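As an illustration of that rule, the following sketch collapses each run of characters other than letters, digits and '.' into a single '-'. The function name safe_name_sketch is hypothetical; it is a stand-in written from the description above, not a copy of the pkg_resources code.

```python
import re


def safe_name_sketch(name):
    """Replace each run of non-alphanumeric, non-dot characters with one '-'."""
    return re.sub(r'[^A-Za-z0-9.]+', '-', name)


# Different spellings of the same project collapse to one canonical form.
spellings = ['my_package', 'my-package', 'my__package', 'my-_-package']
canonical = {safe_name_sketch(spelling) for spelling in spellings}

# Dots are left alone, so dotted names keep their shape.
dotted = safe_name_sketch('zope.interface')
```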
>Buildout has a functionality to "pin-down" ("lock down"/"nail down") versions
>of eggs (distribution?). There is another functionality to enforce
>versions of all eggs used in a particular Buildout configuration. If we
>use "_" as the package name (distribution name?), this functionality is not
Your comparisons should be based on the 'key' attribute of
Distribution and Requirement objects, rather than relying on direct
string operations of your own. The 'key' attribute contains a form
of project name suitable for equality/inequality comparisons.
In other words, you should not take unparsed data from your
configuration and compare it against pkg_resources attributes. Use
constructors like Requirement.parse() and
Distribution.from_filename() to create objects with 'key' attributes,
then compare keys, or just use Requirement.__contains__. For example:
    if someDistribution in Requirement.parse(projname+'=='+exactversion):
        # someDistribution is exactly version exactversion of projname
The pkg_resources API is there precisely so that you don't have to
know all the low-level details like syntax rules and escaping.
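For readers who want to see why raw string comparison fails for pinned versions, here is a small stand-alone sketch. It does not use pkg_resources: project_key and is_pinned are hypothetical helpers, and the sketch assumes that a 'key' is the canonical name lowercased. Versions are compared as plain strings here, which is a simplification; the real API parses them.

```python
import re


def project_key(name):
    """Hypothetical stand-in for a 'key': canonical name, lowercased (an assumption)."""
    return re.sub(r'[^A-Za-z0-9.]+', '-', name).lower()


def is_pinned(dist_name, dist_version, pins):
    """Check a distribution against pins taken from unparsed configuration.

    pins maps project names exactly as written in the config to version strings.
    """
    normalized = {project_key(name): version for name, version in pins.items()}
    return normalized.get(project_key(dist_name)) == dist_version


pins = {'My_Package': '1.0'}

# A direct string comparison misses the pin; comparing keys finds it.
naive_match = 'my-package' in pins
key_match = is_pinned('my-package', '1.0', pins)
```

In real code the same effect comes from building Requirement and Distribution objects and comparing those, so none of this normalization has to be written by hand.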