[Distutils] PEP 376 - from PyPM's point of view

Tarek Ziadé ziade.tarek at gmail.com
Wed Jul 15 11:01:24 CEST 2009


On Tue, Jul 14, 2009 at 2:12 AM, Sridhar Ratnakumar
<sridharr at activestate.com> wrote:
> Here are my comments regarding PEP 376 with respect to PyPM (the Python
> package manager being developed at ActiveState)
>
>
> Multiple versions: I understand that the PEP does not support
> installation (thus uninstallation) of multiple versions of the same
> package. Should this be explicitly mentioned in the PEP -- as
> `get_distribution` API accepts only `name` argument, and not a `version`
> argument?

That's another can of worms ;)

Before I answer, here's a bit of background. It's a bit long but required, sorry.

For people who don't want to read the rest, here's the idea:
multiple-version support imho should be introduced later, if it is
to be introduced at all, by extending the PEP 302 protocol.

The long explanation now:

Given a "foo" package containing a "bar" module, multiple-version
support implies doing one of these:

1 - a custom PEP 302-like loader/importer that picks a version of
"foo" when the code imports the "bar" module.
  This works if the "foo" package is not directly available in
  sys.path, and if the custom loader/importer is put in sys.meta_path,
  for example. If "foo 0.9" is located in /var/packages/foo/0.9 and
  "foo 1.0" is in /var/packages/foo/1.0, the importer scans
  /var/packages/foo/*, selects the right "foo" version and returns a
  loader for it.

  To make it work, it requires two things:
        a/ a version comparison system (see PEP 386) that will make
           the loader pick the "latest" version by default
        b/ an API that will force the loader to pick one particular version
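
As a rough illustration of option 1, here is a minimal sketch of such a finder. It uses today's importlib spelling of the PEP 302 hooks (find_spec), which is newer than this thread. The VersionedFinder class, its require() method, the version_key() helper and the <root>/<distribution>/<version>/ layout are all hypothetical names chosen for the example, and the version comparison is a naive numeric one, not the PEP 386 scheme.

```python
import importlib.util
import os
import re
import sys
from importlib.abc import MetaPathFinder


def version_key(version):
    """Naive numeric comparison key (an assumption, not PEP 386): '1.10' -> (1, 10)."""
    return tuple(int(part) for part in version.split("."))


class VersionedFinder(MetaPathFinder):
    """Hypothetical finder for a <root>/<distribution>/<version>/ storage tree."""

    def __init__(self, root):
        self.root = root
        self.pinned = {}  # distribution name -> version forced by the caller

    def require(self, dist, version):
        # (b) force the finder to pick one particular version
        self.pinned[dist] = version

    def find_spec(self, fullname, path=None, target=None):
        # This sketch only handles top-level, single-file modules.
        if path is not None or not os.path.isdir(self.root):
            return None
        for dist in sorted(os.listdir(self.root)):
            dist_dir = os.path.join(self.root, dist)
            if not os.path.isdir(dist_dir):
                continue
            chosen = self.pinned.get(dist)
            if chosen is None:
                versions = [v for v in os.listdir(dist_dir)
                            if re.fullmatch(r"\d+(\.\d+)*", v)]
                if not versions:
                    continue
                # (a) pick the "latest" version by default
                chosen = max(versions, key=version_key)
            candidate = os.path.join(dist_dir, chosen, fullname + ".py")
            if os.path.isfile(candidate):
                return importlib.util.spec_from_file_location(fullname, candidate)
        return None
```

With /var/packages/foo/0.9/bar.py and /var/packages/foo/1.0/bar.py on disk, putting VersionedFinder("/var/packages") in sys.meta_path would make "import bar" load the 1.0 copy, unless require("foo", "0.9") was called before the import.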

2 - changing the paths in sys.path to include the path containing the
right version, and letting the existing importer/loader do the work.
That's what setuptools does with its multiple-version system: an API
called "require" lets you change sys.path on the fly:

           >>> from pkg_resources import require
           >>> # looks for a docutils egg distribution and adds it to the path
           >>> require('docutils==0.4')
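
To show the idea behind option 2 without depending on setuptools, here is a minimal sketch. It is not how pkg_resources.require is implemented; the require_version() helper and the <root>/<distribution>/<version>/ layout are assumptions made for the example.

```python
import os
import sys


def require_version(root, dist, version):
    """Hypothetical helper: put <root>/<dist>/<version> at the front of sys.path."""
    location = os.path.join(root, dist, version)
    if not os.path.isdir(location):
        raise LookupError("%s %s is not installed under %s" % (dist, version, root))
    # Drop any other version of the same distribution that is already published.
    prefix = os.path.join(root, dist) + os.sep
    sys.path[:] = [p for p in sys.path if not p.startswith(prefix)]
    # The standard importer then finds the modules without any custom loader.
    sys.path.insert(0, location)
    return location
```

Note that this only affects imports made after the call; a module that was already imported from another version stays in sys.modules.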

So if we support multiple versions in Python (I'd love to), PEP 376
would need to be able to find the various versions
of each distribution, not by scanning sys.path but rather by scanning
an arbitrary collection of directories, then publishing the
right ones in sys.path (with a PEP 302 loader, or a la setuptools by
inserting them in sys.path).

In other words, this would require changing the way the distributions
are stored, e.g. in self-contained eggs or in a brand-new
storage tree. (I am currently experimenting with this using "virtual
site-packages", see http://bitbucket.org/tarek/vsp/src/tip/README.txt)

But as we said earlier, people might want to store their modules
anywhere (in an SQL database, on Mars, etc.) and provide
a PEP 302-like loader for them. PJE has "eggs", but John Doe might want
to store his packages differently and provide an importer/loader for
them.

So each one of them provides a "package manager", which should be composed of:

A- a loader/importer system
B- an installation system (that is easy_install -m for setuptools)
C- query APIs
D- a version comparison system
E- an uninstaller

So the real solution is to work with PEP 302 importers/loaders (A)
(i.e. "package managers").

Which means that PEP 302 needs to be changed to become
'distribution-aware', as Paul said.
We would then be able to browse distributions (C) that are not
already loaded in sys.path, and so work with two versions
of the same distribution.

But some open questions remain.

It also implies that each package manager provides its own installer (B)
and a version comparison system (D).

I'm not sure how package installers could be declared. Plus,
how would people deal with several installers?
For the version comparison system I am not sure either, but it would
require one global version comparison
system to rule them all, otherwise some conflicts may occur.

So there's no plan to support multiple versions yet, because that
requires another PEP imho.

Since distutils is a package manager in some ways (it provides an
installer, and stores distributions that are made
available in sys.path), my feeling is that we first need to finish
what's missing to make it fully usable (e.g. query APIs + uninstaller).

Then maybe think about extending PEP 302.

>
>> get_distribution(name) -> Distribution or None.
>> Scans all elements in sys.path and looks for all directories ending
>> with .egg-info. Returns a Distribution corresponding to the .egg-info
>> directory that contains a PKG-INFO that matches name for the name
>> metadata.
>> Notice that there should be at most one result. The first result
>> found is returned. If the directory is not found, returns None.
>
> Some packages have package names with mixed case. Example: ConfigObj
> .. as registered in setup.py. However, other packages such as turbogears
> specify "configobj" (lowercase) in their install_requires.
>
> Is `get_distribution(name)` supposed to handle mixed cases? Will it
> match both 'ConfigObj' and 'configobj'?

As PJE said, we need normalization here, yes.

Right now PyPI is case insensitive for its index:

http://pypi.python.org/simple/ConfigObj ==
http://pypi.python.org/simple/configobj

But in the meantime, IIRC, the XML-RPC APIs are case-sensitive, and so is
the HTML browsing. easy_install is case-insensitive though, because it
uses the index.

So we should be case-insensitive everywhere, including in PEP 376.
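
To make this concrete, here is a minimal sketch of a case-insensitive lookup over .egg-info directories. It is not the PEP 376 API: find_egg_info() and normalize() are hypothetical names, and lowercasing is only an assumed normalization rule, since the exact rule has not been settled in this thread.

```python
import os
from email.parser import Parser


def normalize(name):
    """Assumed normalization rule: compare distribution names case-insensitively."""
    return name.strip().lower()


def find_egg_info(name, paths):
    """Return the first .egg-info directory whose PKG-INFO Name matches, else None."""
    wanted = normalize(name)
    for path in paths:
        if not os.path.isdir(path):
            continue
        for entry in sorted(os.listdir(path)):
            egg_info = os.path.join(path, entry)
            pkg_info = os.path.join(egg_info, "PKG-INFO")
            if not entry.endswith(".egg-info") or not os.path.isfile(pkg_info):
                continue
            # PKG-INFO uses RFC 822-style headers, so the email parser can read it.
            with open(pkg_info, encoding="utf-8") as handle:
                metadata = Parser().parse(handle, headersonly=True)
            if normalize(metadata.get("Name", "")) == wanted:
                return egg_info
    return None
```

With this, find_egg_info("configobj", sys.path) and find_egg_info("ConfigObj", sys.path) would return the same directory.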

>
>> get_installed_files(local=False) -> iterator of (path, md5, size)
>
> Will this also return the directories /created/ during the installation?
> For example, will it also contain the entry "docutils" .. along with
> "docutils/__init__.py"?

I don't think it's necessary to add "docutils" if
"docutils/__init__.py" is present

But for empty directories added during installation we should add them, I guess.

So, I'll add a note.
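
For what it's worth, here is a minimal sketch of how an uninstaller could derive the directories from the file entries alone. The directories_to_remove() helper is hypothetical, and it assumes the recorded paths are relative to the installation root and use "/" as separator.

```python
import posixpath


def directories_to_remove(installed_files, empty_dirs=()):
    """Derive removable directories from relative, '/'-separated entries."""
    found = set()
    # Files contribute their parent directories; recorded empty
    # directories contribute themselves and their parents.
    starts = [posixpath.dirname(path) for path in installed_files]
    starts.extend(empty_dirs)
    for current in starts:
        while current:  # '' is the installation root, which is never removed
            found.add(current)
            current = posixpath.dirname(current)
    # Deepest first, so that children are removed before their parents.
    return sorted(found, key=lambda d: (-d.count("/"), d))
```

Since the root itself (site-packages/, bin/, ...) never appears in the result, it is never removed. A real uninstaller should still only delete a directory once it is actually empty, because another distribution may have installed files into it.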

>
> If not, how is the installer (pip, pypm, etc..) supposed to know which
> directories to remove (docutils/) and which directories not to remove
> (site-packages/, bin/, etc..)?
>
>> The new version of PEP 345 (XXX work in progress) extends the Metadata
>> standard and fulfills the requirements described in PEP 262, like the
>> REQUIRES section.
>
> Can you tell more about this?
>
> I see that PEP 262 allows both distributions names ('docutils') and
> modules/packages ('roman.py') in the 'Requires:' section. Is this how
> what the new PEP is going to adhere to? Or, is it going to adhere to PEP
> 345's way of allowing *only* modules/packages?

The plan is to add what setuptools calls "install_requires", so
you can tell which *distributions* should be installed, no matter whether
they are composed of a single module or many packages.

>
> In practice, I noticed that packages usually specify distribution names
> in their 'Requires:' file (or install_requires.txt in the case of
> setuptools). Hence, PyPM *assumes* the install requirements to be
> distribution name. But then .. most distributions have the same name as
> their primary module/package.

That's it, yes: it will be distribution-aware. If a module or package
has the same name as the distribution, it will make no difference.

>
> Ok, so PEP 345 also specifies the 'Provides:' header. Does
> easy_install/pip make use of 'Provides:' at all when resolving
> dependencies? For example, does 'pip install sphinx' go look for all
> distributions that 'provides' the 'docutils' provision.. or does it
> simply get the distribution named 'docutils'?

setuptools doesn't. I don't think pip does.

BTW: is PyPM a public project?

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

