[Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

Ian Bicking ianb at colorstudy.com
Fri Oct 9 03:22:13 CEST 2009


I'm coming in late and breaking threading, but wanted to reply to
Tarek's original email:

> - easy_install is going to be deprecated! Use Pip!

Cool!  I wouldn't have written pip if I didn't think it would improve
substantially on easy_install.

Incidentally (because I know people get really enthused about this)
Carl Meyer just contributed a feature to pip to do atomic
uninstallation.

Someone mentioned that easy_install provides some things pip doesn't;
outside of multi-versioned installs (which I'm not very enthusiastic
about), I'm not sure what those are.

>   - distribute.resources: that's the old pkg_resources, but reorganized in clean, PEP 8 modules. This package will
>      only contain the query APIs and will focus on being PEP 376 compatible. We will promote its usage and see if Pip wants
>      to use it as a basis. And maybe PyPM once it's open source? (<hint> <hint>)
>      It will probably shrink a lot, though, once the stdlib provides PEP 376 support.

This seems easy enough to use in pip.
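
For reference, the kind of query API in question looks roughly like
this in today's pkg_resources (just a sketch -- the final
distribute.resources names could come out differently):

    import pkg_resources

    # Enumerate every distribution visible on sys.path:
    for dist in pkg_resources.working_set:
        print("%s %s" % (dist.project_name, dist.version))

    # Look up one installed distribution by name:
    dist = pkg_resources.get_distribution("pip")
    print(dist.location)

pip only needs these read-only queries, which is why adopting
distribute.resources for this looks straightforward.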

>    - distribute.index: that's package_index and a few other things. everything required to interact with PyPI. We will promote
>      its usage and see if Pip wants to use it as a basis.

This is a little tricky, primarily because there's a fair amount of
logic involved in the indexing (going around to different URLs,
parsing links, finding stuff).  So long as there is logic, something
can go wrong -- often not in the package itself, but through simple
user error (e.g., it doesn't look where the user thinks it should, or
a link is malformed, etc.).  Because of this, and as a general design
goal of pip, I want to show as much as I can about what it is doing
and why.  This is primarily tied into pip's logging system (which is
oriented towards command-line output, and isn't the standard logging
system).  It also tracks *why* it got to a certain link.  These are
the two things I can think of where the index code in pip is tied to
pip, and why it would be hard to use an external system.
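
To make the "why" tracking concrete, here is the shape of the idea --
not pip's actual code, just a sketch with illustrative names: every
link the scraper extracts carries a note about where it came from,
and that note is what gets shown if something later goes wrong.

    import re

    HREF = re.compile(r'href=["\']([^"\']+)["\']', re.I)

    def extract_links(page_url, html):
        # Pair each link with the reason we know about it, so a later
        # failure can report "bad link, found on <page_url>" instead
        # of a bare URL.
        return [(href, "found on %s" % page_url)
                for href in HREF.findall(html)]

    links = extract_links("http://pypi.python.org/simple/lxml/",
                          '<a href="lxml-2.2.tar.gz">lxml 2.2</a>')
    for url, why in links:
        print("%s (%s)" % (url, why))

An external index library would need some hook like this for
provenance (and for pip's command-line progress output) before pip
could adopt it wholesale.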


> = Virtualenv and the multiple version support in Distribute =
>
> (I am not saying "We" here because this part was not discussed yet
> with everyone)
>
> Virtualenv allows you to create an isolated environment to install some distribution without polluting the
> main site-packages, a bit like a user site-packages.
>
> My opinion is that this tool exists only because Python doesn't
> support the installation of multiple versions of the same
> distribution.
> But if PEP 376 and PEP 386 support are added in Python, we're not far
> from being able to provide multiple version support with
> the help of importlib.

Before making workingenv (virtualenv's predecessor) I actively tried
to use Setuptools' multi-version support, and had very little success
with it.  I don't think that was due to any problem with Setuptools
itself -- maybe a slight problem was the conflict between "active"
eggs and "multi-version" eggs (active eggs are more likely to cause
conflicts, while multi-version eggs aren't available until you
specifically require them).  But that was just awkward; I don't think
it was the real problem.

The real problem is that the set of packages that makes up a working
application is something separate from any one library.  And you can
only know that an application works with an exact set of libraries.
Every update can break a working application (and with fairly high
probability); you can't know in advance which updates are safe.  It's
also tricky even to be sure what libraries a package really requires
-- lots of libraries might be incidentally available but no longer
formally required.  (Someone mentioned a coworker who only installed
packages with easy_install -m, because he said it kept him honest --
only explicitly required packages would be available.  But most
people don't do this, and it really only solves the one problem of
undeclared dependencies.)
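
For anyone who hasn't used -m: a multi-version egg is completely
inert until something explicitly activates it, typically at program
startup.  The package name and version here are only illustrative:

    import pkg_resources

    # Nothing from a multi-version egg is importable until an exact
    # distribution is required; this is the "kept him honest" part.
    pkg_resources.require("SQLObject==0.10.2")

    import sqlobject  # only now does this import succeed

Any dependency you forgot to declare simply isn't there.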

The way both virtualenv and buildout handle this is that libraries
will have a single, static version until you explicitly do something
to update that version.  Both are somewhat focused on a functional
unit -- like one virtualenv environment for one task, or one buildout
config for one application.  Buildout allows for a globally shared set
of versioned eggs, but that's really just a little optimization (for
disk space or installation speed) -- each egg is brought in only
explicitly, at build time, and not as an option during the program's
runtime.

This is verifiable, stable, and to varying degrees concrete
(virtualenv being more concrete than buildout, which tends more
towards the declarative).
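
In pip terms, that "exact set of libraries" is a requirements file:
one pinned version per line, kept next to the application (the names
and versions below are made up):

    # requirements.txt -- the known-good set for this application
    Pylons==0.9.7
    SQLAlchemy==0.5.6
    simplejson==2.0.9

You rebuild the environment with "pip install -r requirements.txt",
and nothing changes version until someone edits the file.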

What virtualenv does could certainly be done in the Python
interpreter itself (and much more compactly as a result, I am sure).
PYTHONHOME does it to a degree, though binding a script to an
environment through the interpreter listed in #! is more stable than
the implicit environment of PYTHONHOME.  workingenv used an
environment variable (PYTHONPATH, before PYTHONHOME existed) and it
caused problems.  Also, virtualenv offers more system isolation.
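
The #! binding is worth spelling out, since it's the whole trick (the
paths here are illustrative).  A script installed into a virtualenv
starts with the environment's own interpreter:

    #!/home/me/env/bin/python
    # The shebang *is* the environment binding: however this script
    # is invoked, it runs against /home/me/env, with no PYTHONPATH or
    # PYTHONHOME fiddling required.
    import sys
    print(sys.prefix)  # -> /home/me/env

Because the binding lives in the script itself rather than in the
calling process's environment, a cron job or shell that has never
heard of the environment still gets the right one.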

If I had my way, buildout would use virtualenv and throw away its
funny script generation.  If virtualenv had existed before buildout
began development, things probably would have gone that way, and I
think it would make the environment more pleasant for buildout users.
I also wish buildout used pip instead of its own installation
procedure (based on easy_install).  I don't think the philosophical
differences are that great; it's more a matter of history -- because
the code is already written, there's not much incentive for buildout
to remove it and rely on other libraries (virtualenv and pip).

--
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker

