[Distutils] Parallel installation of incompatible versions

PJ Eby pje at telecommunity.com
Wed Mar 20 22:22:44 CET 2013


On Wed, Mar 20, 2013 at 8:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I'm not wedded to using *actual* pth files as a cross-platform linking
> solution - a more limited format that only supported path additions,
> without the extra powers of pth files would be fine. The key point is
> to use the .dist-info directories to bridge between "unversioned
> installs in site packages" and "finding parallel versions at runtime
> without side effects on all Python applications executed on that
> system" (which is the problem with using a pth file in site packages
> to bootstrap the parallel versioning system as easy_install does).

So why not just make a new '.pth-info' file or directory dropped into
a sys.path directory for this purpose?  Reusing .dist-info as an
available package (vs. an *importable* package) looks like a bad idea
from a compatibility point of view.  (For example, it's immediately
incompatible with Distribute, which would interpret the redundant
.dist-info as being importable from that directory.)


> If a distribution has been installed in site-packages (or has an
> appropriate *.pth file there), there won't be any *.pth file in the
> .dist-info directory.

Right, but if this were the protocol, you couldn't tell what's
*already on sys.path* without reading all those .dist-info directories
to see if they *had* .pth files.  You'd have to look for the ones that
were missing a .pth file, in other words, in order to know which of
those .dist-info's represented a package that was actually importable
from that directory.
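
To make the cost concrete, here's a rough sketch of what a tool would
have to do under the quoted proposal.  The function name is made up,
and it assumes that "has a *.pth file inside the .dist-info directory"
is the only signal that a distribution lives somewhere else:

```python
import os


def importable_dists(path_entry):
    """Return the .dist-info names in path_entry that contain no .pth file.

    Assumption: a .dist-info directory holding a *.pth file marks a
    distribution that was installed somewhere else.
    """
    importable = []
    for name in sorted(os.listdir(path_entry)):  # one read of the directory
        info_dir = os.path.join(path_entry, name)
        if not (name.endswith(".dist-info") and os.path.isdir(info_dir)):
            continue
        # One extra directory read per distribution, just to learn
        # whether it is importable from path_entry at all.
        has_pth = any(f.endswith(".pth") for f in os.listdir(info_dir))
        if not has_pth:
            importable.append(name)
    return importable
```

That's one directory read per installed distribution on top of the
read of the sys.path directory itself.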


> The *.pth file will only be present if the package has been installed *somewhere else*.

...which is precisely the thing that makes it incompatible with PEP
376 (and Distribute ATM).  ;-)


> However, it occurs to me that we can do this differently, by
> explicitly involving a separate directory that *isn't* on sys.path by
> default, and use a path hook to indicate when it should be accessed.

Why not just put a .pth-info file that points to the other location,
or whatever?  Then it's still discoverable, but you don't have to open
it unless you intend to add it to sys.path (or an import hook or
whatever).

Whether it needs to list a bunch of different directories in it, or
whatever, doesn't matter.  The point is, using a file in the *same*
sys.path directory saves a metric tonne of complexity in sys.path
management.  Plus, you get the available packages in a single
directory read, and you can open whatever files you need in order to
pick up additional information in the case of needing a non-default
package.
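
A minimal sketch of that, assuming a hypothetical convention where
'name-version.pth-info' is a plain text file listing one extra
directory per line (the file format and both function names are
invented here for illustration, not part of any spec):

```python
import os


def scan_path_entry(path_entry):
    """Classify one sys.path directory's contents in a single read.

    Returns (importable, available): names that are importable from the
    directory as-is, and a mapping of non-default names to the
    .pth-info file that says where to find them.
    """
    importable, available = [], {}
    for name in sorted(os.listdir(path_entry)):
        if name.endswith(".dist-info"):
            importable.append(name[:-len(".dist-info")])
        elif name.endswith(".pth-info"):
            key = name[:-len(".pth-info")]
            available[key] = os.path.join(path_entry, name)
    return importable, available


def read_locations(pth_info_file):
    """Open a .pth-info file only when that version is actually wanted."""
    with open(pth_info_file) as f:
        return [line.strip() for line in f if line.strip()]
```

Nothing gets opened until somebody asks for a non-default version; the
listing alone tells you what's importable and what's merely available.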


> Under this version of the proposal, PEP 376 would remain unchanged,
> and would effectively become the "database of installed distributions
> available on sys.path by default".

That's what it is *now*.  Or more precisely, it's a directory of
packages that would be importable if a given directory is present on
sys.path.  It doesn't say anything about sys.path as a whole.


> - rather than the contents being processed directly from sys.path, we
> would add a "<versioned-packages>" entry to sys.path with a path hook
> that maps to a custom module finder that handles the extra import
> locations without the same issues as the current approach to modifying
> sys.path in pkg_resources (which allows shadowing development versions
> with installed versions by inserting at the front), or the opposite
> problem that would be created by appending to the end (allowing
> default versions to shadow explicitly requested versions)

Note that you can do this without needing a separate sys.path entry.
You can give alternate versions whatever precedence they *would* have
had, by replacing the finder for the relevant directory.
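
Roughly like this, as a sketch against a current Python 3 importlib
rather than anything that exists today.  The class and function names
are hypothetical, and it only handles source modules and regular
packages (no extension modules, no namespace packages):

```python
import sys
from importlib.machinery import (
    SOURCE_SUFFIXES,
    FileFinder,
    SourceFileLoader,
)


def _plain_finder(path):
    # A source-only FileFinder; built directly so that it never goes
    # back through sys.path_importer_cache (and so never recurses).
    return FileFinder(path, (SourceFileLoader, SOURCE_SUFFIXES))


class DirectoryOverlayFinder:
    """Path entry finder standing in for one sys.path directory.

    Extra locations are searched first, then the directory itself, so
    the alternate versions get exactly that directory's precedence.
    """

    def __init__(self, directory, extra_locations):
        order = list(extra_locations) + [directory]
        self._finders = [_plain_finder(p) for p in order]

    def find_spec(self, fullname, target=None):
        for finder in self._finders:
            spec = finder.find_spec(fullname, target)
            if spec is not None and spec.loader is not None:
                return spec
        return None

    def invalidate_caches(self):
        for finder in self._finders:
            finder.invalidate_caches()


def activate(directory, extra_locations):
    """Replace the cached finder for a directory that is on sys.path."""
    sys.path_importer_cache[directory] = DirectoryOverlayFinder(
        directory, extra_locations
    )
```

sys.path itself is never touched; only the finder cached for that one
directory changes, so everything before and after it on the path keeps
the precedence it already had.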

But it would be better if you could be clearer about what precedence
you want these other packages to have, relative to the matching
sys.path entries.  You seem to be speaking in terms of a single
site-packages and single versioned-packages directory, but
applications and users can have more complicated paths than that.  For
example, how do PYTHONPATH directories factor into this?  User-site
packages?  Application plugin directories?  Will all of these need
their own markers?

That's why I think we should focus on *individual* directories (the
way PEP 376 does), rather than trying to define an overall precedence
system.

While there are some challenges with easy_install.pth, the basic
precedence concept it uses is sound: an encapsulated package
discovered in a given directory takes precedence over unencapsulated
packages in the same directory.

The place where easy_install falls down is in the implementation: not
only does it have to munge sys.path in order to insert those
non-defaults, it also installs *everything* in an encapsulated form,
making a huge sys.path.

But you can take the same basic idea and apply it to an import hook; I
just think that rather than having the extra directory, it's less
coupling and complexity if we look at the level of directories rather
than sys.path as a whole.

This still lets a system installer put stuff wherever it wants; it
just has to also write a .pth-info (or whatever you want to call it)
file in site-packages, telling Python where to find it.

It also lets plugin-oriented systems use the same approach, and
PYTHONPATH, user-site packages, venvs, etc. all work in exactly the
same way, without needing to reinvent wheels or share a single (and
privileged) hook.


> The versioned import hook would work just like normal sys.path based
> import (i.e. maintaining a sequence of path entries, using
> sys.modules, sys.path_hooks, sys.path_importer_cache), the only
> difference is that the set of paths it checks would initially be
> empty. Calls to the new API in distlib would modify the *versioned*
> path, effectively inserting all those paths at the point in sys.path
> where the "<versioned-packages>" marker is placed, rather than
> appending them to the beginning or end. The API that updates the paths
> handled by the versioned import hook would also take care of detecting
> and complaining about incompatible version constraints.

How does this interact with an application that uses both
system-installed packages and a user-supplied plugin directory?  (This
also sounds like a recipe for new breakage and debug issues caused by
putting the marker in the wrong place.)

