[Distutils] Parallel installation of incompatible versions
PJ Eby
pje at telecommunity.com
Tue Mar 19 19:06:42 CET 2013
On Mon, Mar 18, 2013 at 6:04 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> pkg_resources.requires() is our only current solution for parallel
> installation of incompatible versions. This can be made to work and is
> a lot better than the nothing we had before it was created, but also
> has quite a few issues (and it can be a nightmare to debug when it
> goes wrong).
>
> Based on the exchanges with Mark McLoughlin the other week, and
> chatting to Matthias Klose here at the PyCon US sprints, I think I
> have a design that will let us support parallel installs in a way that
> builds on existing standards, while behaving more consistently in edge
> cases and without making sys.path ridiculously long even in systems
> with large numbers of potentially incompatible dependencies.
>
> The core of this proposal is to create an updated version of the
> installation database format that defines semantics for *.pth files
> inside .dist-info directories.
>
> Specifically, whereas *.pth files directly in site-packages are
> processed automatically when Python starts up, those inside dist-info
> directories would be processed only when explicitly requested
> (probably through a new distlib API). The processing of the *.pth file
> would insert it into the path immediately before the path entry
> containing the .dist-info directory (this is to avoid an issue with
> the pkg_resources insert-at-the-front-of-sys.path behaviour where
> system packages can end up shadowing those from a local source
> checkout, without running into the issue with
> append-to-the-end-of-sys.path where a specifically requested version
> is shadowed by a globally installed version)
>
> To use CherryPy2 and CherryPy3 on Fedora as an example, what this
> would allow is for CherryPy3 to be installed normally (i.e. directly
> in site-packages), while CherryPy2 would be installed as a split
> install, with the .dist-info going into site-packages and the actual
> package going somewhere else (more on that below). A cherrypy2.pth
> file inside the dist-info directory would reference the external
> location where cherrypy 2.x can be found.
>
> To use this at runtime, you would do something like:
>
> distlib.some_new_requires_api("CherryPy (2.2)")
> import cherrypy
>
> The other part of this question is how to avoid the potential
> explosion of one sys.path entry per dependency. The first part of that
> is that for cases where there is no incompatible version installed,
> there won't be a *.pth file, and hence no extra sys.path entry (the
> module/package will just be installed directly into site-packages as
> usual).
>
> The second part has to do with a possible way to organise the
> versioned installs: group them by the initial fragment of the version
> number according to semantic versioning. For example, define a
> "versioned-packages" directory that sits adjacent to "site-packages".
> When doing the parallel install of CherryPy2 the actual *code* would
> be installed into "versioned-packages/2/", with the cherrypy2.pth file
> pointing to that directory. For 0.x releases, there would be a
> directory per minor version, while for higher releases, there would
> only be a directory per major version.
>
> The nice thing though is that Python wouldn't actually care about the
> actual layout of the installed versions, so long as the *.pth files in
> the dist-info directories described the mapping correctly.
Could you perhaps spell out why this is better than just dropping .whl
files (or unpacked directories) into site-packages or equivalent?

Also, one thing that actually confuses me about this proposal is that
it sounds like you are saying you'd have two CherryPy.dist-info
directories in site-packages, which sounds broken to me; the whole
point of the existing protocol for .dist-info was that it allowed you
to determine the importable versions from a single listdir(). Your
approach would break that feature, because you'd have to:

1. Read each .dist-info directory to find .pth files
2. Open and read all the .pth files
3. Compare the .pth file contents with sys.path to find out what is
   actually *on* sys.path

This is a lot more complexity and I/O overhead than PEP 376 and its
antecedents in pkg_resources et al.
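To make that cost concrete, here is a rough sketch of what such a scan
might look like. It is only an illustration: the function name
active_pth_targets is made up, and it assumes each non-comment line of a
.pth file is a path taken relative to the directory holding the
.dist-info (the proposal does not pin that down).

```python
import os
import sys


def active_pth_targets(site_dir, path=None):
    """Hypothetical helper: map each .dist-info name in site_dir to a list
    of (target_directory, is_on_path) pairs taken from its .pth files."""
    search_path = sys.path if path is None else path
    search_path = [os.path.normpath(p) for p in search_path]
    result = {}
    for entry in sorted(os.listdir(site_dir)):
        if not entry.endswith(".dist-info"):
            continue
        info_dir = os.path.join(site_dir, entry)
        # Step 1: one extra directory read per .dist-info to find .pth files.
        for name in sorted(os.listdir(info_dir)):
            if not name.endswith(".pth"):
                continue
            # Step 2: open and read every .pth file found.
            with open(os.path.join(info_dir, name)) as f:
                for line in f:
                    line = line.strip()
                    if not line or line.startswith("#"):
                        continue
                    # Assumption: entries are relative to site_dir.
                    target = os.path.normpath(os.path.join(site_dir, line))
                    # Step 3: compare against sys.path to see what is active.
                    pairs = result.setdefault(entry, [])
                    pairs.append((target, target in search_path))
    return result
```

Note that the work grows with the number of .dist-info directories and
.pth files, not just with the size of one directory listing.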

In contrast, if you use .whl files or directories, you can both
determine the available versions *and* the active versions from a
single directory read. And on everything but Windows, those could be
symlinks to the target location rather than an actual file or
directory, thus giving you the same kind of layout flexibility as what
you've proposed.
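For comparison, a minimal sketch of the single-read approach, assuming
entries are named "<project>-<version>..." with a .whl, .egg or
.dist-info suffix. The helper name installed_versions is hypothetical,
and the parsing is deliberately naive (it does not handle escaped names
or compare versions properly).

```python
import os


def installed_versions(directory):
    """Hypothetical helper: map project name -> list of version strings,
    using nothing but the entry names in one directory."""
    versions = {}
    # The only I/O is this one listdir(); symlinks show up by name and
    # are never followed or opened.
    for entry in os.listdir(directory):
        stem, ext = os.path.splitext(entry)
        if ext not in (".whl", ".egg", ".dist-info"):
            continue
        parts = stem.split("-")
        if len(parts) < 2:
            continue
        name, version = parts[0], parts[1]
        versions.setdefault(name, []).append(version)
    # Plain string sort, for stable output only; not a real version ordering.
    return {name: sorted(found) for name, found in versions.items()}
```

Since the directory being listed is itself a sys.path entry (or stands
in for one), whatever turns up in that listing is what can be imported
from it; no second pass over file contents is needed.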

(Or, if you want a solution that works the same across platforms, just
re-invent .egg-link files, which are basically a super-symlink
anyway.)
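A sketch of the "super-symlink" idea, assuming a link file is a small
text file whose first non-empty line names the real location, with
relative paths resolved against the link file's own directory. The
function name follow_link_file is invented for this example and is not
an existing setuptools or pkg_resources API.

```python
import os


def follow_link_file(link_path):
    """Hypothetical helper: return the directory a link file points at,
    or None if the file has no usable line."""
    with open(link_path) as f:
        for line in f:
            line = line.strip()
            if line:
                # Assumption: relative targets are relative to the link file.
                base = os.path.dirname(link_path)
                return os.path.normpath(os.path.join(base, line))
    return None
```

Because this is just a file read, it behaves the same on platforms
without symlink support.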