Parallel installation of incompatible versions

pkg_resources.requires() is our only current solution for parallel installation of incompatible versions. This can be made to work and is a lot better than the nothing we had before it was created, but it also has quite a few issues (and it can be a nightmare to debug when it goes wrong).

Based on the exchanges with Mark McLoughlin the other week, and chatting to Matthias Klose here at the PyCon US sprints, I think I have a design that will let us support parallel installs in a way that builds on existing standards, while behaving more consistently in edge cases and without making sys.path ridiculously long, even on systems with large numbers of potentially incompatible dependencies.

The core of this proposal is to create an updated version of the installation database format that defines semantics for *.pth files inside .dist-info directories.

Specifically, whereas *.pth files directly in site-packages are processed automatically when Python starts up, those inside dist-info directories would be processed only when explicitly requested (probably through a new distlib API). Processing the *.pth file would insert the directory it references into sys.path immediately before the path entry containing the .dist-info directory. (This is to avoid the issue with the pkg_resources insert-at-the-front-of-sys.path behaviour, where system packages can end up shadowing those from a local source checkout, without running into the issue with append-to-the-end-of-sys.path, where a specifically requested version is shadowed by a globally installed version.)

To use CherryPy2 and CherryPy3 on Fedora as an example, what this would allow is for CherryPy3 to be installed normally (i.e. directly in site-packages), while CherryPy2 would be installed as a split install, with the .dist-info going into site-packages and the actual package going somewhere else (more on that below). A cherrypy2.pth file inside the dist-info directory would reference the external location where cherrypy 2.x can be found.

To use this at runtime, you would do something like:

    distlib.some_new_requires_api("CherryPy (2.2)")
    import cherrypy

The other part of this question is how to avoid the potential explosion of one sys.path entry per dependency. The first part of that is that for cases where there is no incompatible version installed, there won't be a *.pth file, and hence no extra sys.path entry (the module/package will just be installed directly into site-packages as usual).

The second part has to do with a possible way to organise the versioned installs: group them by the initial fragment of the version number according to semantic versioning. For example, define a "versioned-packages" directory that sits adjacent to "site-packages". When doing the parallel install of CherryPy2, the actual *code* would be installed into "versioned-packages/2/", with the cherrypy2.pth file pointing to that directory. For 0.x releases there would be a directory per minor version, while for higher releases there would only be a directory per major version.

The nice thing, though, is that Python wouldn't actually care about the actual layout of the installed versions, so long as the *.pth files in the dist-info directories described the mapping correctly.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
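[A minimal sketch of the activation step described above, in plain Python. Everything here is illustrative: requires_sketch stands in for the yet-to-be-designed distlib API, and the example paths are hypothetical.]

    import os
    import sys

    def requires_sketch(dist_info_dir, pth_name):
        # Read the *.pth file stored inside a .dist-info directory and insert
        # its entries into sys.path immediately before the path entry that
        # contains the .dist-info directory (so an explicitly requested version
        # beats the default one, but a local source checkout still wins over both).
        containing = os.path.dirname(os.path.abspath(dist_info_dir))
        with open(os.path.join(dist_info_dir, pth_name)) as f:
            entries = [line.strip() for line in f
                       if line.strip() and not line.startswith("#")]
        where = sys.path.index(containing) if containing in sys.path else len(sys.path)
        for entry in reversed(entries):
            sys.path.insert(where, entry)

    # Hypothetical usage, mirroring the CherryPy example:
    # requires_sketch("/usr/lib/python3.3/site-packages/CherryPy-2.3.0.dist-info",
    #                 "cherrypy2.pth")
    # import cherrypy   # now found via the directory named in cherrypy2.pth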

On Mon, Mar 18, 2013 at 3:04 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The second part has to do with a possible way to organise the versioned installs: group them by the initial fragment of the version number according to semantic versioning. For example, define a "versioned-packages" directory that sits adjacent to "site-packages". When doing the parallel install of CherryPy2 the actual *code* would be installed into "versioned-packages/2/", with the cherrypy2.pth file pointing to that directory. For 0.x releases, there would be a directory per minor version, while for higher releases, there would only be a directory per major version.
Jason pointed out this wouldn't actually work, since you might have spurious version conflicts in this model (e.g. if you require v2.x of one dependency, but v3.x of another). So it would need to be 1 directory per parallel installed versioned package. The "but what about long sys.paths?" problem can be dealt with as a performance issue for the import system. Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Mar 19, 2013 at 1:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The "but what about long sys.paths?" problem can be dealt with as a performance issue for the import system.
And already has been, actually. ;-)

In addition to the changes made in the import system for 3.3, there's another improvement possible, relative to today's easy_install-based path lengths. Currently, easy_install dumps everything into subdirectories, so sys.path is *always* long for the default case, and *just* as long for non-default apps.

However, if instead of just listing default versions in easy-install.pth, those versions were actually installed PEP 376-style, and only the non-default versions had to be added to sys.path for the apps that need them, then you'd see a dramatic shortening of sys.path for *all* apps and scenarios.

Today, if you have N default libraries and an average of M non-default libraries per app, plus C as the constant minimum sys.path length, then sys.path is N+C by default, and N+C+M for each app that uses non-default libraries. But, under a hybrid scheme of PEP 376 for defaults plus an extension for non-defaults, the default sys.path length is C, and C+M for the apps needing non-default versions. In other words, N disappears -- and "N" is usually a lot bigger than C or M.

TBH, if I had access to the time machine right now, easy_install would have worked this way from the start, instead of using easy-install.pth as a version-switching mechanism. (The main reason it didn't work out that way to begin with, is because the .egg-info concept wasn't invented until much later in easy_install's development.)
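[The arithmetic, spelled out with made-up numbers; N, M and C are the variables defined above, and the values are only for illustration.]

    C, N, M = 5, 80, 3   # minimum path entries, default libs, non-default libs per app

    easy_install_default = N + C        # 85: every default library adds a sys.path entry
    easy_install_app     = N + C + M    # 88: plus the app's non-default versions

    hybrid_default = C                  # 5: defaults installed flat, PEP 376-style
    hybrid_app     = C + M              # 8: only explicitly requested non-defaults are added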

On Mon, Mar 18, 2013 at 6:04 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
pkg_resources.requires() is our only current solution for parallel installation of incompatible versions.
Well, one of them. Buildout is another. At a lower level, self-contained things (eggs, wheels, jars, Mac .app directories) are the solution. requires() and buildout are just higher-level applications.

Jim

-- Jim Fulton http://www.linkedin.com/in/jimfulton

On Mon, Mar 18, 2013 at 6:04 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
pkg_resources.requires() is our only current solution for parallel installation of incompatible versions. This can be made to work and is a lot better than the nothing we had before it was created, but also has quite a few issues (and it can be a nightmare to debug when it goes wrong).
Based on the exchanges with Mark McLoughlin the other week, and chatting to Matthias Klose here at the PyCon US sprints, I think I have a design that will let us support parallel installs in a way that builds on existing standards, while behaving more consistently in edge cases and without making sys.path ridiculously long even in systems with large numbers of potentially incompatible dependencies.
The core of this proposal is to create an updated version of the installation database format that defines semantics for *.pth files inside .dist-info directories.
Specifically, whereas *.pth files directly in site-packages are processed automatically when Python starts up, those inside dist-info directories would be processed only when explicitly requested (probably through a new distlib API). The processing of the *.pth file would insert it into the path immediately before the path entry containing the .dist-info directory (this is to avoid an issue with the pkg_resources insert-at-the-front-of-sys.path behaviour where system packages can end up shadowing those from a local source checkout, without running into the issue with append-to-the-end-of-sys.path where a specifically requested version is shadowed by a globally installed version)
To use CherryPy2 and CherryPy3 on Fedora as an example, what this would allow is for CherryPy3 to be installed normally (i.e. directly in site-packages), while CherryPy2 would be installed as a split install, with the .dist-info going into site-packages and the actual package going somewhere else (more on that below). A cherrypy2.pth file inside the dist-info directory would reference the external location where cherrypy 2.x can be found.
To use this at runtime, you would do something like:
distlib.some_new_requires_api("CherryPy (2.2)")
import cherrypy
The other part of this question is how to avoid the potential explosion of one sys.path entry per dependency. The first part of that is that for cases where there is no incompatible version installed, there won't be a *.pth file, and hence no extra sys.path entry (the module/package will just be installed directly into site-packages as usual).
The second part has to do with a possible way to organise the versioned installs: group them by the initial fragment of the version number according to semantic versioning. For example, define a "versioned-packages" directory that sits adjacent to "site-packages". When doing the parallel install of CherryPy2 the actual *code* would be installed into "versioned-packages/2/", with the cherrypy2.pth file pointing to that directory. For 0.x releases, there would be a directory per minor version, while for higher releases, there would only be a directory per major version.
The nice thing though is that Python wouldn't actually care about the actual layout of the installed versions, so long as the *.pth files in the dist-info directories described the mapping correctly.
Could you perhaps spell out why this is better than just dropping .whl files (or unpacked directories) into site-packages or equivalent?

Also, one thing that actually confuses me about this proposal is that it sounds like you are saying you'd have two CherryPy .dist-info directories in site-packages, which sounds broken to me; the whole point of the existing protocol for .dist-info was that it allowed you to determine the importable versions from a single listdir(). Your approach would break that feature, because you'd have to:

1. Read each .dist-info directory to find .pth files
2. Open and read all the .pth files
3. Compare the .pth file contents with sys.path to find out what is actually *on* sys.path

This is a lot more complexity and I/O overhead than PEP 376 and its antecedents in pkg_resources et al.

In contrast, if you use .whl files or directories, you can both determine the available versions *and* the active versions from a single directory read. And on everything but Windows, those could be symlinks to the target location rather than an actual file or directory, thus giving you the same kind of layout flexibility as what you've proposed.

(Or, if you want a solution that works the same across platforms, just re-invent .egg-link files, which are basically a super-symlink anyway.)

FWIW, I always thought the long sys.path problem was a bug that could be solved by improving sys.path.__repr__().

On Tue, Mar 19, 2013 at 11:06 AM, PJ Eby <pje@telecommunity.com> wrote:
Could you perhaps spell out why this is better than just dropping .whl files (or unpacked directories) into site-packages or equivalent?
I need a solution that will also work for packages installed by the system installer - in fact, that's the primary use case. For self-contained installation independent of the system Python, people should be using venv/virtualenv, zc.buildout, software collections (a Fedora/RHEL tool in the same space), or a similar "isolated application" solution.

System packages will be spread out according to the FHS, and need to work relatively consistently for every language the OS supports (i.e. all of them), so long term solutions that assume the use of Python-specific bundling formats for the actual installation are not sufficient in my view.

I also want to create a database of parallel installed versions that can be used to avoid duplication across virtual environments and software collections, by using .pth files to reference a common installed version rather than having to use symlinks or copies of the files.

I'm not wedded to using *actual* .pth files as a cross-platform linking solution - a more limited format that only supported path additions, without the extra powers of .pth files, would be fine. The key point is to use the .dist-info directories to bridge between "unversioned installs in site-packages" and "finding parallel versions at runtime without side effects on all Python applications executed on that system" (which is the problem with using a .pth file in site-packages to bootstrap the parallel versioning system, as easy_install does).
Also, one thing that actually confuses me about this proposal is that it sounds like you are saying you'd have two CherryPy.dist-info directories in site-packages, which sounds broken to me; the whole point of the existing protocol for .dist-info was that it allowed you to determine the importable versions from a single listdir(). Your approach would break that feature, because you'd have to:
1. Read each .dist-info directory to find .pth files
2. Open and read all the .pth files
3. Compare the .pth file contents with sys.path to find out what is actually *on* sys.path
If a distribution has been installed in site-packages (or has an appropriate *.pth file there), there won't be any *.pth file in the .dist-info directory. The *.pth file will only be present if the package has been installed *somewhere else*.

However, it occurs to me that we can do this differently, by explicitly involving a separate directory that *isn't* on sys.path by default, and using a path hook to indicate when it should be accessed.

Under this version of the proposal, PEP 376 would remain unchanged, and would effectively become the "database of installed distributions available on sys.path by default". These files would all remain available by default, preserving backwards compatibility for the vast majority of existing software that doesn't use any kind of parallel install system.

We could then introduce a separate "database of all installed distributions". Let's use the "versioned-packages" name, and assume it lives adjacent to the existing "site-packages". The differences between this versioned-packages directory and site-packages would be that:

- it would never be added to sys.path itself
- multiple .dist-info directories for different versions of the same distribution may be present
- distributions are installed into named-and-versioned subdirectories rather than directly into versioned-packages
- rather than the contents being processed directly from sys.path, we would add a "<versioned-packages>" entry to sys.path with a path hook that maps to a custom module finder. That finder handles the extra import locations without the issues of the current approach to modifying sys.path in pkg_resources (which allows shadowing development versions with installed versions by inserting at the front), or the opposite problem that would be created by appending to the end (allowing default versions to shadow explicitly requested versions).

We would then add some new version constraint API in distlib to:

1. Check the PEP 376 db. If the version identified there satisfies the constraint, fine, we leave the import state unmodified.
2. If no suitable version is found, check the new versioned-packages directory.
3. If a suitable parallel installed version is found, we check its dist-info directory for the details needed to update the set of paths processed by the versioned import hook.

The versioned import hook would work just like normal sys.path based import (i.e. maintaining a sequence of path entries, using sys.modules, sys.path_hooks, sys.path_importer_cache); the only difference is that the set of paths it checks would initially be empty. Calls to the new API in distlib would modify the *versioned* path, effectively inserting all those paths at the point in sys.path where the "<versioned-packages>" marker is placed, rather than appending them to the beginning or end. The API that updates the paths handled by the versioned import hook would also take care of detecting and complaining about incompatible version constraints.

It may even be possible to update pkg_resources.requires() to work this way, potentially avoiding the need for the easy_install.pth file that has side effects on applications that don't even use pkg_resources.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
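[A rough sketch of how the "<versioned-packages>" marker and its path hook could behave, using today's importlib machinery. The marker string, the activate() helper and the placement logic are illustrative only; the actual distlib constraint-checking API described above is not modelled here.]

    import sys
    import importlib.machinery

    VERSIONED_MARKER = "<versioned-packages>"
    _versioned_path = []    # directories activated by the (hypothetical) requires API

    class _VersionedFinder:
        # Path entry finder attached to the marker: it starts out empty and only
        # searches directories that have been explicitly activated.
        def find_spec(self, fullname, target=None):
            return importlib.machinery.PathFinder.find_spec(
                fullname, _versioned_path, target)
        def invalidate_caches(self):
            pass

    def _versioned_hook(entry):
        if entry == VERSIONED_MARKER:
            return _VersionedFinder()
        raise ImportError

    sys.path_hooks.insert(0, _versioned_hook)

    def install_marker(before_entry):
        # Place the marker at a chosen point in sys.path (e.g. just before
        # site-packages), so activated versions get exactly that precedence.
        if VERSIONED_MARKER not in sys.path:
            where = (sys.path.index(before_entry)
                     if before_entry in sys.path else len(sys.path))
            sys.path.insert(where, VERSIONED_MARKER)

    def activate(versioned_dir):
        # Stand-in for the proposed distlib API: once a suitable parallel
        # installed version has been located, add its directory to the
        # versioned path consulted by the marker's finder.
        if versioned_dir not in _versioned_path:
            _versioned_path.append(versioned_dir)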

On Wed, Mar 20, 2013 at 8:29 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm not wedded to using *actual* pth files as a cross-platform linking solution - a more limited format that only supported path additions, without the extra powers of pth files would be fine. The key point is to use the .dist-info directories to bridge between "unversioned installs in site packages" and "finding parallel versions at runtime without side effects on all Python applications executed on that system" (which is the problem with using a pth file in site packages to bootstrap the parallel versioning system as easy_install does).
So why not just make a new '.pth-info' file or directory dropped into a sys.path directory for this purpose? Reusing .dist-info as an available package (vs. an *importable* package) looks like a bad idea from a compatibility point of view. (For example, it's immediately incompatible with Distribute, which would interpret the redundant .dist-info as being importable from that directory.)
If a distribution has been installed in site-packages (or has an appropriate *.pth file there), there won't be any *.pth file in the .dist-info directory.
Right, but if this were the protocol, you wouldn't tell what's *already on sys.path* without reading all those .dist-info directories to see if they *had* .pth files. You'd have to look for the ones that were missing a .pth file, in other words, in order to know which of those .dist-info's represented a package that was actually importable from that directory.
The *.pth file will only be present if the package has been installed *somewhere else*.
...which is precisely the thing that makes it incompatible with PEP 376 (and Distribute ATM). ;-)
However, it occurs to me that we can do this differently, by explicitly involving a separate directory that *isn't* on sys.path by default, and use a path hook to indicate when it should be accessed.
Why not just put a .pth-info file that points to the other location, or whatever? Then it's still discoverable, but you don't have to open it unless you intend to add it to sys.path (or an import hook or whatever). If it needs to list a bunch of different directories in it, or whatever, doesn't matter. The point is, using a file in the *same* sys.path directory saves a metric tonne of complexity in sys.path management. Plus, you get the available packages in a single directory read, and you can open whatever files you need in order to pick up additional information in the case of needing a non-default package.
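[A sketch of that idea, assuming a made-up ".pth-info" naming convention with one extra path entry per line; the point is just that a single listdir() reveals both the default and the non-default versions, and the file is only opened when a non-default version is actually requested.]

    import os

    def scan(directory):
        # One directory read tells us what's importable by default (*.dist-info)
        # and what is available as a parallel install elsewhere (*.pth-info).
        default, parallel = [], []
        for name in os.listdir(directory):
            if name.endswith(".dist-info"):
                default.append(name[:-len(".dist-info")])
            elif name.endswith(".pth-info"):
                parallel.append(name[:-len(".pth-info")])
        return default, parallel

    def extra_paths(directory, dist):
        # Only opened on demand, when the non-default version is requested.
        with open(os.path.join(directory, dist + ".pth-info")) as f:
            return [line.strip() for line in f if line.strip()]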
Under this version of the proposal, PEP 376 would remain unchanged, and would effectively become the "database of installed distributions available on sys.path by default".
That's what it is *now*. Or more precisely, it's a directory of packages that would be importable if a given directory is present on sys.path. It doesn't say anything about sys.path as a whole.
- rather than the contents being processed directly from sys.path, we would add a "<versioned-packages>" entry to sys.path with a path hook that maps to a custom module finder that handles the extra import locations without the same issues as the current approach to modifying sys.path in pkg_resources (which allows shadowing development versions with installed versions by inserting at the front), or the opposite problem that would be created by appending to the end (allowing default versions to shadow explicitly requested versions)
Note that you can do this without needing a separate sys.path entry. You can give alternate versions whatever precedence they *would* have had, by replacing the finder for the relevant directory.

But it would be better if you could be clearer about what precedence you want these other packages to have, relative to the matching sys.path entries. You seem to be speaking in terms of a single site-packages and a single versioned-packages directory, but applications and users can have more complicated paths than that. For example, how do PYTHONPATH directories factor into this? User-site packages? Application plugin directories? Will all of these need their own markers?

That's why I think we should focus on *individual* directories (the way PEP 376 does), rather than trying to define an overall precedence system. While there are some challenges with easy_install.pth, the basic precedence concept it uses is sound: an encapsulated package discovered in a given directory takes precedence over unencapsulated packages in the same directory.

The place where easy_install falls down is in the implementation: not only does it have to munge sys.path in order to insert those non-defaults, it also installs *everything* in an encapsulated form, making a huge sys.path. But you can take the same basic idea and apply it to an import hook; I just think that rather than having the extra directory, it's less coupling and complexity if we look at the level of directories rather than sys.path as a whole.

This still lets a system installer put stuff wherever it wants; it just has to also write a .pth-info (or whatever you want to call it) file in site-packages, telling Python where to find it. It also lets plugin-oriented systems use the same approach, and PYTHONPATH, user-site packages, venvs, etc. all work in exactly the same way, without needing to reinvent wheels or share a single (and privileged) hook.
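[For illustration, one way the "replace the finder for the relevant directory" idea could be sketched with current importlib machinery; the wrapping strategy and function names are guesses at the intent, not an existing API.]

    import sys

    def _finder_for(path_entry):
        # Build a fresh path entry finder using the standard hooks.
        for hook in sys.path_hooks:
            try:
                return hook(path_entry)
            except ImportError:
                continue
        return None

    def give_precedence(directory, extra_path):
        # Let `extra_path` (e.g. an unpacked non-default version) be searched
        # just before `directory`, so it inherits exactly the precedence that
        # sys.path entry already has - no extra sys.path entries needed.
        inner = _finder_for(directory)
        extra = _finder_for(extra_path)

        class _Wrapper:
            def find_spec(self, fullname, target=None):
                spec = extra.find_spec(fullname, target) if extra else None
                if spec is None and inner is not None:
                    spec = inner.find_spec(fullname, target)
                return spec
            def invalidate_caches(self):
                for finder in (extra, inner):
                    if finder is not None and hasattr(finder, "invalidate_caches"):
                        finder.invalidate_caches()

        # Imports that would have come from `directory` now consult the wrapper.
        sys.path_importer_cache[directory] = _Wrapper()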
The versioned import hook would work just like normal sys.path based import (i.e. maintaining a sequence of path entries, using sys.modules, sys.path_hooks, sys.path_importer_cache), the only difference is that the set of paths it checks would initially be empty. Calls to the new API in distlib would modify the *versioned* path, effectively inserting all those paths at the point in sys.path where the "<versioned-packages>" marker is placed, rather than appending them to the beginning or end. The API that updates the paths handled by the versioned import hook would also take care of detecting and complaining about incompatible version constraints.
How does this interact with an application that uses both system-installed packages and a user-supplied plugin directory? (This also sounds like a recipe for new breakage and debug issues caused by putting the marker in the wrong place.)

----- Original Message -----
pkg_resources.requires() is our only current solution for parallel installation of incompatible versions. This can be made to work and is a lot better than the nothing we had before it was created, but also has quite a few issues (and it can be a nightmare to debug when it goes wrong).
Based on the exchanges with Mark McLoughlin the other week, and chatting to Matthias Klose here at the PyCon US sprints, I think I have a design that will let us support parallel installs in a way that builds on existing standards, while behaving more consistently in edge cases and without making sys.path ridiculously long even in systems with large numbers of potentially incompatible dependencies.
The core of this proposal is to create an updated version of the installation database format that defines semantics for *.pth files inside .dist-info directories.
Specifically, whereas *.pth files directly in site-packages are processed automatically when Python starts up, those inside dist-info directories would be processed only when explicitly requested (probably through a new distlib API). The processing of the *.pth file would insert it into the path immediately before the path entry containing the .dist-info directory (this is to avoid an issue with the pkg_resources insert-at-the-front-of-sys.path behaviour where system packages can end up shadowing those from a local source checkout, without running into the issue with append-to-the-end-of-sys.path where a specifically requested version is shadowed by a globally installed version)
To use CherryPy2 and CherryPy3 on Fedora as an example, what this would allow is for CherryPy3 to be installed normally (i.e. directly in site-packages), while CherryPy2 would be installed as a split install, with the .dist-info going into site-packages and the actual package going somewhere else (more on that below). A cherrypy2.pth file inside the dist-info directory would reference the external location where cherrypy 2.x can be found.
To use this at runtime, you would do something like:
distlib.some_new_requires_api("CherryPy (2.2)")
import cherrypy
So what would be done when CherryPy 4 comes along? CherryPy 3 is installed directly in site-packages, so versions 2 and 4 would be treated with split installs? It seems to me that this type of special casing is not what we want. If you develop on one machine and deploy on another machine, you have no guarantee that the standard installation of CherryPy is the same as on your system. That would force developers to always install the versions they use via "split-install", so that they could make sure they always import the correct version.

At this point, I will go to the Ruby world for an example (please don't shout at me :). If you look at how RubyGems works, it puts _every_ gem in a versioned directory (therefore no special casing). When just "require 'foo'" is used, the newest "foo" is imported; otherwise a specific version is imported if specified. I believe that we should head a similar way here, making the "split-install" the default (and the only way). Then if the user uses a standard
import cherrypy
Python would import the newest version. When using
distlib.some_new_requires_api("CherryPy (2.2)")
import cherrypy
Python would import the specific version.

This may actually turn out to be very useful, as you could place all the distlib calls into the __init__.py of your package, which would nicely separate this from the actual code (and we wouldn't need anything like Ruby Gemfiles).

So am I completely wrong here, or does this make sense to you?

Slavek.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
-- Regards, Bohuslav "Slavek" Kabrda.

On Wed, Mar 20, 2013 at 1:01 AM, Bohuslav Kabrda <bkabrda@redhat.com> wrote:
So what would be done when CherryPy 4 came? CherryPy 3 is installed directly in site-packages, so version 2 and 4 would be treated with split-install? It seems to me that this type of special casing is not what we want. If you develop on one machine and deploy on another machine, you have no guarantee that the standard installation of CherryPy is the same as on your system. That would force developers to actually always install their used versions by "split-install", so that they could make sure they always import the correct version.
This approach isn't viable, as it is both backwards incompatible with the expectations of current Python software and incompatible with the requirements of Linux distros and other system integrators (who need to be able to add new backwards incompatible versions of software without changing the default version).

And I definitely won't shout at people for mentioning what other languages do - learning from what works and what doesn't for other groups is exactly what we *should* be doing. Many of the features in the forthcoming metadata 2.0 specification are driven by stealing things that are known to work from Node.js, Perl, Ruby, PHP, RPM, DEB, etc.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

----- Original Message -----
On Wed, Mar 20, 2013 at 1:01 AM, Bohuslav Kabrda <bkabrda@redhat.com> wrote:
So what would be done when CherryPy 4 came? CherryPy 3 is installed directly in site-packages, so version 2 and 4 would be treated with split-install? It seems to me that this type of special casing is not what we want. If you develop on one machine and deploy on another machine, you have no guarantee that the standard installation of CherryPy is the same as on your system. That would force developers to actually always install their used versions by "split-install", so that they could make sure they always import the correct version.
This approach isn't viable, as it is both backwards incompatible with the expectations of current Python software and incompatible with the requirements of Linux distros and other system integrators (who need to be able to add new backwards incompatible versions of software without changing the default version).
Yep, it's backwards incompatible, sure. I think your proposal is a step in the right direction. My proposal is where I think we should be heading in the long term (and do the big step of breaking backward compatibility as part of some other huge step, like the Python 2 -> Python 3 transition was).

As for Linux distros, that's not an issue AFAICS. We've been doing the same with Ruby for quite some time and it works (yes, with some patching here and there, but generally it does).

The fact is that this system brings lots of benefits to developers. I'm actually quite schizophrenic in this regard, as I'm both a packager and a developer :) and I see how these worlds collide in these matters. From the packager point of view I see your point; from the developer point of view I install CherryPy 4, import CherryPy, and then find out that I'm still using version 3, which breaks my developer expectations.
And I definitely won't shout at people for mentioning what other languages do - learning from what works and what doesn't for other groups is exactly what we *should* be doing. Many of the features in the forthcoming metadata 2.0 specification are driven by stealing things that are known to work from Node.js, Perl, Ruby, PHP, RPM, DEB, etc.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
-- Regards, Bohuslav "Slavek" Kabrda.

Not sure how you could do a good job having one version of a package available by default, and a different one available by requires(). Eggs list the top level packages provided, and you could shadow them, but it seems like it would be really messy.

Ruby Gems appear to have a directory full of gems: ~/.gem/ruby/1.8/gems/. Each subdirectory is {name}-{version} and doesn't need any suffix - we know what they are because of where they are.

    bundler-1.2.1
    json-1.7.5
    sinatra-1.3.3
    tilt-1.3.3
    tzinfo-0.3.33

Each subdirectory contains metadata, and a lib/ directory that would actually be added to the Ruby module path.

Like with pkg_resources, developers are warned to only "require Gems" on things that are *not* imported (preferably in the equivalent of our console_scripts wrappers). Otherwise you get an unwanted Gem dependency if you ever tried to use the same gem outside of the gem system.

----- Original Message -----
Not sure how you could do a good job having one version of a package available by default, and a different one available by requires(). Eggs list the top level packages provided and you could shadow them but it seems like it would be really messy.
Yup, it'd require a decent amount of changes and probably break some backwards compatibility, as mentioned.
Ruby Gems appear to have a directory full of gems: ~/.gem/ruby/1.8/gems/. Each subdirectory is {name}-{version} and doesn't need any suffix - we know what they are because of where they are.
bundler-1.2.1
json-1.7.5
sinatra-1.3.3
tilt-1.3.3
tzinfo-0.3.33
Each subdirectory contains metadata, and a lib/ directory that would actually be added to the Ruby module path.
Not exactly. The 1.8 directory contains gems/ and specifications/. The specifications/ directory contains {name}-{version}.gemspec, which is a meta-information holder for the specific gem. Among other things, it contains require_paths, which are concatenated with gems/{name}-{version} to get the load path. So the RubyGems require first looks at the list of specs, then chooses the proper one (the newest when no version is specified, or the specified one), and then computes the load path from it.
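[Rendered as a rough Python equivalent of that lookup, stdlib only; a naive version parse stands in for the real .gemspec handling, and it assumes the common case where require_paths is just lib/.]

    import os

    def _version_key(dirname, name):
        # Naive "name-X.Y.Z" -> (X, Y, Z); the real resolution reads the
        # .gemspec in specifications/ rather than parsing directory names.
        return tuple(int(part) for part in dirname[len(name) + 1:].split("."))

    def gem_load_path(gem_home, name, version=None):
        # Pick the requested version of a gem (or the newest one present),
        # then build its load path from gems/{name}-{version}/lib.
        gems_dir = os.path.join(gem_home, "gems")
        candidates = [d for d in os.listdir(gems_dir) if d.startswith(name + "-")]
        chosen = ("%s-%s" % (name, version) if version is not None
                  else max(candidates, key=lambda d: _version_key(d, name)))
        return os.path.join(gems_dir, chosen, "lib")

    # e.g. gem_load_path(os.path.expanduser("~/.gem/ruby/1.8"), "sinatra")
    #      -> ".../gems/sinatra-1.3.3/lib"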
Like with pkg_resources, developers are warned to only "require Gems" on things that are *not* imported (preferably in the equivalent of our console_scripts wrappers). Otherwise you get an unwanted Gem dependency if you ever tried to use the same gem outside of the gem system.
I don't really know what you mean by this - could you please reword it?

-- Regards, Bohuslav "Slavek" Kabrda.

Like with pkg_resources, developers are warned to only "require Gems" on things that are *not* imported (preferably in the equivalent of our console_scripts wrappers). Otherwise you get an unwanted Gem dependency if you ever tried to use the same gem outside of the gem system.
I don't really know what you mean by this - could you please reword it?
There should be only one call to the linker, at the very top of execution. Otherwise in this pseudo-language example you can't use foobar without also using the requires system:

myscript:
    requires(a, b, c)
    import foobar
    run()

foobar:
    requires(c, d)  # No!

On Wed, Mar 20, 2013 at 6:58 AM, Daniel Holth <dholth@gmail.com> wrote:
Like with pkg_resources, developers are warned to only "require Gems" on things that are *not* imported (preferably in the equivalent of our console_scripts wrappers). Otherwise you get an unwanted Gem dependency if you ever tried to use the same gem outside of the gem system.
I don't really know what you mean by this - could you please reword it?
There should be only one call to the linker, at the very top of execution. Otherwise in this pseudo-language example you can't use foobar without also using the requires system:
myscript:
    requires(a, b, c)
    import foobar
    run()

foobar:
    requires(c, d)  # No!
Right, version control and runtime access should be separate steps.

In a virtual environment, you shouldn't need runtime checks at all - all the version compatibility checks should be carried out when creating the environment. Similarly, when a distro defines their site-packages contents, they're creating an integrated set of interlocking requirements, all designed to work together. Only when they need multiple mutually incompatible versions installed should the versioning system be needed.

Assuming we go this way, distros will presumably install system Python packages into the versioned layout and then symlink them appropriately from the "available by default" layout in site-packages.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Mar 20, 2013 at 10:35 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Assuming we go this way, distros will presumably install system Python packages into the versioned layout and then symlink them appropriately from the "available by default" layout in site-packages.
If they're going to do that, then why not put the versioned layout directly into site-packages in the first place?
participants (5): Bohuslav Kabrda, Daniel Holth, Jim Fulton, Nick Coghlan, PJ Eby