On Tue, Mar 19, 2013 at 11:06 AM, PJ Eby <pje@telecommunity.com> wrote:
> Could you perhaps spell out why this is better than just dropping .whl files (or unpacked directories) into site-packages or equivalent?
I need a solution that will also work for packages installed by the system installer - in fact, that's the primary use case. For self-contained installation independent of the system Python, people should be using venv/virtualenv, zc.buildout, software collections (a Fedora/RHEL tool in the same space), or a similar "isolated application" solution.

System packages will be spread out according to the FHS, and need to work relatively consistently for every language the OS supports (i.e. all of them), so long-term solutions that assume the use of Python-specific bundling formats for the actual installation are not sufficient in my view.

I also want to create a database of parallel installed versions that can be used to avoid duplication across virtual environments and software collections, by using .pth files to reference a common installed version rather than having to use symlinks or copies of the files. I'm not wedded to using *actual* .pth files as a cross-platform linking solution - a more limited format that only supported path additions, without the extra powers of .pth files, would be fine.

The key point is to use the .dist-info directories to bridge between "unversioned installs in site-packages" and "finding parallel versions at runtime without side effects on all Python applications executed on that system" (the latter being the problem with using a .pth file in site-packages to bootstrap the parallel versioning system, as easy_install does).
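[Editor's note: the restricted "path additions only" linking format suggested above was never specified in this thread. A minimal sketch of what such a format might look like, with a hypothetical file name and helper function:]

```python
import os

def read_path_additions(dist_info_dir, filename="shared-location.pth"):
    """Read a restricted .pth-like file that may only contain path lines.

    Unlike real .pth files, lines starting with 'import' are rejected
    rather than executed, so processing the file can have no side effects.
    (Both the file name and this helper are illustrative, not part of
    any spec.)
    """
    path_file = os.path.join(dist_info_dir, filename)
    if not os.path.exists(path_file):
        # Installed directly on sys.path; nothing extra to add.
        return []
    additions = []
    with open(path_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # blank lines and comments are ignored
            if line.startswith("import ") or line.startswith("import\t"):
                raise ValueError(
                    "executable lines are not permitted in the "
                    "restricted path-additions format")
            additions.append(line)
    return additions
```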
> Also, one thing that actually confuses me about this proposal is that it sounds like you are saying you'd have two CherryPy.dist-info directories in site-packages, which sounds broken to me; the whole point of the existing protocol for .dist-info was that it allowed you to determine the importable versions from a single listdir(). Your approach would break that feature, because you'd have to:
> 1. Read each .dist-info directory to find .pth files
> 2. Open and read all the .pth files
> 3. Compare the .pth file contents with sys.path to find out what is actually *on* sys.path
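[Editor's note: the single-listdir protocol PJ is defending can be sketched as follows. This is a simplified illustration, not pkg_resources' actual implementation; real tools also handle name escaping and parse the METADATA file.]

```python
import os

def importable_versions(site_packages):
    """Map distribution name to version with one directory listing.

    PEP 376 names installed metadata directories 'name-version.dist-info',
    so a single listdir() reveals every importable version without
    opening any files.
    """
    suffix = ".dist-info"
    versions = {}
    for entry in os.listdir(site_packages):
        if entry.endswith(suffix) and "-" in entry:
            # Simplification: assume the first hyphen separates the
            # (escaped) project name from the version.
            name, _, version = entry[:-len(suffix)].partition("-")
            versions[name] = version
    return versions
```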
If a distribution has been installed in site-packages (or has an appropriate *.pth file there), there won't be any *.pth file in the .dist-info directory. The *.pth file will only be present if the package has been installed *somewhere else*.

However, it occurs to me that we can do this differently, by explicitly involving a separate directory that *isn't* on sys.path by default, and using a path hook to indicate when it should be accessed.

Under this version of the proposal, PEP 376 would remain unchanged, and would effectively become the "database of installed distributions available on sys.path by default". These files would all remain available by default, preserving backwards compatibility for the vast majority of existing software that doesn't use any kind of parallel install system.

We could then introduce a separate "database of all installed distributions". Let's use the "versioned-packages" name, and assume it lives adjacent to the existing "site-packages". The differences between this versioned-packages directory and site-packages would be that:

- it would never be added to sys.path itself
- multiple .dist-info directories for different versions of the same distribution may be present
- distributions are installed into named-and-versioned subdirectories rather than directly into versioned-packages
- rather than the contents being processed directly from sys.path, we would add a "<versioned-packages>" entry to sys.path with a path hook that maps to a custom module finder. That finder would handle the extra import locations without the problem created by the current approach to modifying sys.path in pkg_resources (inserting at the front, which allows installed versions to shadow development versions), or the opposite problem that appending to the end would create (allowing default versions to shadow explicitly requested versions)

We would then add some new version constraint API in distlib to:

1. Check the PEP 376 db.
If the version identified there satisfies the constraint, fine, we leave the import state unmodified.
2. If no suitable version is found, check the new versioned-packages directory.
3. If a suitable parallel installed version is found, check its .dist-info directory for the details needed to update the set of paths processed by the versioned import hook.

The versioned import hook would work just like normal sys.path based import (i.e. maintaining a sequence of path entries, and using sys.modules, sys.path_hooks and sys.path_importer_cache); the only difference is that the set of paths it checks would initially be empty. Calls to the new API in distlib would modify the *versioned* path, effectively inserting all those paths at the point in sys.path where the "<versioned-packages>" marker is placed, rather than prepending or appending them. The API that updates the paths handled by the versioned import hook would also take care of detecting and complaining about incompatible version constraints.

It may even be possible to update pkg_resources.require() to work this way, potentially avoiding the need for the easy_install.pth file that has side effects on applications that don't even use pkg_resources.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
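[Editor's note: the marker-plus-path-hook mechanism described in the proposal can be sketched roughly as below. Every name here is hypothetical - the marker string, the `VersionedFinder` class, and the `activate()` stand-in for the distlib constraint-checking API, which is elided entirely.]

```python
import sys
import importlib.machinery

VERSIONED_MARKER = "<versioned-packages>"  # hypothetical sys.path entry

# The set of extra paths starts out empty; activating a parallel
# installed version appends here, which is equivalent to inserting
# those paths at the marker's position on sys.path rather than at
# the front or the back.
_versioned_paths = []

class VersionedFinder:
    """Path entry finder attached to the marker via sys.path_hooks."""

    def find_spec(self, fullname, target=None):
        # Delegate to the standard path-based machinery, but search
        # only the currently activated versioned paths.
        return importlib.machinery.PathFinder.find_spec(
            fullname, path=list(_versioned_paths))

def _versioned_hook(entry):
    if entry != VERSIONED_MARKER:
        # Not our marker: let the other path hooks handle real entries.
        raise ImportError
    return VersionedFinder()

sys.path_hooks.insert(0, _versioned_hook)
sys.path.append(VERSIONED_MARKER)

def activate(paths):
    """Stand-in for the distlib API; the real API would first check
    version constraints against the PEP 376 db and the
    versioned-packages metadata before extending this list."""
    _versioned_paths.extend(paths)
```

Because the finder rereads `_versioned_paths` on every lookup, later `activate()` calls take effect without touching sys.path again, and nothing is importable from the versioned area until something explicitly asks for it.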