On Wed, Mar 20, 2013 at 8:29 AM, Nick Coghlan wrote:
I'm not wedded to using *actual* pth files as a cross-platform linking solution - a more limited format that only supported path additions, without the extra powers of pth files would be fine. The key point is to use the .dist-info directories to bridge between "unversioned installs in site packages" and "finding parallel versions at runtime without side effects on all Python applications executed on that system" (which is the problem with using a pth file in site packages to bootstrap the parallel versioning system as easy_install does).
So why not just make a new '.pth-info' file or directory dropped into a sys.path directory for this purpose? Reusing .dist-info as an available package (vs. an *importable* package) looks like a bad idea from a compatibility point of view. (For example, it's immediately incompatible with Distribute, which would interpret the redundant .dist-info as being importable from that directory.)
If a distribution has been installed in site-packages (or has an appropriate *.pth file there), there won't be any *.pth file in the .dist-info directory.
Right, but if this were the protocol, you couldn't tell what's *already on sys.path* without reading all those .dist-info directories to see if they *had* .pth files. In other words, you'd have to look for the ones that were missing a .pth file in order to know which of those .dist-info's represented a package that was actually importable from that directory.
The *.pth file will only be present if the package has been installed *somewhere else*.
...which is precisely the thing that makes it incompatible with PEP 376 (and Distribute ATM). ;-)
However, it occurs to me that we can do this differently, by explicitly involving a separate directory that *isn't* on sys.path by default, and use a path hook to indicate when it should be accessed.
Why not just put a .pth-info file that points to the other location, or whatever? Then it's still discoverable, but you don't have to open it unless you intend to add it to sys.path (or an import hook or whatever). If it needs to list a bunch of different directories in it, or whatever, doesn't matter. The point is, using a file in the *same* sys.path directory saves a metric tonne of complexity in sys.path management. Plus, you get the available packages in a single directory read, and you can open whatever files you need in order to pick up additional information in the case of needing a non-default package.
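To make the "single directory read" point concrete, here is a minimal sketch. The `.pth-info` name is only the placeholder used above, and the file format (one location per line, `#` for comments) plus the helper names `available_distributions` and `read_locations` are assumptions made purely for illustration; nothing here is an existing API.

```python
import os

SUFFIX = ".pth-info"  # placeholder name from the discussion, not a real standard


def available_distributions(path_entry):
    """Map each marker's stem to its file path using one directory listing.

    No marker file is opened here; discovery needs only os.listdir().
    """
    found = {}
    for name in os.listdir(path_entry):
        if name.endswith(SUFFIX):
            found[name[:-len(SUFFIX)]] = os.path.join(path_entry, name)
    return found


def read_locations(marker_file):
    """Read the extra locations from one marker file (assumed format:
    one path per line, blank lines and '#' comments ignored)."""
    locations = []
    with open(marker_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                locations.append(line)
    return locations
```

A caller would list a sys.path directory once with `available_distributions()`, and only call `read_locations()` for a distribution it actually intends to activate.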
Under this version of the proposal, PEP 376 would remain unchanged, and would effectively become the "database of installed distributions available on sys.path by default".
That's what it is *now*. Or more precisely, it's a directory of packages that would be importable if a given directory is present on sys.path. It doesn't say anything about sys.path as a whole.
- rather than the contents being processed directly from sys.path, we would add a "<versioned-packages>" entry to sys.path with a path hook that maps to a custom module finder that handles the extra import locations without the same issues as the current approach to modifying sys.path in pkg_resources (which allows shadowing development versions with installed versions by inserting at the front), or the opposite problem that would be created by appending to the end (allowing default versions to shadow explicitly requested versions)
Note that you can do this without needing a separate sys.path entry. You can give alternate versions whatever precedence they *would* have had, by replacing the finder for the relevant directory.
But it would be better if you could be clearer about what precedence you want these other packages to have, relative to the matching sys.path entries. You seem to be speaking in terms of a single site-packages and single versioned-packages directory, but applications and users can have more complicated paths than that. For example, how do PYTHONPATH directories factor into this? User-site packages? Application plugin directories? Will all of these need their own markers?
That's why I think we should focus on *individual* directories (the way PEP 376 does), rather than trying to define an overall precedence system. While there are some challenges with easy_install.pth, the basic precedence concept it uses is sound: an encapsulated package discovered in a given directory takes precedence over unencapsulated packages in the same directory. The place where easy_install falls down is in the implementation: not only does it have to munge sys.path in order to insert those non-defaults, it also installs *everything* in an encapsulated form, making a huge sys.path.
But you can take the same basic idea and apply it to an import hook; I just think that rather than having the extra directory, it's less coupling and complexity if we look at the level of directories rather than sys.path as a whole.
This still lets a system installer put stuff wherever it wants; it just has to also write a .pth-info (or whatever you want to call it) file in site-packages, telling Python where to find it. It also lets plugin-oriented systems use the same approach, and PYTHONPATH, user-site packages, venvs, etc. all work in exactly the same way, without needing to reinvent wheels or share a single (and privileged) hook.
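As a rough illustration of the per-directory idea, the sketch below wraps the ordinary file-based finder for one directory and consults any explicitly activated encapsulated locations first, so they take precedence over unencapsulated packages in that same directory only. It assumes a modern importlib (the `find_spec` protocol); `DirectoryFinder` and `activate` are hypothetical names, and version-conflict handling is left out entirely.

```python
from importlib import machinery


def _file_finder(directory):
    # Build the standard file-based finder for one directory directly, so the
    # sketch does not depend on sys.path_hooks or sys.path_importer_cache.
    return machinery.FileFinder(
        directory,
        (machinery.ExtensionFileLoader, machinery.EXTENSION_SUFFIXES),
        (machinery.SourceFileLoader, machinery.SOURCE_SUFFIXES),
        (machinery.SourcelessFileLoader, machinery.BYTECODE_SUFFIXES),
    )


class DirectoryFinder:
    """Hypothetical path entry finder for a single sys.path directory."""

    def __init__(self, directory):
        self.directory = directory
        self._default = _file_finder(directory)
        self._activated = []  # finders for explicitly requested locations

    def activate(self, location):
        """Give an encapsulated location precedence within this directory."""
        self._activated.append(_file_finder(location))

    def find_spec(self, fullname, target=None):
        # Activated locations are checked first, then the directory itself.
        for finder in self._activated + [self._default]:
            spec = finder.find_spec(fullname, target)
            if spec is not None:
                return spec
        return None

    def invalidate_caches(self):
        for finder in self._activated + [self._default]:
            finder.invalidate_caches()
```

To take effect for real imports, a path hook returning a `DirectoryFinder` for the chosen directories would have to be added to `sys.path_hooks`; every other sys.path entry would keep its normal finder and its normal position.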
The versioned import hook would work just like normal sys.path based import (i.e. maintaining a sequence of path entries, using sys.modules, sys.path_hooks, sys.path_importer_cache), the only difference is that the set of paths it checks would initially be empty. Calls to the new API in distlib would modify the *versioned* path, effectively inserting all those paths at the point in sys.path where the "<versioned-packages>" marker is placed, rather than appending them to the beginning or end. The API that updates the paths handled by the versioned import hook would also take care of detecting and complaining about incompatible version constraints.
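For comparison, the marker-based proposal quoted above could be sketched roughly as follows. This is only one reading of the description, written against a modern importlib; `VersionedFinder`, `add_path`, `versioned_hook`, and `install` are hypothetical names rather than any actual distlib API, and the version-constraint checking is omitted.

```python
import sys
from importlib import machinery

MARKER = "<versioned-packages>"


class VersionedFinder:
    """Hypothetical finder standing in for the marker entry on sys.path."""

    def __init__(self):
        self.paths = []  # the "versioned path"; initially empty

    def add_path(self, location):
        if location not in self.paths:
            self.paths.append(location)

    def find_spec(self, fullname, target=None):
        if not self.paths:
            return None
        # Ordinary path-based lookup (path hooks, importer cache), but over
        # the versioned path instead of sys.path.
        return machinery.PathFinder.find_spec(fullname, self.paths)


versioned_finder = VersionedFinder()


def versioned_hook(path_entry):
    # Claim only the marker entry; decline every real directory.
    if path_entry != MARKER:
        raise ImportError("not the versioned-packages marker")
    return versioned_finder


def install(position):
    """Register the hook, then place the marker at the given sys.path index."""
    if versioned_hook not in sys.path_hooks:
        sys.path_hooks.insert(0, versioned_hook)
    if MARKER not in sys.path:
        sys.path.insert(position, MARKER)
```

Everything added through `add_path()` is then searched at the marker's position, which is exactly why the choice of `position` matters so much in this design.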
How does this interact with an application that uses both system-installed packages and a user-supplied plugin directory? (This also sounds like a recipe for new breakage and debug issues caused by putting the marker in the wrong place.)