[Python-Dev] New relative import issue

Fri Sep 22 18:25:01 CEST 2006

At 12:08 AM 9/22/2006 -0700, Josiah Carlson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> wrote:
> >
> > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote:
> > >This can be implemented with a fairly simple package registry, contained
> > >within a (small) SQLite database (which is conveniently shipped in
> > >Python 2.5).  There can be a system-wide database that all users use as
> > >a base, with a user-defined package registry (per user) where the
> > >system-wide packages can be augmented.
> >
> > As far as I can tell, you're ignoring that per-user must *also* be
> > per-version, and per-application.  Each application or runtime environment
> > needs its own private set of information like this.
>
>Having a different database per Python version is not significantly
>different than having a different Python binary for each Python version.

You misunderstood me: I mean that the per-user database must be able to 
store information for *different Python versions*.  Having a single 
per-user database without the ability to include configuration for more 
than one Python version (analogous to the current situation with the 
distutils per-user config file) is problematic.

In truth, a per-user configuration is just a special case of the real need: 
to have per-application environments.  In effect, a per-user environment is 
a fallback for not having an application environment, and the system 
environment is a fallback for not having a user environment.
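
To illustrate the fallback order described above (application, then user, 
then system), here is a minimal sketch using collections.ChainMap.  The 
package names, paths, and the resolve() helper are hypothetical and exist 
only for illustration; this is not an existing Python or setuptools API.

```python
from collections import ChainMap

# Hypothetical package-name -> location tables, one per environment level.
system_env = {
    "docutils": "/usr/lib/python/site-packages/docutils",
    "peak": "/usr/lib/python/site-packages/peak",
}
user_env = {"peak": "/home/alice/lib/peak"}
app_env = {"docutils": "/opt/myapp/lib/docutils"}


def resolve(name, app=None, user=None, system=None):
    """Look a package up in the most specific environment that defines it.

    ChainMap searches its mappings left to right, so the application
    environment shadows the user environment, which shadows the system one.
    A missing level is treated as an empty mapping, i.e. a pure fallback.
    """
    chain = ChainMap(app or {}, user or {}, system or {})
    return chain.get(name)
```

With all three levels present, resolve("docutils", app_env, user_env, 
system_env) picks the application's copy; with no application environment, 
resolve("peak", None, user_env, system_env) falls back to the user's copy.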

>About the only (annoying) nit is that the systemwide database needs to
>be easily accessible to the Python runtime, and is possibly volatile.
>Maybe a symlink in the same path as the actual Python binary on *nix,
>and the file located next to the binary on Windows.
>
>I didn't mention the following because I thought it would be superfluous,
>but it seems that I should have stated it right out.  My thoughts were
>that on startup, Python would first query the 'system' database, caching
>its results in a dictionary, then query the user's listing, updating the
>dictionary as necessary, then unload the databases.  On demand, when
>code runs packages.register(), if both persist and systemwide are False,
>it just updates the dictionary. If either is true, it opens up and
>updates the relevant database.

Using a database as the primary mechanism for managing import locations 
simply isn't workable.  You might as well suggest that each environment 
consist of a single large zipfile containing the packages in question: this 
would actually be *more* practical (and fast!) in terms of Python startup, 
and is no different from having a database with respect to the need for 
installation and uninstallation to modify a central file!

I'm not proposing we do that -- I'm just pointing out why using an actual 
database isn't really workable, considering that it has all of the 
disadvantages of a big zipfile, and none of the advantages (like speed, 
having code already written that supports it, etc.)
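
For readers unfamiliar with the "code already written" for zipfiles: the 
interpreter can import directly from a zip archive placed on sys.path.  The 
sketch below assumes a current Python 3 and uses a throwaway archive and a 
made-up module name (zipdemo_mod); it shows the mechanism only and makes no 
claim about startup speed.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway "environment" archive holding one hypothetical module.
workdir = tempfile.mkdtemp()
env_zip = os.path.join(workdir, "env.zip")
with zipfile.ZipFile(env_zip, "w") as zf:
    zf.writestr("zipdemo_mod.py", "ANSWER = 42\n")

# A zip archive on sys.path is searched much like a directory would be.
sys.path.insert(0, env_zip)
importlib.invalidate_caches()

zipdemo_mod = importlib.import_module("zipdemo_mod")
```

After this runs, zipdemo_mod.__file__ points inside env.zip, which shows 
that the module was loaded from the archive rather than from a directory.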

>This is easily remedied with a proper 'packages' implementation:
>
>     python -Mpackages name path
>
>Note that Python could auto-insert standard library and site-packages
>'packages' on startup (creating the initial dictionary, then the
>systemwide, then the user, ...).

I presume here you're suggesting a way to select a runtime environment from 
the command line, which would certainly be a good idea.

> > These are just a few of the issues that come to mind.  Realistically
> > speaking, .pth files are currently the most effective mechanism we have,
> > and there actually isn't much that can be done to improve upon them.
>
>Except that .pth files are only usable in certain (likely) system paths,
>that the user may not have write access to.  There have previously been
>proposals to add support for .pth files in the path of the run .py file,
>but they don't seem to have gotten any support.

Setuptools works around this by installing an enhancement for the 'site' 
module that extends .pth support to include all PYTHONPATH 
directories.  The enhancement delegates to the original site module after 
recording data about sys.path that the site module destroys at startup.
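
The standard library already exposes the .pth machinery for arbitrary 
directories through site.addsitedir().  The sketch below is not the 
setuptools site enhancement itself; it only shows, with a temporary 
directory and a made-up file name (demo.pth), what treating a directory as 
a .pth-aware location amounts to.

```python
import os
import site
import sys
import tempfile

base = tempfile.mkdtemp()                   # stands in for a PYTHONPATH directory
lib_dir = os.path.join(base, "extra_lib")   # hypothetical package location
os.mkdir(lib_dir)

# A .pth file lists one path per line, relative to the directory holding it.
# Entries that do not exist on disk are skipped.
with open(os.path.join(base, "demo.pth"), "w") as f:
    f.write("extra_lib\n")

# Add `base` to sys.path and process any .pth files found in it.
site.addsitedir(base)
```

Once this has run, both base and lib_dir appear on sys.path, so anything 
placed in extra_lib becomes importable without editing sys.path by hand.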

>I believe that most of the concerns that you have brought up can be
>addressed,

Well, as I said, I've already dealt with them, using .pth files, for the 
use cases I care about.  Ian Bicking and Jim Fulton have also gone farther 
with work on tools to create environments with greater isolation or more 
fixed version linkages than what setuptools does.  (Setuptools-generated 
environments dynamically select requirements based on available versions at 
runtime, while Ian and Jim's tools create environments whose inter-package 
linkages are frozen at installation time.)

>and I think that it could be far nicer to deal with than the
>current sys.path hackery.

I'm not sure of that, since I don't yet know how your approach would deal 
with namespace packages, which are distributed in pieces and assembled 
later.  For example, many PEAK and Zope distributions live in the peak.* 
and zope.* package namespaces, but are installed separately, and glued 
together via __path__ changes (see the pkgutil docs).

Thus, if you are talking about a packagename->importer mapping, it has to 
take into consideration the possibility of multiple import locations for 
the same package.
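
As a concrete illustration of one package name mapping to several import 
locations, here is a small sketch of the pkgutil.extend_path() idiom 
mentioned above.  The package name nsdemo, its modules, and the 
make_dist() helper are invented for the example; the two temporary 
directories stand in for two separately installed distributions.

```python
import importlib
import os
import sys
import tempfile

# Each piece ships the same __init__.py, which widens __path__ to cover
# every directory on sys.path that contains a package of the same name.
INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_dist(root, pkg, module, body):
    """Create root/pkg/__init__.py and root/pkg/<module>.py."""
    pkg_dir = os.path.join(root, pkg)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(INIT)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)


dist_a = tempfile.mkdtemp()
dist_b = tempfile.mkdtemp()
make_dist(dist_a, "nsdemo", "alpha", "NAME = 'alpha'\n")
make_dist(dist_b, "nsdemo", "beta", "NAME = 'beta'\n")

sys.path[:0] = [dist_a, dist_b]
importlib.invalidate_caches()

import nsdemo.alpha
import nsdemo.beta
```

After the imports, nsdemo.__path__ lists both directories, which is exactly 
the situation a packagename->importer mapping would have to represent.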

>  The system database location is a bit annoying,
>but I lack the *nix experience to say where such a database could or
>should be located.

This issue is a triviality compared to the more fundamental flaws (or at 
any rate, holes) in what you're currently proposing.  I wouldn't worry 
about it at all right now.

That having been said, I find the discussion stimulating, because I do plan 
to revisit the environments issue in setuptools 0.7, so who knows what 
ideas may come up?