[Python-Dev] New relative import issue

Josiah Carlson jcarlson at uci.edu
Sat Sep 23 02:03:45 CEST 2006


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> At 12:42 PM 9/22/2006 -0700, Josiah Carlson wrote:
[snip]
> Measure it.  Be sure to include the time to import SQLite vs. the time to 
> import the zipimport module.
[snip]
> Again, seriously, compare this against a zipfile.  You'll find that there's 
> absolutely no comparison between reading this and reading a zipfile central 
> directory -- which also results in an in-memory cache that can then be used 
> to seek() directly to the module.

They are not directly comparable.  The registry of packages can do more
than zipimport in terms of package naming and hierarchy, but it's not an
importer; it's a conceptual replacement of sys.path.  I have already
stated that the actual imports from this registry won't be any faster,
as it will still need to read modules/packages from disk *after* it has
decided on a list of paths to check for the package/module.  Further,
whether we use SQLite or any one of a number of other persistence
mechanisms should depend on a few things (speed being one of them,
though maybe not the *only* consideration).  It could perhaps even be a
zip file whose 'files' are named with the desired package hierarchy, and
whose contents are something like:

    import imp
    # globals is a function; call it to get the module namespace
    globals().update(imp.load_XXX(...).__dict__)
    del imp


> >Actually, I'm offering a way of *registering* a package with the
> >repository from the command line.  I'm of the opinion that setting the
> >environment via command line for the subsequent Python runs is a bad
> >idea, but then again, I have been using wxPython's wxversion method for
> >a while to select which wxPython installation I want to use, and find
> >things like:
> >
> >     import wxversion
> >     wxversion.ensureMinimal('2.6-unicode', optionsRequired=True)
> >
> >To be exactly the amount of control I want, where I want it.
> 
> Well, that's already easy to do for arbitrary packages and arbitrary 
> versions with setuptools.  Eggs installed in "multi-version" mode are added 
> to sys.path at runtime if/when they are requested.

Why do we have to use eggs or setuptools to get a feature that
*arguably* should have existed a decade ago in core Python?

The core functionality I'm talking about is:

    packages.register(name, path, env=None, system=False, persist=False)
    #system==True implies persist==True

    packages.copy_env(fr_env, to_env)
    packages.use_env(env)

    packages.check(name, version=None)

    packages.use(name, version)

With those 5 functions and a few tricks, we can replace all user-level .pth
and PYTHONPATH use, and the sys.path manipulation done in other 3rd party
packages (setuptools, etc.) is easily handled and supported.
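
To make the shape of those five calls concrete, here is a minimal
in-memory sketch.  It is only an illustration under several assumptions:
the class name PackageRegistry, the extra `version` keyword on
register(), the `selected` attribute, and exact-match version comparison
are all hypothetical, and persistence is not implemented at all.

```python
class PackageRegistry:
    """In-memory sketch of the proposed registry (no persistence)."""

    def __init__(self):
        # env name (None = default environment) ->
        #     {package name: [(version, path, persist), ...]}
        self._envs = {None: {}}
        self._active = None
        # package name -> path chosen by use(); hypothetical bookkeeping
        self.selected = {}

    def register(self, name, path, env=None, system=False, persist=False,
                 version=None):
        # `version` is an assumed extra parameter, not in the proposal.
        # system=True implies persist=True, as in the original comment.
        persist = persist or system
        entries = self._envs.setdefault(env, {}).setdefault(name, [])
        entries.append((version, path, persist))

    def copy_env(self, fr_env, to_env):
        # Copy the entry lists so later changes do not leak between envs.
        source = self._envs.get(fr_env, {})
        self._envs[to_env] = {n: list(e) for n, e in source.items()}

    def use_env(self, env):
        if env not in self._envs:
            raise KeyError(f"unknown environment: {env!r}")
        self._active = env

    def check(self, name, version=None):
        # True if the package (optionally at an exact version) is
        # registered in the active environment.
        entries = self._envs[self._active].get(name, [])
        return any(version is None or v == version for v, _, _ in entries)

    def use(self, name, version):
        # Select the path registered for exactly this version.
        for v, path, _ in self._envs[self._active].get(name, []):
            if v == version:
                self.selected[name] = path
                return path
        raise LookupError(f"{name} {version} is not registered")
```

A real version would also need to turn the selected paths into something
the import machinery consults, which this sketch leaves out.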


> >With a package registry (perhaps as I have been describing, perhaps
> >something different), all of the disparate ways of choosing a version of
> >a library during import can be removed in favor of a single mechanism.
> >This single mechanism could handle things like the wxPython
> >'ensureMinimal', perhaps even 'ensure exact' or 'use latest'.
> 
> This discussion is mostly making me realize that sys.path is exactly the 
> right thing to have, and that the only thing that actually need fixing is 
> universal .pth support, and maybe some utility functions for better 
> sys.path manipulation within .pth files.  I suggest that there is no way an 
> arbitrary "registry" implementation is going to be faster than reading 
> lines from a text file.
> 
> > > Setuptools works around this by installing an enhancement for the 'site'
> > > module that extends .pth support to include all PYTHONPATH
> > > directories.  The enhancement delegates to the original site module after
> > > recording data about sys.path that the site module destroys at startup.
> >
> >But wasn't there a recent discussion describing how keeping persistant
> >environment variables is a PITA both during install and runtime?
> 
> Yes, exactly.

You have confused me, because not only have you just said "we use
PYTHONPATH as a solution", but you have just acknowledged that using
PYTHONPATH is not reasonable as a solution.  You have also just said
that we need to add features to .pth support so that it is more usable.

So, sys.path "is exactly the right thing to have", but we need to add
more features to make it better.

Ok, here's a sample .pth file if we are willing to make it better (in my
opinion):

    zope,/path/to/zope,3.2.1,netserver
    zope.subpackage,/path/to/subpackage,.1.1,netserver

That's a CSV file with rows defining packages, and columns in order:
package name, path to package, version, and a semicolon-separated list
of environments that this package is available in (a leading semicolon,
or a double semicolon says that it is available when no environment is
specified).

With a base sys.path, a dictionary of environment -> packages created
from .pth files, and a simple function, one can build an appropriate
sys.path on demand in response to some choose_environment() call.

This is, effectively, a variant of what I was suggesting, only with
a different persistence representation.
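
As a rough sketch of that idea, the following parses rows in the CSV
layout above and builds a path list for a requested environment.  The
function name load_registry, the choice to append (rather than prepend)
registered paths, and the treatment of a completely empty environment
field as "no environment" are assumptions of this example.

```python
import csv
from collections import defaultdict

SAMPLE = """\
zope,/path/to/zope,3.2.1,netserver
zope.subpackage,/path/to/subpackage,.1.1,netserver
"""


def load_registry(lines):
    """Map environment name -> list of (package, path, version) rows."""
    envs = defaultdict(list)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        name, path, version, env_field = row
        for env in env_field.split(";"):
            # An empty entry (leading ';' or ';;') means the package is
            # available when no environment is specified; stored as None.
            envs[env or None].append((name, path, version))
    return envs


def choose_environment(base_path, envs, env=None):
    """Return a copy of base_path extended with the paths for env."""
    result = list(base_path)
    for _name, path, _version in envs.get(env, []):
        if path not in result:
            result.append(path)
    return result
```

Calling choose_environment(sys.path, load_registry(SAMPLE.splitlines()),
'netserver') would then yield the base path followed by the two zope
directories.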


> >Extending .pth files to PYTHONPATH seems to me like a hack meant to work
> >around the fact that Python doesn't have a package registry.  And really,
> >all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed
> >into a *single* mechanism.
> 
> Sure -- I suggest that the single mechanism is none other than 
> *sys.path*.  The .pth files, PYTHONPATH, and a new command-line option 
> merely being ways to set it.

I guess we disagree on what is meant by "single" in this context.


> All of the discussion that's taken place here has sufficed at this point to 
> convince me that sys.path isn't broken at all, and doesn't need 
> fixing.  Some tweaks to 'site' and maybe a new command-line option will 
> suffice to clean everything up quite nicely.
> 
> I say this because all of the version and dependency management things that 
> people are asking about can already be achieved by setuptools, so clearly 
> the underlying machinery is fine.  It wasn't until this message of yours 
> that I realized that you are trying to solve a bunch of problems that are 
> quite solvable within the existing machinery.  I was mainly interested in 
> cleaning up the final awkwardness that's effectively caused by lack of .pth 
> support for the startup script directory.

Indeed, everything is solvable within the existing machinery.  But it's
not a question of whether it is solvable, it's a question of whether we
can make things better.
When I have had the occasion to use .pth files, I've been somewhat
disappointed.  Given even the few functions I've defined for an API, or
the .pth variant I described, I know I wouldn't be disappointed in
trying to set up independent package version installations, application
environments, etc.  They all come fairly naturally.


> > > I'm not sure of that, since I don't yet know how your approach would deal
> > > with namespace packages, which are distributed in pieces and assembled
> > > later.  For example, many PEAK and Zope distributions live in the peak.*
> > > and zope.* package namespaces, but are installed separately, and glued
> > > together via __path__ changes (see the pkgutil docs).
> >
> >     packages.register('zope', '/path/to/zope')
> >
> >And if the installation path is different:
> >
> >     packages.register('zope.subpackage', '/different/path/to/subpackage/')
> >
> >Otherwise the importer will know where the zope (or peak) package exists
> >in the filesystem (or otherwise), and search it whenever 'from zope
> >import ...' is performed.
> 
> If you're talking about replacing the current import machinery, you would 
> have to leave this to Py3K, otherwise all you've done is add a *new* import 
> hook, i.e. a "sys.package_loaders" dictionary or some such.

It could coexist happily next to sys.path-based machinery, and it is
likely easier for it to do so (replacing the sys.path bits in the core
language is more work than I would be willing to do).


> If you wanted something like that now, of course, you could slap an 
> importer into sys.meta_path that then did a lookup in 
> sys.package_loaders.  Getting this mechanism bootstrapped, however, is left 
> as an exercise for the reader.  ;)

I just about cry every time I think about adding an import hook.  If
others think that this functionality has legs to stand on, I may just
have to get help from experienced users.
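
For reference, the kind of hook being discussed can be sketched as
follows.  This uses the importlib API of current Python 3, not what was
available in 2.x, and the names package_loaders, RegistryFinder and
install are hypothetical.  It assumes the registry maps a module name
directly to a source file.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical registry: fully qualified module name -> path to its
# source file.
package_loaders = {}


class RegistryFinder(importlib.abc.MetaPathFinder):
    """Resolve registered names before the sys.path-based finders run."""

    def find_spec(self, fullname, path=None, target=None):
        location = package_loaders.get(fullname)
        if location is None:
            return None  # not registered; defer to the remaining finders
        return importlib.util.spec_from_file_location(fullname, location)


def install():
    """Put one RegistryFinder at the front of sys.meta_path."""
    if not any(isinstance(f, RegistryFinder) for f in sys.meta_path):
        sys.meta_path.insert(0, RegistryFinder())
```

Because the finder returns None for unregistered names, the normal
sys.path lookup keeps working next to it.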


> Note, by the way, that it might be quite possible to do away with 
> everything but sys.meta_path in Py3K, prepopulated with such an importer 
> (along with ones to support builtin and frozen modules).  You could then 
> import a backward-compatibility module that would add support for sys.path 
> and for package __path__ attributes, by adding a new entry to 
> sys.meta_path.  But this is strictly a pipe dream where Python 2.x is 
> concerned.

Indeed, actually removing sys.path from 2.x is a non-starter.  But
replacing user-level modifications of sys.path with calls to a registry? 
That seems possible, if not desirable, from a "let us not monkey patch
the Python runtime" perspective.


 - Josiah


