[Distutils] Putting eggs first on sys.path
Phillip J. Eby
pje at telecommunity.com
Sat Sep 24 19:04:07 CEST 2005
One of the things that occasionally creates problems for installing
applications with setuptools, and for certain non-root package installs on
Unix, is the fact that eggs are normally added to the *end* of sys.path,
rather than the beginning.
I did this because I needed to maintain various invariants in
pkg_resources, such as namespace packages' __path__ items needing to match
sys.path order. Also, .pth files add entries to the end of sys.path, so
changing this isn't really an option for EasyInstall-supplied default eggs.
The problems with this are:
1. If you install an application but then later set an incompatible version
of one of its requirements as the default version of that project, then the
application will stop working.
2. If you are using a simplistic non-root installation, system-installed
eggs can override or conflict with your personal eggs, and prevent the use
of entry points from them.
So, after some thought, I think I have a way to adjust the existing policy
that will deal with these problems, while still allowing most invariants to
remain intact. It's a little kludgy, so I'm hoping somebody has a better
idea. For EasyInstall-generated wrapper scripts, it's no big deal and is
invisible to the user. For manual use, it seems a little clumsy, though.
The idea is this: when pkg_resources is imported, it will check the
__main__ module for a __requires__ variable. If found, it will do the
equivalent of require()-ing that value, but with sys.path set to an empty
list. It will then restore the old value of sys.path, adding it *after*
the entries added by the require() process. Thus, the very first require()
will insert entries at the start of sys.path, but in a consistent
order. As a result, a script can effectively require package versions that are
not the default. (If you try this currently, you get VersionConflict errors.)
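To make the reordering concrete, here is a rough sketch of the idea in plain Python. It is not pkg_resources code: require_first(), the resolve callable, and the paths are all hypothetical stand-ins, and skipping duplicate entries on restore is my assumption.

```python
def require_first(path, requirements, resolve):
    """Put entries for `requirements` ahead of the existing `path` entries.

    `resolve` is a hypothetical callable that maps one requirement string
    to the list of path entries (eggs) needed to satisfy it.
    """
    saved = list(path)      # remember the old sys.path contents
    path[:] = []            # resolve as though sys.path were empty
    for requirement in requirements:
        for entry in resolve(requirement):
            if entry not in path:
                path.append(entry)
    # Restore the old entries *after* the newly added ones.
    # Skipping duplicates is an assumption, not part of the proposal.
    for entry in saved:
        if entry not in path:
            path.append(entry)
    return path


# Made-up locations, purely for illustration.
eggs = {"Foo==1.0": ["/home/me/eggs/Foo-1.0.egg"]}
search_path = ["/usr/lib/python/site-packages", "/usr/lib/python/Foo-2.0.egg"]
require_first(search_path, ["Foo==1.0"], lambda req: eggs[req])
```

After the call, the non-default Foo 1.0 egg sits ahead of the site-wide entries, so an import of Foo would find it first.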
You might now ask, "Why __requires__? Why not just do this for the first
require() call?" Unfortunately, it's not that simple. pkg_resources needs
to export a global "working_set" object that lists the active eggs and
their entry points. Once pkg_resources has been imported, this list
needs to be in a consistent state. So it's a bit of a chicken-and-egg
problem, in that you need pkg_resources imported to do require(), but if
pkg_resources is imported then you need to have already done any require()s
that override existing sys.path entries. Thus, putting a variable in a
common module (e.g., by putting it in the script before importing
pkg_resources) allows us to pass a parameter to something that hasn't been
imported yet.
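The "pass a parameter through __main__" trick could look roughly like the following. This is only a sketch of the lookup; the helper name requirements_from_main() is made up, and accepting either a single string or a sequence is an assumption on my part.

```python
import sys


def requirements_from_main():
    """Return the requirement strings found in __main__.__requires__.

    Hypothetical helper: a module would call this at import time to pick
    up a value the script set before importing it.
    """
    main = sys.modules.get("__main__")
    value = getattr(main, "__requires__", None)
    if value is None:
        return []           # nothing requested; keep the default behavior
    if isinstance(value, str):
        return [value]      # a single requirement string
    return list(value)      # assume a sequence of requirement strings
```

A script would assign __requires__ at its top level before its first import of pkg_resources, so the value is already in place when the lookup runs.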
For use in the interactive interpreter, things are a bit more complex,
because it's possible that you could import something further down on
sys.path before importing pkg_resources, possibly leading to an implicit
conflict of some kind. You can still set __requires__ and import
pkg_resources, but it looks weird to do that, and it's certainly not the
usual way to do a require(), so it seems potentially confusing as well.
Of course, I suppose I could just make it an "undocumented internal
feature" of pkg_resources and setuptools, reserved for
EasyInstall-generated scripts. This probably makes sense in that the
default versions of packages are the ones that you'll nominally be using in
the interactive interpreter. On the other hand, a simple non-root install
won't work if you want to override site-wide EasyInstalled defaults, unless
you do some fancy footwork in a sitecustomize.py or ~/
Maybe I'm just expecting too much, though. Perhaps it's unrealistic to
expect to add new features, be 100% backward compatible, support
everybody's personal quirky directory layout on Unix, AND still not have
any kludgy bits. :)
Thoughts, anyone?