[Distutils] Working environment

Jim Fulton jim at zope.com
Sat Mar 11 16:47:07 CET 2006


Ian Bicking wrote:
> So, coming back to the idea of a working environment, an isolated and 
> more-or-less self-contained environment for holding installed packages. 
>   Sorry if this is a little scattered.  I'm just summarizing my thoughts 
> and the open issues I see, in no particular order (because I'm not sure 
> what order to approach these things in).
> 
> I'm assuming such an environment will be encapsulated in a single 
> directory, looking something like:
> 
> env/
>    bin/
>    lib/python2.4/
>    src/
>    conf/
> 
> The conf/ directory doesn't really relate to much of this specifically, 
> but in many situations it would be useful.  Depending on the situation, 
> other subdirectories may exist.

OK, I'll pretend that it doesn't exist. ;) Actually, I'll have a bit to say
about configuration below.

> Each of the scripts in bin/ should know what their working environment 
> is.  This is slightly tricky, depending on what that means.  If it is a 
> totally isolated environment -- no site-packages on sys.path -- then I 
> feel like the script wrappers have to be shell scripts, to invoke Python 
> with -S (which is hard to do portably on the #! line).

I'll note that this isn't important to me at all.  I'm not opposed
to allowing it, but I don't need it.

I never use the python that comes with my system. I always
build Python from source and I can therefore keep that install
clean, complete, and relatively predictable.  (I am often
annoyed by Python's *implicit* decisions about which extension
modules to install. Hopefully, someday, the presence of a
packaging system will enable saner approaches to optional
extensions.) In fact, I usually have multiple versions of
Python available.  I then have many working environments
that use these custom Pythons. (The custom Pythons are not part
of any working environment.)

Similarly, in production environments, we install custom builds
of Python that our applications use.

...

> lib/python2.4/ is for packages. 

Minor note: this needs to be flexible.  I'd be more inclined to go
with something shallower and simpler, like just "lib".


> I'm almost inclined to say that
> --single-version-externally-managed makes sense on some level, with a 
> record kept in some standard place (lib/python2.4/install-record.txt?) 
> -- but I'm basically indifferent.  I at least don't see a need for 
> multiple simultaneous versions in this setup, and multiple versions do 
> lead to confusion.  Version metadata is still nice, of course.

Increasingly, projects I work on involve multiple Python applications
with each application requiring different sets of packages and, sometimes,
different package versions.  Basically, I want a single version for each
application, but multiple versions in the environment.  What appeals to me
is a compromise between the single-version and multi-version install.

With a single-version install, you end up with an easy-install.pth
that names the packages (and package versions) being used.  This has the
advantage of being deterministic.  You always get the packages listed
there.  This affects all applications running in the environment.
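(For concreteness, a .pth file is just a list of directories that
site.py adds to sys.path at startup.  A stripped-down easy-install.pth
might look like the following, with made-up package names; real
easy-install.pth files also contain a couple of bookkeeping "import"
lines at the top and bottom.)

    ./SomePackage-1.0-py2.4.egg
    ./OtherPackage-2.3-py2.4.egg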

In a multi-version install, multiple versions are stored in the
environment.  Different applications can use different versions.
When an application is run, a root package is specified in a script
(the package containing the entry point) and pkg_resources tries to find
a suitable combination of packages that satisfy the closure of the root
package's dependencies.  This process is somewhat non-deterministic.
Installation of a new package could change the set of packages used
beyond that individual package, causing implicit and unintended "upgrades"
that you referred to in your talk at PyCon.
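(For illustration, the scripts easy_install generates look roughly
like this; "Foo" is a made-up project name.  The load_entry_point call
resolves the requirement, and the whole closure of its dependencies,
at run time, which is where the non-determinism comes in.)

    #!/usr/bin/env python
    # EASY-INSTALL-ENTRY-SCRIPT: 'Foo==1.0','console_scripts','foo'
    __requires__ = 'Foo==1.0'
    import sys
    from pkg_resources import load_entry_point

    # Resolution of Foo and its dependency closure happens here,
    # each time the script runs.
    sys.exit(load_entry_point('Foo==1.0', 'console_scripts', 'foo')())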

What I want (or think I want :) is per-application (per-script) .pth
files.  When I install a script, I want the packages needed by the script
to be determined at that time and written to a script-specific .pth file.
So, for example, if I install a script "foo" (foo.exe, whatever), I also
get foo.pth and have the generated foo script prepend the contents of that
file to sys.path on startup.  I don't mind if that file is required
to be in the same directory as the script.  I also wouldn't mind if the
path entries were simply embedded in the script.  If I
want to upgrade the packages being used, I simply update the scripts.
There are various ways that this could be made easy.  Even better,
the upgrade process could optionally make backups of the .pth files
or scripts, allowing easy rollbacks of upgrades.
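(A minimal sketch of what such a generated wrapper might look like;
the foo.main entry point is made up, and easy_install does nothing
like this today.)

    #!/usr/bin/env python
    # Hypothetical wrapper: prepend the paths that were pinned in
    # foo.pth at install time, then call the entry point.
    import os, sys

    pth = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                       'foo.pth')
    # Skip blank lines and any bookkeeping "import" lines.
    sys.path[0:0] = [line.strip() for line in open(pth)
                     if line.strip() and not line.startswith('import')]

    from foo.main import main  # made-up entry point
    sys.exit(main())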

I think that this approach combines the benefits of single- and multi-version
install.  Different applications in an environment can require different
package versions.  The choice of packages for an application is determined at
discrete explicit install times, rather than at run time.

...


> Installation as a whole is an open issue.  Putting in env/setup.cfg with 
> the setting specific to that working environment works to a degree -- 
> easy_install will pick it up if invoked from there.  But that doesn't 
> work with setup.py develop, or setup.py install, or some other 
> scenarios. 

I don't follow this.  It seems to work for us, at least for
setup.py develop.  The main lamosity is the dependence on the current
working directory.

Personally, I'd like (at least optionally) the generated easy_install
script to include information about the environment that was present
when it was installed.  That is, I'd be happy to see setup.cfg consulted
when easy_install is installed and then ignored thereafter.  If you
wanted to change the environment definition, you'd simply update setup.cfg
and reinstall easy_install.

Or, better yet, when installing easy_install, have the generated script
remember the full path of the setup.cfg file used and reread that file
when run, regardless of where easy_install is run from.
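(A sketch, under the assumption that the generated script simply
changes to the recorded environment directory before handing off;
setuptools' easy_install main() is real, but recording ENV at install
time is the hypothetical part.)

    #!/usr/bin/env python
    # Hypothetical generated easy_install wrapper: the environment
    # directory was recorded at install time, so the right setup.cfg
    # is used no matter where this script is invoked from.
    import os, sys
    from setuptools.command.easy_install import main

    ENV = '/home/jim/env'  # recorded when easy_install was installed

    os.chdir(ENV)  # distutils reads setup.cfg from the current directory
    main(sys.argv[1:])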

This brings me to the topic of configuration.  Today, I write wrapper
scripts "by hand",  I may have some application like Zope, or ZEO
or our test runner that is implemented by a an entry point in a module.
Then there's a wrapper script that imports the module and calls the entry point.
The wrapper script is written (manually or with some custom installation
script) to include the path to be used and configuration data,
which may be the location of a configuration file.  I really like
the fact that easy_install will generate wrapper scripts for me, but
I really need more control over how these scripts are generated to
include *both* path and configuration information.
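(The sort of wrapper I end up writing today looks something like
this, with all names and locations made up for illustration.)

    #!/usr/local/bin/python2.4
    # Hand-written wrapper: both the path and the location of the
    # configuration file are baked into the script.
    import sys

    sys.path[0:0] = [
        '/home/jim/env/lib/SomePackage-1.0',
        '/home/jim/env/lib/OtherPackage-2.3',
        ]

    import someapp.run  # made-up module containing the entry point
    someapp.run.main('/home/jim/env/conf/app.conf')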

...

> Another option is a completely new python interpreter bound to the 
> environment.  Basically the virtual-python.py option 
> (http://peak.telecommunity.com/DevCenter/EasyInstall#creating-a-virtual-python). 
>   In this model using env/bin/python indicate the proper environment, 
> and you'd have local installs of *everything* including easy_install. 
> This fixes so many problems without crazy hacks that it strongly appeals 
> to me, especially if we can make it somewhat lighter.  I get this 
> /usr/lib/python2.4.zip on my path, that doesn't usually exist (does it 
> ever get created by default); if we could create that somehow on demand 
> and use such a big-bundle-zip, that seems lighter and faster and nicer. 
>   If we just put .pyc files in it, and those .pyc files refer back to 
> the actual module source (in /usr/lib/python2.4/), then tracebacks 
> should also still work, right?  No actual symlinks either, so it should 
> work on Windows.  I'm not entirely sure where I'm going with this, though.

I think that something much simpler can be made to work.  I think a little
more control over how scripts get generated would go a long way.  (There
is still the question of the interactive interpreter...)

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

