[Distutils] setuptools in a cross-compilation packaging environment

Phillip J. Eby pje at telecommunity.com
Wed Oct 5 18:37:44 CEST 2005


At 10:27 AM 10/5/2005 +0200, M.-A. Lemburg wrote:
>[Some comments on your strategy...]
>
>Phillip J. Eby wrote:
> >>The new setuptools is all nice and easy for end user, but as a package
> >>maintainer, I'd like to have the option of building a binary package
> >>without all the dependencies.
> >
> > In the long run, this should be done by packaging the result of bdist_egg,
> > and by default doing bdist_rpm will do this now.  In the short term,
> > unless you're switching to an all-egg distribution, you'll probably want
> > to use legacy/unmanaged mode.
>
>I think you are missing his point here:
>
>As package maintainer you *have* to be able to build a distribution
>package without all the dependency checks being applied - how else
>would you be able to bootstrap the package in case you have circular
>dependencies ?

In legacy/unmanaged mode, setuptools' "install" command behaves the way the 
standard distutils "install" does today, without creating an egg or 
searching for dependencies.


>I don't think that eggs are the solution to everything, so
>you should at least extend the dependency checking code to
>have it detect already installed packages (by trying import
>and looking at __version__ strings) or having an option
>to tell the system: "this dependency is satisfied, trust me".

There are plans to have a feature like that, and in fact setuptools already 
has code to hunt down __version__ strings and the like, without even 
needing to import the packages.  It isn't integrated with the rest of the 
system yet, though.
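
To illustrate the idea of finding a version string without importing 
anything, here is a minimal sketch.  It is not the code in setuptools; 
find_version() is a hypothetical helper, and it assumes the version is a 
plain string assigned at module top level.

```python
import ast


def find_version(source, name="__version__"):
    """Return the string assigned to `name` at module top level, or None.

    The source is only parsed, never executed, so imports inside it
    are not triggered.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == name:
                value = node.value
                # Only accept a literal string; computed versions are skipped.
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    return value.value
    return None
```

A version computed at import time would be missed by this approach, which 
is one reason such detection can only ever be a heuristic.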

One reason for that is that early feedback suggests that package developers 
and users would rather have the assurance of having the exact version 
required by something, as long as the installation process doesn't impose 
any additional burden on them.  Local detection hacks have been primarily 
requested by packagers, who (quite reasonably) do not want to have to 
repackage everything as eggs.

There is a simple trick that packagers can use to make their legacy 
packages work as eggs: build .egg-info directories for them in the sys.path 
directory where the package resides, so that the necessary metadata is 
present.  This does not require the use of .pth files, but it does slow 
down the process of package discovery for things that do use pkg_resources 
to locate their dependencies.  It also still requires them to repackage 
existing packages, but doesn't require changing the layout.  Also, such 
packages will currently cause easy_install to warn about conflicting 
packages if you try to install a different version of the same package, but 
this will be alleviated soon, as I'm working on a better conflict 
management mechanism that will allow egg directories on PYTHONPATH to 
override things in the standard directories.  (Currently, eggs are only 
ever added to the end of sys.path, so if the local packaging system puts 
.egg-info directories in site-packages, there would be no way to locally 
override that for an individual user's packages.  I hope to resolve that 
issue in a new setuptools release within the next few weeks.)
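
As a rough sketch of that trick, a packaging script might drop a minimal 
.egg-info directory next to an already installed package.  The helper name 
write_egg_info(), the "Name.egg-info" directory naming, and the three 
PKG-INFO fields shown are assumptions for illustration; real metadata would 
normally carry more.

```python
import os


def write_egg_info(site_dir, name, version):
    """Create <site_dir>/<name>.egg-info/PKG-INFO and return its path.

    Hypothetical helper: writes only a bare-minimum PKG-INFO file.
    """
    egg_info = os.path.join(site_dir, name + ".egg-info")
    os.makedirs(egg_info, exist_ok=True)
    pkg_info = os.path.join(egg_info, "PKG-INFO")
    with open(pkg_info, "w") as f:
        f.write("Metadata-Version: 1.0\n")
        f.write("Name: %s\n" % name)
        f.write("Version: %s\n" % version)
    return pkg_info
```

The package layout itself is untouched; only the metadata directory is 
added beside it.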

As for eggs being the "solution to everything", I would like to point out 
that what precisely constitutes an egg is an extensible concept.  See e.g.:

     http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html

which shows that there are actually three formats that are "eggs" at the 
moment:

  1. .egg zipfiles
  2. .egg directories
  3. .egg-info marker directories

The key requirements for a format to be a pluggable distribution or "egg" are:

  * Adding it to sys.path must make it importable
  * It must be possible to discover its PyPI project name (and preferably 
version and platform) from the filename
  * It must allow arbitrary data files and directories to be included 
within packages, and allow arbitrary metadata files and directories to be 
included for the project as a whole
  * It must include the standard PKG-INFO metadata

These are the absolute minimums, but there are additional specific metadata 
files and directories that easy_install requires in order to detect 
possible conflicts, create scripts, etc.

Anyway, the point is that what constitutes an "egg" is flexible, but the 
"add to sys.path and make it importable" requirement certainly limits what 
formats are practically meaningful.  Nonetheless, further extensibility is 
certainly possible if there's need.
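
The "discover the project name from the filename" requirement can be 
sketched as below.  The "Name-Version-pyX.Y-platform.egg" layout used here 
is an assumption for the example, and parse_egg_filename() is a 
hypothetical helper, not the pkg_resources implementation.

```python
import os
import re

# Assumed layout: Name[-Version[-pyX.Y[-platform]]] before the extension.
EGG_NAME = re.compile(
    r"(?P<name>[^-]+)"
    r"(?:-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+)"
    r"(?:-(?P<platform>.+))?)?)?$"
)


def parse_egg_filename(filename):
    """Split an egg filename into name/version/pyver/platform, or None."""
    base, ext = os.path.splitext(filename)
    if ext not in (".egg", ".egg-info"):
        return None
    match = EGG_NAME.match(base)
    return match.groupdict() if match else None
```

Parts that are absent from the filename come back as None, which matches 
the idea that only the project name is strictly required.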


>Please make sure that your eggs catch all possible
>Python binary build dimensions:
>
>* Python version
>* Python Unicode variant (UCS2, UCS4)
>* OS name
>* OS version
>* Platform architecture (e.g. 32-bit vs. 64-bit)

As far as I know, all of this except the Unicode variant is captured in 
distutils' get_platform().  And if it's not, it should be, since it affects 
any other kind of bdist mechanism.
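
For reference, the dimensions under discussion can be inspected with a few 
standard library calls.  This sketch uses sysconfig.get_platform(), the 
current standard library counterpart of the distutils function; the 
build_dimensions() helper and its dictionary keys are made up for the 
example.

```python
import sys
import sysconfig


def build_dimensions():
    """Collect the build dimensions visible from the running interpreter."""
    return {
        "python_version": "%d.%d" % sys.version_info[:2],
        # The platform string typically encodes OS name and architecture.
        "platform": sysconfig.get_platform(),
        # Narrow (UCS2) builds report 0xFFFF; Python 3.3+ always reports
        # 0x10FFFF, so this only distinguishes builds on older interpreters.
        "unicode": "ucs4" if sys.maxunicode > 0xFFFF else "ucs2",
    }
```

The Unicode entry is the one dimension that has to be computed separately, 
since the platform string does not carry it.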


>and please also make this scheme extendable, so that
>it is easy to add more dimensions should they become
>necessary in the future.

It's extensible by changing the get_platform() and compatible_platforms() 
functions in pkg_resources.
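
A compatibility check of this kind might look roughly like the following 
simplified sketch.  The names platforms_match() and usable_eggs() are 
hypothetical, and treating None as "runs anywhere" is an assumption of the 
example; a new dimension would be added by making the comparison smarter.

```python
def platforms_match(provided, required):
    """Return True if an egg built for `provided` can run on `required`."""
    # Assumption: None means "no platform-specific code", usable anywhere.
    if provided is None or required is None:
        return True
    # Simplest possible rule: the platform strings must be identical.
    return provided == required


def usable_eggs(candidates, required):
    """Keep the filenames whose platform is compatible with `required`.

    `candidates` is a list of (filename, platform) pairs.
    """
    return [name for name, plat in candidates if platforms_match(plat, required)]
```

With exact string matching, any change to the platform string scheme 
changes which eggs are considered compatible, so both functions have to 
evolve together.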

By the way, I've issued requests on this list at least twice over the past 
year for people to provide input about how the platform strings should 
work; I got no response to either call, so I gave up.  Later, when an OS X 
upgrade created a compatibility problem, somebody finally chipped in with 
info about what good OS X platform strings might be.  I suspect that 
basically we'll get good platform strings once there are enough people 
encountering problems with the current ones to suggest a better scheme.  :(

If you have suggestions, please make them known, and let's get them into 
the distutils in general, not just our own offshoots thereof.


>To make things easier for the user, the install system
>should be capable of detecting all these dimensions
>and use appropriate defaults when looking for an egg.

That's already done for the dimensions that get_platform() handles, and the 
behavior can be adjusted by modifying get_platform() and 
compatible_platforms() in pkg_resources.


>Please reconsider your use of .pth files - these cause the
>Python interpreter startup time to increase significantly.
>If you just have one of those files pointing to your
>managed installation path used for eggs, that should
>be fine (although adding that path to PYTHONPATH still
>beats having a .pth to parse every time the interpreter
>fires up).

EasyInstall uses at most one .pth file, to allow packages to be on the path 
at runtime without needing an explicit 'require()'.  However, a vendor 
creating packages probably doesn't want to have to edit that .pth file, so 
a trivial alternative is to install a .pth for each package.  The tradeoff 
is startup time versus packager convenience in that case.  Having a tool to 
edit a single .pth file would be good, but not all packaging systems have 
the ability to run a program at install or uninstall time.  If they do, 
then editing easy-install.pth to add or remove eggs is a better option.
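
A tool for that purpose could be quite small.  The sketch below is a 
hypothetical update_pth() helper that a package's install or uninstall hook 
might call; it assumes one path entry per line and leaves every other line 
in the file untouched.

```python
import os


def update_pth(pth_path, entry, remove=False):
    """Add or remove one entry in a .pth file; return the resulting lines."""
    lines = []
    if os.path.exists(pth_path):
        with open(pth_path) as f:
            lines = [line.rstrip("\n") for line in f]
    # Drop any existing copy first so that adding twice has no effect.
    lines = [line for line in lines if line != entry]
    if not remove:
        lines.append(entry)
    with open(pth_path, "w") as f:
        f.write("".join(line + "\n" for line in lines))
    return lines
```

An install hook would call update_pth(path, "./Foo-1.0-py2.4.egg") and the 
matching uninstall hook would pass remove=True; the egg filename here is 
only an example.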

Eggs can of course be installed in multi-version mode, in which case no 
.pth is necessary, but then an explicit require() or a dependency 
declaration in a setup script is necessary in order to use the package.


>If you however install a .pth file for every
>egg, you'll soon end up with an unreasonable startup time
>which slows down your whole Python installation - including
>applications that don't use setuptools or any of the eggs.

A single .pth file is certainly an option, and it's what easy_install 
itself uses.


