[Distutils] Egg support for system packages (including bdist_wininst)

Phillip J. Eby pje at telecommunity.com
Fri Dec 23 01:11:04 CET 2005


At 11:22 PM 12/22/2005 +0100, M.-A. Lemburg wrote:
>Phillip J. Eby wrote:
> > At 12:48 PM 12/22/2005 +0100, M.-A. Lemburg wrote:
> >> I'd just wish that this would be the default and the .egg ZIP
> >> file installation approach be made an option.
> >>
> >> setuptools would then finally be compatible to the rest of the
> >> distutils world again and avoid all the added overhead and
> >> problems of ZIP file imports.
> >>
> >> Perhaps you could have two commands, e.g. the default install
> >> would create the normal package directory (with added .egg-info
> >> dir) and a new install_eggfile to install the .egg ZIP file
> >> instead.
> >
> > The only way this could happen is to add a manifest of installed files,
> > in order to allow uninstallation and upgrades to be made safe in the
> > absence of a packaging system.
>
>Sure, why not put this manifest into the .egg-info dir ?!

Because then people will need a special tool to uninstall it.  I'm kind of 
confused because in one email you argue against having easy_install as a 
package manager, and then you seem to be arguing for making more package 
management tools.  Which would you prefer?  :)

Here are the use cases as I see them:

Just unzip an egg and use it:  Sure, just unzip it as a directory with the 
.egg name and add it to sys.path or use require().

Flat layout, short sys.path:  Use a system package manager like bdist_rpm 
or bdist_wininst, or some other bdist wrapper.

No package manager, or other use cases: Use easy_install as your package 
manager.

Looks to me like everything is covered here.


>The advantage of the .egg-info dir is that it's easy
>to find using package introspection.

I don't understand you.  Package != distribution.  .egg-info directories go 
with a project's distribution, and have nothing to do with packages in the 
import sense.  A distribution might contain only a module, for example.  So 
there is no connection between a package and its .egg-info in the sense you 
seem to be implying.
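
To make the package/distribution distinction concrete, here is a minimal sketch. It assumes a modern Python whose standard library includes `importlib.metadata` (which postdates this thread), and the project name "Solo" and its files are hypothetical. It builds a distribution that ships a single module and no package, then finds the distribution by scanning a path for metadata, without importing anything.

```python
import importlib.metadata
import pathlib
import tempfile

# Hypothetical project "Solo": one top-level module, no package directory.
root = pathlib.Path(tempfile.mkdtemp())
(root / "solo.py").write_text("VALUE = 42\n")

# Its metadata sits beside the module and is named after the distribution,
# not after anything importable.
egg_info = root / "Solo-1.0.egg-info"
egg_info.mkdir()
(egg_info / "PKG-INFO").write_text(
    "Metadata-Version: 1.0\nName: Solo\nVersion: 1.0\n"
)

# Discover distributions on this path purely from their metadata.
found = {
    dist.metadata["Name"]: dist.version
    for dist in importlib.metadata.distributions(path=[str(root)])
}
print(found)

# Nothing under root is a package in the import sense.
packages = [p.name for p in root.iterdir() if (p / "__init__.py").is_file()]
print(packages)
```

The lookup goes from a path to distribution metadata; starting from an imported package would not lead to "Solo" at all, because there is no package to start from.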


> > The .egg file/directory technique
> > currently makes uninstallation as simple as "rm -rf
> > package-version.egg", and multi-version installs work with no special
> > action.
> >
> > Note, too, that using .egg files or directories means that installation
> > *never breaks existing programs* as long as they have specified their
> > dependencies accurately.  Traditional distutils installation is
> > destructive in the absence of a system packager.
> >
> > However, if you *have* a system packager, then you should not be using
> > "setup.py install" anyway, so I'm not sure I see how this is an issue.
>
>The issue is that "setup.py install" should continue to
>install things the (one-and-only) Python way.

Which is currently broken without a package manager for uninstallation or 
upgrade.  Your argument implies that:

1. Broken things should not be fixed or improved, ever
2. One must never have more than one implementation of an interface

The first one needs no further comment; the second negates the entire 
purpose of having an interface.  (In case it's unclear, I mean that 
"setup.py install" is an interface that simply promises to make the project 
available for use, not how or in what format.)


>  Any different
>way should use a different command name.
>
>You are currently overloading the install command with a
>completely different approach, which is not compatible to
>the standard distutils system or the Python import mechanism.

Please define "compatible" and explain why you believe the above to be 
true, providing adequate technical detail to demonstrate the supposed 
incompatibility.  When you have enough information to do that, you'll see 
why it's not true.

You'll save some time, however, if you first investigate the 'extra_path' 
option to 'setup()' that is provided by the distutils.  It is well within 
the power of a normal "setup.py install" to produce *precisely* the same 
results as a setuptools "setup.py install", and I know this for a fact 
because early versions of setuptools actually *used* it to bootstrap the 
installation of setuptools itself as an egg!

In other words, there is nothing "completely different" going on 
here.  Nothing.  This "non-standard" and "not compatible" nonsense is pure 
FUD, unless of course the distutils are non-standard and not compatible 
with themselves.  :)
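
For readers who have not looked at 'extra_path': it makes a plain distutils install put the project in a subdirectory and write a .pth file naming that subdirectory. The sketch below does not run distutils; it only reproduces that on-disk result by hand in a temporary directory and lets the standard `site` module process it. The names `Demo`, `Demo-1.0.egg` and `demo_mod` are hypothetical.

```python
import importlib
import pathlib
import site
import sys
import tempfile

# Stand-in for site-packages; every name below is hypothetical.
site_dir = pathlib.Path(tempfile.mkdtemp())

# What an install with an extra path leaves behind: a subdirectory...
egg_dir = site_dir / "Demo-1.0.egg"
egg_dir.mkdir()
(egg_dir / "demo_mod.py").write_text("GREETING = 'hello'\n")

# ...and a .pth file naming that subdirectory, relative to site_dir.
(site_dir / "Demo.pth").write_text("Demo-1.0.egg\n")

# Process the directory the way Python treats site-packages at startup:
# each existing path listed in a .pth file is appended to sys.path.
site.addsitedir(str(site_dir))
importlib.invalidate_caches()

demo_mod = importlib.import_module("demo_mod")
print(str(egg_dir) in sys.path, demo_mod.GREETING)
```

Nothing here goes beyond the ordinary import machinery: a directory on sys.path, put there by a .pth file.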


> > Anyway, I guess what I'm saying is, if you want traditional
> > distutils-style installation, you should use system packages, since
> > they're the only way to allow uninstallation and avoid corrupted
> > upgrades.  Otherwise, you should use .egg files or directories, in which
> > case easy_install (and in the future, nest) will be your package manager.
>
>I guess what I want to say is that I only want to be forced to
>use the easy_install package manager (or any other
>non-system package manager) if I have a need or requirement
>for the features it offers.

Which is fine: use your system packager.  But people who don't have system 
packagers need a way to handle upgrades and uninstallation safely - and 
that *should* be the default behavior.  I don't give a flying hoot what the 
distutils default behavior is; only what it *should* be, which is to say 
*safe*.  The current default behavior is not safe.  Resorting to "it's the 
standard" as an argument therefore carries zero weight for me.

Nobody should be using unadorned distutils installation, except for the 
Python build process and tools that build system packages or other "bdist" 
formats.  Therefore, setuptools does not provide it as a default behavior.

Yes, setuptools is valuing safety over performance.  If you prefer to be 
unsafe, you'll have to read the docs and use a command line option to get 
it the other way.  If you want to be fast *and* safe, you need a system 
package manager.


>The egg format is not too far away from being a nice drop-in
>format for Python binary extensions.
>
>All it takes is:
>
>* making sure that an unzip will create a
>   proper Python package (with the meta information embedded
>   into it)

It does this now; unzip to a directory with a .egg name and add that 
directory to sys.path.
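
A minimal sketch of that, using only the standard library. The egg built here is a hypothetical stand-in (project "Sample", package `sample_pkg`), not a real download, and the sketch skips require() so that it does not depend on pkg_resources being installed.

```python
import importlib
import pathlib
import sys
import tempfile
import zipfile

work = pathlib.Path(tempfile.mkdtemp())

# Build a tiny stand-in for a zipped egg (hypothetical project "Sample").
egg_zip = work / "download" / "Sample-1.0-py2.4.egg"
egg_zip.parent.mkdir()
with zipfile.ZipFile(egg_zip, "w") as zf:
    zf.writestr("sample_pkg/__init__.py", "ANSWER = 42\n")
    zf.writestr(
        "EGG-INFO/PKG-INFO",
        "Metadata-Version: 1.0\nName: Sample\nVersion: 1.0\n",
    )

# Unzip into a directory that keeps the .egg name.
egg_dir = work / "lib" / egg_zip.name
with zipfile.ZipFile(egg_zip) as zf:
    zf.extractall(egg_dir)

# Add the unzipped egg directory to sys.path and import from it.
sys.path.insert(0, str(egg_dir))
importlib.invalidate_caches()
sample_pkg = importlib.import_module("sample_pkg")
print(sample_pkg.ANSWER)
```

The metadata travels with the code in the EGG-INFO subdirectory, so removing the one .egg directory removes both.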


>With these changes, I think that eggs could actually
>become a prime distribution format - even for system
>extensions since it then no longer interferes with
>the system installer mechanisms.

If this is the *only* thing stopping you from liking eggs, I'll certainly 
consider adding it in 0.7.  There may be some backward compatibility 
issues, in that eggs built with 0.7 would then *not* be usable by older 
versions of the pkg_resources module, which would not know to look for 
.egg-info *inside* a zipped egg.  It seems to me that it would be much 
cleaner to have a script that takes an .egg and does the equivalent of 
"install --single-version-externally-managed" with it, than to change the 
format to support using an arbitrary "unzip" utility.  Such an arbitrary 
utility won't have any way to uninstall or upgrade the files, either.


>Here's an example with 20 eggs (using C shell):
>
>tmp/eggs/> setenv PYTHONPATH egg1.egg:egg10.egg:egg11.egg:egg12.egg:egg13.egg:egg14.egg:egg15.egg:egg16.egg:egg17.egg:egg18.egg:egg19.egg:egg2.egg:egg20.egg:egg3.egg:egg4.egg:egg5.egg:egg6.egg:egg7.egg:egg8.egg:egg9.egg
>tmp/eggs> time python -S -c '0'
>0.014u 0.006s 0:00.02 50.0%     0+0k 0+0io 0pf+0w
>tmp/eggs> unsetenv PYTHONPATH
>tmp/eggs> time python -S -c '0'
>0.006u 0.003s 0:00.01 0.0%      0+0k 0+0io 0pf+0w
>
>System time for startup doubles.

If I'm reading this correctly, that's 1/100th of a second to scan 20 
eggs.  If that's a linear progression, one could have *every single project 
listed on PyPI* as an egg on PYTHONPATH simultaneously, with only 54/100ths 
of a second being added to startup.  That actually sounds *really* good to 
me for anything but short scripts!
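
The arithmetic behind that estimate, as a short sketch. The timings are copied from the quoted `time` output; the linear scaling is the assumption stated above, and the project count is only what the 54/100ths figure implies, not a number given anywhere in this thread.

```python
# Timings copied from the quoted csh `time` output (user + system seconds).
with_eggs = 0.014 + 0.006     # PYTHONPATH listing 20 eggs
without_eggs = 0.006 + 0.003  # PYTHONPATH unset

eggs = 20
# CPU seconds added per egg, assuming the cost grows linearly.
per_egg = (with_eggs - without_eggs) / eggs

# The wall-clock figures (0.02 s vs 0.01 s) give the "1/100th of a second"
# for 20 eggs mentioned above.
per_egg_wall = (0.02 - 0.01) / eggs

# Number of eggs that 54/100ths of a second would cover at that rate.
implied_eggs = round(0.54 / per_egg_wall)
print(per_egg, per_egg_wall, implied_eggs)
```

At these magnitudes the measurement is close to the resolution of `time` itself, so the per-egg cost is a rough figure at best.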

Note too that setuptools encourages the use of "virtual" Python 
installations for doing software development; giving a project its own 
virtual Python makes it really easy to install Python modules for that 
project alone.  It's likely that the system Python will have system packages (not 
affecting sys.path length), while user-installed packages will be more 
likely to go in a user-private virtual Python or extended PYTHONPATH 
anyway.  Meanwhile, the short scripts will likely use the system Python, or 
else you'll set up a stripped-down virtual Python just for those scripts if 
you need one.

Since setuptools now supports system packaging, there's no longer any 
reason to treat sys.path length as an argument against using 
setuptools.  The only reason to use easy_install in the first place is for 
things your packaging system can't supply, e.g. because a package is too 
new, or because you're a non-root user who can't install system packages 
anyway, or because you have application instances with different package 
needs, you need isolated Python installations for testing, etc. etc. 
etc.  As a general rule, system packagers don't support *any* of these 
things, so easy_install is really your only practical choice.  That's why 
it exists.


