[Distutils] setuptools in a cross-compilation packaging environment

Phillip J. Eby pje at telecommunity.com
Tue Oct 11 18:11:25 CEST 2005


At 04:46 PM 10/11/2005 +0200, M.-A. Lemburg wrote:
>Phillip J. Eby wrote:
> >>I must admit that I haven't followed the discussions about
> >>these .egg-info directories. Is there a good reason not to
> >>use the already existing PKG-INFO files that distutils builds
> >>and which are used by PyPI (aka cheeseshop) ?
> >
> > I don't know if there's such a reason or not, but in any case that's what
> > we use as part of the egg-info directories.  However, we *also* allow for
> > unlimited metadata resources to be provided in egg-info, as this is what
> > allows us to carry things like plugin metadata and scripts in the
> > egg.  There are other metadata files listing the C extensions in the
> > package, the "namespace packages" that the egg participates in, and so on.
> >
> >>Hmm, you seem to be making things unnecessarily complicated.
> >
> > That probably just means you're not familiar with the requirements.
>
>I did read your posting, but still don't understand why you
>need a multitude of meta-data files in a special directory.
>
>PKG-INFO is general and extensible enough to hold all that information,
>IMHO.

And I suppose you have a plan for embedding site.zcml files in PKG-INFO 
too?  How about Trac plugin specifiers?  Paste template definitions?

Eggs are a format to support applications and their plugins.  They support 
arbitrary Python projects as well, because that makes it easy for 
applications and their plugins to depend on them.  They are not merely a 
distribution format for installing systemwide Python packages.  We have 
plenty of those formats already.


> > You completely lost me.  A major feature of eggs is that for an application
> > needing plugins, it can simply scan a directory of downloaded eggs and plug
> > them into itself.  Having a required installation mechanism other than
> > "download the egg and put it here" breaks that.
>
>While I don't find a non-managed Python installation
>mechanism a particularly useful goal to have,

It's incredibly useful for application distributors, especially extensible 
applications and app servers like Zope, Trac, Chandler, etc.  They simply 
cannot afford to rely on the system Python or native packaging system to 
meet their requirements or provide a quality user experience.


>you could still
>have the same thing by using and scanning a sub-directory of
>the pythoneggs package directory or directories listed in
>an environment variable PYTHONEGGS as fallback solution (if the
>egg was not found in the database.py module).

This approach doesn't allow any eggs to be on sys.path by default, nor does 
it allow simply importing and using target packages.  The current system 
allows us to create eggs for packages that know nothing about eggs, without 
making any changes to their code.  (We even automatically detect potential 
__file__ manipulation code, and mark such eggs as needing to be installed 
in unzipped form.)


> > And the disadvantage of absolutely requiring install/uninstall steps, which
> > is anathema.
>
>Oops. I disagree on that one. Not only does install/uninstall
>make system administration a whole lot easier,

Eggs are not a system administration tool.  They're for people making 
software and people using it.  If a vendor wants to package eggs for the 
convenience of their users, great.  If not, that's okay, because eggs are 
specifically intended to not require system administrator support.  System 
administrator involvement in this process is a *bug*, not a feature.  Users 
having to beg the sysadmin to get Python packages installed is a Bad 
Thing.  Applications having to rely on what's in site-packages is a Bad 
Thing.  Eggs allow users and applications to manage their own needs, 
independent of what the site or vendor does or doesn't provide.


> >>>>Please make sure that your eggs catch all possible
> >>>>Python binary build dimensions:
> >>>>
> >>>>* Python version
> >>>>* Python Unicode variant (UCS2, UCS4)
> >>>>* OS name
> >>>>* OS version
> >>>>* Platform architecture (e.g. 32-bit vs. 64-bit)
> >
> > Well, my presumption here is that we're going to get the scheme right for
> > Python at large, and make it standard.  Are you saying that some packages
> > should have their own scheme?  That's not really workable since in order to
> > import the package and use its scheme, we would have to first know that the
> > package was compatible!
>
>We're talking about filenames here - they are intended
>to be read and understood by humans, not machines (these
>can use the PKG-INFO data inside the archives or from
>PyPI).

If you read the specification, you'll see that this is not the case.  Eggs 
require machine-parseable filenames, as this allows them to be rapidly 
discovered at runtime for dynamic dependency resolution, with a simple 
listdir().  Unlike your database.py concept or PEP 262, it is impossible 
for the "index" to become out-of-sync with the actual state of the 
installation, because it *is* the current state of the installation.
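
To make the listdir() point concrete, here is a minimal sketch of discovering eggs purely from their filenames.  The regular expression is a simplified assumption of the "name-version[-pyX.Y[-platform]].egg" layout, and scan_eggs() and demo() are hypothetical helpers for illustration, not the code pkg_resources actually uses:

```python
import os
import re
import tempfile

# Simplified pattern for "name-version[-pyX.Y[-platform]].egg".
# Illustrative assumption only; not the regex that setuptools ships.
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)"
    r"-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+)"
    r"(?:-(?P<platform>.+))?)?"
    r"\.egg$"
)


def scan_eggs(directory):
    """Return one dict of parsed filename fields per egg in `directory`."""
    found = []
    for entry in sorted(os.listdir(directory)):
        match = EGG_NAME.match(entry)
        if match:  # anything that is not named like an egg is skipped
            found.append(match.groupdict())
    return found


def demo():
    """Create a throwaway plugin directory and scan it."""
    with tempfile.TemporaryDirectory() as plugins:
        for fname in (
            "FooPlugin-1.2-py2.4.egg",
            "Speedups-0.9-py2.4-linux-i686.egg",
            "README.txt",
        ):
            open(os.path.join(plugins, fname), "w").close()
        return scan_eggs(plugins)


if __name__ == "__main__":
    for egg in demo():
        print(egg)
```

Nothing is opened or imported here: the directory listing alone yields name, version, Python version and platform, so there is no separate index that could drift from what is on disk.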

That said, let's do what we can to get the distutils platform strings to be 
more useful indicators of whether the contained native code can be linked 
and run by a given Python installation.


>That said, yes, the way platforms are set up, it does sometimes
>make it necessary to add extra information to such a filename.
>
>E.g. say you write a plugin for Zope that only works in Zope3
>and not Zope2. Such a plugin would use the "zope3" distinguisher
>in its archive name.

The purpose of including platform information in an egg's filename is to 
avoid attempting to link or run "foreign" native code that might cause a 
hard crash of the Python process.  A Zope 2 vs. 3 distinction would not be 
required as an external designation, since the version dependencies 
declared by the package will either be resolvable or not.


> > We also don't
> > have a UCS flag, but if we did it should be part of the platform string
> > rather than the Python version, since "pure" eggs don't care about the UCS
> > mode, and even if they did, that'd be a requirement of the package rather
> > than the egg itself being platform specific.
>
>This is not correct: unichr(100000) won't work in UCS2
>builds - it will in UCS4 builds, so even though the
>.pyc files run on both builds unchanged, the application
>may very well require the used Python version to be a
>UCS4 build in order to be able to use UCS4 features.

As I said, that would be a requirement of the package, rather than the egg 
itself being platform-specific.  Again, the platform string is just a 
filter to avoid trying to import things that could crash the interpreter 
(as opposed to merely raising an exception).
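
As a rough sketch of that filtering idea: an egg with no platform tag is treated as pure Python and always allowed, while a tagged egg is only considered when its tag equals the running interpreter's platform string.  The exact-match rule and the function name are assumptions chosen for simplicity; real compatibility rules may need to be more lenient:

```python
import sysconfig


def may_be_imported_safely(egg_platform, running_platform=None):
    """Decide whether an egg should even be considered for import.

    egg_platform is the platform tag parsed from the egg's filename,
    or None for a pure-Python egg.  Exact string equality is a
    deliberately conservative assumption made for this sketch.
    """
    if running_platform is None:
        # Platform string of the current interpreter, e.g. "linux-x86_64".
        running_platform = sysconfig.get_platform()
    if egg_platform is None:
        # No native code, so nothing here can hard-crash the process.
        return True
    return egg_platform == running_platform


if __name__ == "__main__":
    print(may_be_imported_safely(None))
    print(may_be_imported_safely("win32", running_platform="linux-i686"))
```

A requirement such as "needs a UCS4 build" would not go through this filter; it would be checked like any other dependency and fail with an ordinary exception.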


> >>>A single .pth file is certainly an option, and it's what easy_install
> >>>itself uses.
> >>
> >>Fair enough.
> >>
> >>Could this be enforced and maybe also removed
> >>completely by telling people to add the egg directory to
> >>PYTHONPATH ?
> >
> > If by "egg directory" you mean a single .egg directory (or zipfile) for a
> > particular package, then yes, for that particular package you could do
> > that.  If you mean, can you just put the directory *containing* eggs on
> > PYTHONPATH, then the answer is no, if you want the package to be on
> > sys.path without any special action taken (like calling
> > pkg_resources.require()).
>
>Calling such an API is OK for applications supporting
>eggs. I don't see that as a problem.

"applications supporting eggs" is not the same thing as "people using 
eggs".  People using eggs would like, in the general case, to be able to 
just fire up the Python interpreter and use the packages they've installed, 
without any special steps.  This is especially important for users who are 
simply using the easy_install toolchain to install arbitrary 
distutils-based packages.
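
For readers unfamiliar with .pth files, the following sketch shows the standard mechanism being relied on: a .pth file in a site directory lists extra paths, and the site module appends each existing one to sys.path.  The directory and file names are made up for the example, and site.addsitedir() is called by hand here only because a temporary directory is not a real site directory:

```python
import os
import site
import sys
import tempfile


def demo_pth():
    """Build a fake site directory holding one egg and one .pth file."""
    base = tempfile.mkdtemp()

    # Hypothetical unzipped egg: just a directory for this sketch.
    egg = os.path.join(base, "Example-1.0-py2.4.egg")
    os.mkdir(egg)

    # One path per line, relative to the directory holding the .pth file.
    with open(os.path.join(base, "example.pth"), "w") as fh:
        fh.write("Example-1.0-py2.4.egg\n")

    # For a real site-packages directory this happens at interpreter
    # startup; here we trigger the same processing explicitly.
    site.addsitedir(base)
    return egg


if __name__ == "__main__":
    egg_path = demo_pth()
    print(egg_path in sys.path)
```

With the egg listed this way, a plain "import" works in a freshly started interpreter, with no call into pkg_resources.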


> >>Note that the pythonegg package approach would pretty much
> >>remove the need for these .pth files.
> >
> > Only in the sense that it would require reinventing them in a different
> > form.  :)
>
>Not really - but we seem to have different views on whether
>installers are a good thing or not, so there's little point in
>arguing over this.

We disagree on whether *requiring* an install step is a good thing.  Good 
installer support is important, which is why EasyInstall can search PyPI, 
and supports download/extract/build/install for most distutils-based 
packages, and handles dependency resolution for setuptools-based 
packages.  Being able to provide good installer support is actually an 
important feature of eggs!

*Requiring* installation, however, is a no-no.  It should be possible to 
ship a Python application by just dumping the application script and a 
bunch of eggs in a single directory.  Having to then "install" those eggs 
somewhere on the target system is a nuisance.
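
A minimal sketch of what such an application script could do at startup, assuming the eggs sit in the same directory as the script.  add_bundled_eggs() is a hypothetical helper; it does no version selection or dependency resolution, which is what pkg_resources is for:

```python
import glob
import os
import sys


def add_bundled_eggs(app_dir):
    """Put every *.egg found in `app_dir` at the front of sys.path.

    Works for zipped eggs (via Python's built-in zip import support)
    and for unzipped egg directories alike.
    """
    eggs = sorted(glob.glob(os.path.join(app_dir, "*.egg")))
    for egg in eggs:
        if egg not in sys.path:
            sys.path.insert(0, egg)
    return eggs


if __name__ == "__main__":
    # Use the directory containing this script as the egg directory.
    here = os.path.dirname(os.path.abspath(__file__))
    for path in add_bundled_eggs(here):
        print("using", path)
```

Shipping the application is then a matter of copying the directory; removing a plugin is a matter of deleting its egg.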

It's also nice that there is no way to "corrupt" your index of eggs except 
by tampering with the eggs themselves.  It's hard to mess it up in some 
unrecoverable way, and everything is simple enough to inspect by hand with 
common tools like 'ls' and 'less' and 'unzip -v'.


