[Distutils] formencode as .egg in Debian ??

Phillip J. Eby pje at telecommunity.com
Wed Nov 23 00:47:47 CET 2005

At 11:56 PM 11/22/2005 +0100, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
>>>Debian should provide the packages, but not as eggs.
>>For packages that only operate as eggs, and/or require their dependencies 
>>as eggs, you are stating a contradiction in terms.  Eggs are not merely a 
>>distribution format, any more than Java .jar files are.
>So I should say
>"Debian should not provide eggs, period", since what Debian provides
>are packages, and eggs are not?

I don't understand you.

>>>Debian developers should work with upstream authors to keep a
>>>distutils-based setup.py operational.
>>It's perfectly operational; clearly the entire egg system is *well* 
>>within the Python runtime's intended operating parameters, as it uses 
>>only well-defined and published aspects of the Python language, API, 
>>stdlib, and build process.
>I didn't say the egg system is inoperational. I said that distutils
>setup is not operational for, for example, FormEncode: this uses
>another packaging library in setup.py, not distutils setup.

I still don't understand you.  If a package subclasses a distutils command, 
is it no longer a distutils setup?  What if it bundles a library module 
that includes a subclass of a distutils command?  Where, precisely, do you 
draw the line between a "distutils setup" and something else?  *Many* 
packages subclass distutils commands or use unusual arguments to the 
distutils setup() that cause things to be installed in unusual ways.  I'm 
very familiar with this because easy_install tries to support as many of 
those quirky subclasses or arguments as is practical, so it seems to me 
that the definition of a "distutils setup" is nowhere near as clear cut as 
your statements here imply.

>>As I've already stated, applying this same policy to Java libraries would 
>>be to demand that all the .class files be extracted to the filesystem 
>>and any manifest files be deleted, before Debian would consent to package 
>>them.  In other words, it would be silly and pointless, because the users 
>>would then ignore the packages in favor of actual jars, because then 
>>their applications would actually work.
>This is not the same. A java .jar file is deployed by putting it on disk. 
>For an egg, an (apparently undocumented)

An egg must be on sys.path, if you want to use it without explicitly using 
the egg runtime.  See "The Quick Guide To Python Eggs", in particular this 
passage from http://peak.telecommunity.com/DevCenter/PythonEggs#using-eggs :

    "If you have a pure-Python egg that doesn't use any in-package data
    files, and you don't mind manually placing it on sys.path or
    PYTHONPATH, you can use the egg without installing setuptools."
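
To illustrate the quoted passage, here is a minimal sketch that builds a
stand-in for a pure-Python egg (just a zip archive holding one module) and
imports from it by placing the archive on sys.path.  The names
"Demo-1.0-py2.4.egg" and "demo_mod" are made up for the example, and a real
egg would also carry an EGG-INFO directory, which is omitted here.

```python
import os
import sys
import tempfile
import zipfile

# Build a stand-in for a pure-Python egg: a zip archive with one module.
# "Demo-1.0-py2.4.egg" and "demo_mod" are hypothetical names.
workdir = tempfile.mkdtemp()
egg_path = os.path.join(workdir, "Demo-1.0-py2.4.egg")
with zipfile.ZipFile(egg_path, "w") as egg:
    egg.writestr("demo_mod.py", "GREETING = 'hello from the egg'\n")

# Putting the archive itself on sys.path is enough; Python's standard
# zip import support finds the module without setuptools being involved.
sys.path.insert(0, egg_path)

import demo_mod

print(demo_mod.GREETING)
```

Setting PYTHONPATH to the archive's path before starting the interpreter
should have the same effect as the sys.path.insert() call.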

>number of additional
>steps is necessary, such as editing easy-install.pth.

Nothing except performance considerations prevents you having a separate 
.pth file for each and every egg, just as nothing prevents distutils 
packages from being installed as directory+.pth today.  Does Debian 
currently reject packages that use the extra_path argument to setup(), like 

>In Java, the drawback of course is that each user has to edit
>CLASSPATH to include all the jar files desired. easy_setup
>makes this unnecessary, but in a way unfriendly to dpkg (and
>I assume other Linux package formats).

I don't understand you here.  Are you saying that it's not possible for 
dpkg to do a post-install or uninstall operation like adding or removing a 
line from a file?
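
As a rough sketch of the kind of operation I mean, the following adds and
removes one line in a .pth-style file.  It is a simplified model only: the
helper names are invented for the example, it runs against a scratch file
rather than a real site-packages directory, and it ignores anything else a
real easy-install.pth may contain.  An actual maintainer script would not
necessarily be written in Python at all.

```python
import os
import tempfile


def read_entries(pth_file):
    """Return the non-empty lines of pth_file, or [] if it is missing."""
    if not os.path.exists(pth_file):
        return []
    with open(pth_file) as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def write_entries(pth_file, entries):
    """Rewrite pth_file with one entry per line."""
    with open(pth_file, "w") as f:
        f.writelines(entry + "\n" for entry in entries)


def add_entry(pth_file, entry):
    """Post-install step: list the egg unless it is already present."""
    entries = read_entries(pth_file)
    if entry not in entries:
        entries.append(entry)
        write_entries(pth_file, entries)


def remove_entry(pth_file, entry):
    """Uninstall step: drop the egg's line and keep everything else."""
    write_entries(pth_file, [e for e in read_entries(pth_file) if e != entry])


# Demonstration against a scratch file; the egg names are hypothetical.
pth = os.path.join(tempfile.mkdtemp(), "easy-install.pth")
add_entry(pth, "./Demo-1.0-py2.4.egg")
add_entry(pth, "./Other-2.0-py2.4.egg")
remove_entry(pth, "./Demo-1.0-py2.4.egg")
print(read_entries(pth))
```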

In any case, if you look at the approach Ian Bicking suggested and I 
commented further on, you'll see that you *can* in fact bypass this whole 
issue by packaging the egg metadata in another form, that gets rid of the 
need for .egg files or directories, as well as .pth manipulation.

That approach, however, is not significantly documented at this time (other 
than a post to the distutils-SIG earlier this year outlining the design), 
but I'd be more than happy to document it further, if it makes the need for 
the rest of this discussion go away.  :)

Here are the steps to create a "single-version" egg:

1. Build the egg

2. Unzip the egg directly into site-packages, but rename the EGG-INFO 
subdirectory in the process to ProjectName.egg-info, where ProjectName is 
the pkg_resources.safe_name() of the setup(name="...") 
argument.  (Alternately, you can take 
'filename.split("-")[0].replace("_","-")', where 'filename' is the 
os.path.basename of the egg.)

3. (optional) remove any .py/.pyc/.pyo files that have an adjacent C 
extension file of the same name, such as 'foo.py' and 'foo.pyd' or 
'foo.so'.  The .py/.pyc/.pyo are stubs created by setuptools to extract the 
C extension from a zipped egg at runtime, and are not needed by an 
extracted installation.  (This step is optional because Python's import 
gives precedence to the C extensions over the .py files, so nothing bad 
will happen if you don't delete the files.)
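
Steps 2 and 3 can be sketched roughly as follows.  This is an illustration
under stated assumptions, not a tool that setuptools ships: the function
names are invented, the project name is derived with the filename rule from
step 2 rather than pkg_resources.safe_name(), extension modules are assumed
to be named plainly 'foo.so' or 'foo.pyd', and the archive is trusted (member
paths are not checked).  The egg in the demonstration is fabricated.

```python
import os
import tempfile
import zipfile


def project_name_from_egg(egg_path):
    """Derive ProjectName from the egg's file name (alternate rule, step 2)."""
    filename = os.path.basename(egg_path)
    return filename.split("-")[0].replace("_", "-")


def unpack_single_version(egg_path, site_packages):
    """Unzip the egg into site_packages, renaming EGG-INFO on the way."""
    info_dir = project_name_from_egg(egg_path) + ".egg-info"
    with zipfile.ZipFile(egg_path) as egg:
        for member in egg.namelist():
            if member.endswith("/"):
                continue  # skip explicit directory entries
            parts = member.split("/")
            if parts[0] == "EGG-INFO":
                parts[0] = info_dir
            target = os.path.join(site_packages, *parts)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(egg.read(member))
    return info_dir


def remove_extension_stubs(site_packages):
    """Optional step 3: delete .py/.pyc/.pyo files next to a same-named extension."""
    for root, _dirs, files in os.walk(site_packages):
        names = set(files)
        for name in files:
            base, ext = os.path.splitext(name)
            if ext not in (".pyd", ".so"):
                continue
            for stub_ext in (".py", ".pyc", ".pyo"):
                if base + stub_ext in names:
                    os.remove(os.path.join(root, base + stub_ext))


# Demonstration with a fabricated egg; every file name here is hypothetical.
work = tempfile.mkdtemp()
egg_path = os.path.join(work, "Foo_Bar-1.0-py2.4.egg")
with zipfile.ZipFile(egg_path, "w") as egg:
    egg.writestr("EGG-INFO/PKG-INFO", "Name: Foo-Bar\n")
    egg.writestr("foobar/__init__.py", "")
    egg.writestr("foobar/speedups.py", "# stub loader\n")
    egg.writestr("foobar/speedups.so", "")

site_packages = os.path.join(work, "site-packages")
os.makedirs(site_packages)
info_dir = unpack_single_version(egg_path, site_packages)
remove_extension_stubs(site_packages)
print(info_dir, sorted(os.listdir(site_packages)))
```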

What this process will *not* do for you is address conflicts in top-level 
data files, nor will it allow you to deal with packages that are partly 
installed by one package, and partly by another.  For example, Ian's Paste 
package has a 'paste' package that is split across multiple eggs, and I 
expect this to be a popular feature for PEAK and Zope in the future.  The 
'pkgutil' module was added in Python 2.3 to support namespace packages, but 
if you install the parts of a namespace package in the same directory, 
you're going to have to deal with the fact that they will all want to 
install the '__init__.py'.  If you are using this "single-version egg" 
approach, you will probably need to create an extra Debian package to hold 
the __init__.py, and have the individual packages depend on that package.
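
For readers unfamiliar with the pkgutil mechanism, here is a small sketch of
a namespace package split across two directories.  The package name
'nsdemo' and its modules are invented for the example.  Each part carries
an identical __init__.py that calls pkgutil.extend_path(); that works when
the parts live in separate directories, and it is exactly the file that
would collide if both parts were unpacked into one directory.

```python
import os
import sys
import tempfile

# The conventional pkgutil-style namespace __init__.py, shipped by each part.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_part(root, package, module, body):
    """Create root/package/ holding the shared __init__.py and one module."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(NAMESPACE_INIT)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)


# Two separately installed parts of the hypothetical 'nsdemo' package.
part_a = tempfile.mkdtemp()
part_b = tempfile.mkdtemp()
make_part(part_a, "nsdemo", "alpha", "VALUE = 'a'\n")
make_part(part_b, "nsdemo", "beta", "VALUE = 'b'\n")
sys.path[:0] = [part_a, part_b]

# extend_path() adds part_b's 'nsdemo' directory to the package's __path__,
# so modules from both parts are importable under one package name.
from nsdemo import alpha, beta

print(alpha.VALUE, beta.VALUE)
```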

Of course, this creates additional work for package maintainers that 
wouldn't be present with setuptools' normal .egg file/directory 
distributions, and my assumption was that the maintainers would prefer to 
be able to ignore such issues and get the benefit of dependencies defined 
by the upstream developers.  Eggs keep each project in its own little 
bubble, where it can't overwrite anything else and can be uninstalled 
without removing any overlapping parts.
