[Distutils] [Python-Dev] Capsule Summary of Some Packaging/Deployment Technology Concerns

Phillip J. Eby pje at telecommunity.com
Tue Mar 18 01:37:30 CET 2008


At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
>I was in a Packaging BoF yesterday and, although not very relevant to the
>packager bootstrap thread, Guido has asked me to post some of the concerns.
>
>The BoF drew about 15 people, many of whom were packagers for Red Hat, Ubuntu
>and such.  Everyone had strong expressions of frustration with the status quo
>and most had tried to resolve their issues but had their patches rejected.  I
>am not taking either side and whether those rejections were justified I cannot
>say, but the general feeling of their concerns intentionally not being
>addressed isn't healthy.  Several had abandoned setuptools, deeming it a
>failed solution and others called for a fork.
>
>To start, I am not a leader of the group nor do I claim I accurately captured
>and expressed all their concerns.  I apologize to those in the BoF for any
>misrepresentations.

I'm actually happy to hear that there's this much energy available -- 
hopefully some of it can be harnessed towards positive solutions.

When I began developing setuptools, I often asked for the input of 
packagers, developers, etc., through the distutils-sig...  and was 
met with overwhelming silence.  So the fact that there is now a group 
of people who are ready to work for some solutions seems like a 
positive change, to me.

It's hard to make design decisions regarding itches you don't 
personally have, and which other people won't help 
scratch.  Unfortunately, a lot of the proposals from packaging system 
people have been of the form of, "fix this for us by breaking things 
for other people".  Not all of them, though.  Many have been very 
helpful, contributing troubleshooting help and good patches.

That some of those good patches took nearly a year to get into 
setuptools (some from Fedora just got into 0.6c8 that were sent to me 
almost a year ago) is because I'm the only person reviewing 
setuptools patches, and I've spent only a few days in the last year 
doing focused development work on setuptools (as opposed to answering 
questions about it on the SIG).

It's never a good thing when people's patches sit around, regardless 
of where they come from.  But that's not the same thing as 
*rejecting* the patches.


>1. Many felt the existing dependency resolver was not correct.  They wanted a
>     full tree traversal resulting in an intersection of all restrictions,
>     instead of a first-acceptable-solution approach taken now, which can
>     result in top-level dependencies not being enforced upon lower levels.
>     The latter is faster, however.  One solution would be to make the
>     resolver pluggable.

Patches welcome, on both counts.  Personally, Bob and I originally 
wanted a full-tree intersection, too, but it turned out to be hairier 
to implement than it seems at first.  My guess is that none of the 
people who want it have actually tried to implement it without 
factorial or exponential running time.  But that doesn't mean I'll be 
unhappy if somebody succeeds.  :)

Intuitively, it seems easy, just gather the requirements and 
intersect.  In practice, different versions of a package may have 
different dependencies, so the intersection is not nearly as simple 
as that.  We ended up just going for a depth-first version of the 
current algorithm (switched to breadth-first later, after field tests 
showed some problems with that), being greedy by testing 
latest-version-first, on the assumption that more recent versions 
would be likely to have the most-restrictive version requirements.

In other words, we attempt to achieve heuristically what's being 
proposed to do algorithmically.  And my guess is that whatever cases 
the heuristic is failing at would probably not be helped by an 
algorithmic approach either.  But I would welcome some actual data, 
either way.
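
A minimal sketch of the greedy, breadth-first, latest-version-first idea described above, run against a toy index. Everything here is a hypothetical illustration (the `INDEX` data, the integer versions, the `resolve` function and the `VersionConflict` class defined below); it is not the setuptools resolver itself, only a way to see why the choice of one version changes the requirements that follow.

```python
from collections import deque

# Hypothetical toy index: project -> {version: {dependency: allowed versions}}.
# Note that lib 1 and lib 2 place different restrictions on util.
INDEX = {
    "app": {1: {"lib": {1, 2}}},
    "lib": {1: {"util": {1, 2}}, 2: {"util": {1}}},
    "util": {1: {}, 2: {}},
}


class VersionConflict(Exception):
    """Raised when an earlier greedy choice violates a later requirement."""


def resolve(root, index=INDEX):
    """Pick the newest acceptable version of each project the first time it
    is needed (breadth-first), and never revisit that choice."""
    chosen = {}
    queue = deque([(root, set(index[root]))])
    while queue:
        name, allowed = queue.popleft()
        if name in chosen:
            # Already decided: only check the new restriction, no backtracking.
            if chosen[name] not in allowed:
                raise VersionConflict(
                    f"{name} {chosen[name]} chosen, but {sorted(allowed)} required")
            continue
        candidates = sorted((v for v in index[name] if v in allowed), reverse=True)
        if not candidates:
            raise VersionConflict(f"no version of {name} in {sorted(allowed)}")
        chosen[name] = candidates[0]  # greedy: latest version first
        for dep, dep_allowed in index[name][chosen[name]].items():
            queue.append((dep, dep_allowed))
    return chosen


print(resolve("app"))
```

Because the sketch never backtracks, a restriction that shows up after a project has already been chosen produces a conflict rather than a second attempt. A full intersection would have to consider every combination of versions, since each version can bring its own requirements.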

Again, though, patches are welcome.  :)  (Specifically, for the 
trunk; I don't see a resolver overhaul as being suitable for the 0.6 
stable branch.)


>2. People want a solution for the handling of documentation.  The distutils
>     module has had commented out sections related to this for several years.

As with so many other things, this gets tossed around the 
distutils-sig every now and then.  A couple of times I've thrown out 
some options for how this might be done, but then the conversation 
peters out around the time anybody would have to actually do some 
work on it.  (Me included, since I don't have an itch that needs 
scratching in this area.)

In particular, if somebody wants to come up with a metadata standard 
for including documentation in eggs, we've got a boatload of hooks by 
which it could be done.  Nothing's stopping anybody from proposing a 
standard and building a tool, here.  (e.g. using the setuptools 
command hook, .egg-info writer hook, etc.)


>3. A more flexible internal handling of the different types of files is needed.
>     Currently the code, data, lib, etc. files are aggregated at build time and
>     people would like them to be kept separate until install/packaging time.

I don't know what this means, exactly.


>     They also want greater flexibility in the kinds of files identified for
>     packaging.  There is currently a single plugin entrypoint for file_finding,
>     so people have resorted to abusing the setuptools function find_packages()
>     again and again with different include/exclude args.  A solution is to
>     expand the set of entrypoints into finer grained categories.  They also
>     want a way to expand the set of categories rather than a fixed set, which
>     can be easily done with entrypoint groups and names.
>
>     People also want a greater variety of file_finders to be included with
>     setuptools.  Instead of just CVS and SVN, they want it to comprehend
>     Mercurial, Bazaar, Git and so forth.

Did you point them to the Cheeseshop?  There are plugins already 
available for all the systems you mentioned, plus Darcs and 
Monotone.  If you mean "included" as in "bundled", this doesn't make 
a whole lot of sense to me.  I'd think that if you're using 
setuptools as a developer (the only reason you need the file finders, 
since source distributions include a prebuilt manifest), you'd not 
have a problem saying "easy_install setuptools-git" or adding a 
"setup_requires='setuptools-git'" line to your setup.py.  (Although 
the latter would only be needed for *development*, not deployment.)
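
For readers who have not seen one, a file finder plugin is just a callable that takes a directory name and yields file paths, registered under the "setuptools.file_finders" entry point group. The sketch below is a hypothetical finder that walks the filesystem instead of asking a version control system; the function name, the project name "my_finder" in the comment, and the skip-hidden-directories rule are all assumptions made for illustration.

```python
import os


def find_files_by_walking(dirname=""):
    """Hypothetical file finder: yield every file under `dirname`,
    skipping hidden directories such as .git or .hg."""
    for root, dirs, files in os.walk(dirname or os.curdir):
        # Prune hidden directories in place so os.walk does not descend into them.
        dirs[:] = sorted(d for d in dirs if not d.startswith("."))
        for name in sorted(files):
            yield os.path.normpath(os.path.join(root, name))


# Registration would go in the plugin's own setup.py, along these lines
# (the project and module name "my_finder" is an assumption):
#
#   entry_points={
#       "setuptools.file_finders": [
#           "walk = my_finder:find_files_by_walking",
#       ],
#   }

if __name__ == "__main__":
    for path in find_files_by_walking():
        print(path)
```

A real plugin for a version control system would typically ask the tool for its list of tracked files rather than walking the tree.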

If you mean support for *installing* from these tools, I really 
wanted to add a pluggable download/retrieval mechanism for 
easy_install in 0.7, and would still love to see it happen.


>4. They want an uninstall setuptools command.  Adding one to remove a specific
>     egg isn't difficult but correctly removing those dependencies that came in
>     with that egg, without breaking later installs can be tricky.

Patches, once again, are welcome.  :)  (Btw, "setup.py develop" 
supports uninstallation, although it doesn't do a blessed thing about 
dependencies.)
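
To illustrate why the dependency side is the tricky part, here is a minimal sketch of working out what could be removed along with a given distribution. The `INSTALLED` graph, the project names, and the notion of an "explicitly installed" set are all hypothetical; setuptools does not keep such a record, which is part of the problem.

```python
# Hypothetical record of one environment: distribution -> direct dependencies.
INSTALLED = {
    "webapp": {"templating", "dblib"},
    "cli-tool": {"dblib"},
    "templating": {"markup"},
    "dblib": set(),
    "markup": set(),
}


def closure(names, installed):
    """Return `names` plus everything they depend on, transitively."""
    seen = set()
    stack = list(names)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(installed[name])
    return seen


def removable(target, explicitly_installed, installed=INSTALLED):
    """Distributions that can go when `target` is uninstalled: those needed
    by `target` but by no other explicitly installed distribution."""
    keep = closure(set(explicitly_installed) - {target}, installed)
    return closure({target}, installed) - keep


print(sorted(removable("webapp", {"webapp", "cli-tool"})))
```

In this toy example "dblib" survives because "cli-tool" still needs it. The calculation only works if something recorded which distributions were asked for directly and which merely came along as dependencies, and it has to be repeated per environment.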

By the way, there are also third-party tools on the Cheeseshop that 
show egg dependency graphs (e.g. tl.eggdeps) or dump out information 
about installed eggs (e.g. "yolk").


>     This is complicated because there isn't a single global package namespace
>     to manage, when you factor in virtualenv and buildout sandboxes and
>     per-user package areas.  This differs from how RPMs and .debs are viewed.

Yep.  I really wanted, for 0.7, to study virtualenv and buildout and 
try to design a more general mechanism for managing things with the 
vaporware I've been calling "nest".  Unfortunately, I've lost both 
the will and the budget for working on that any time soon.


>5. There was concern over the .pth mechanism used by setuptools re activation.
>     First, there is a (perceived) performance issue with increasingly adding
>     every ZIP file explicitly onto sys.path.  This may or may not be a red
>     herring.

It is.  My tests a few years back, when MAL first brought this up on 
the distutils-sig, showed the startup cost to be positively 
minuscule, if actual zipfiles are used.  At the same time, I myself 
have come to the conclusion that if I had it to do over, I would use 
something more like the .egg-info egg style for general package 
installation, and add an installation manifest to it.  If I ever 
write "nest", it will use such, with the ability to also support .egg 
files and directories.

.egg files were created for extensible application platforms like 
Chandler and Zope and Trac and so on.  Plugins usually need 
libraries, though, so the rest got added on because it was useful, 
and then the whole thing escaped its niche like a foreign organism 
added to an ecosystem with no natural predators.  :)


>     The other is the use of a single .pth file to control the list of activated
>     packages.  Those who produce distributions would prefer a magic directory
>     into which links to distributions could be dropped, similar to the current
>     best practices for Linux, with /etc/conf.d/, /etc/profile.d/,
>     /etc/xinetd.d/ and so forth.

site-packages is that directory, and has been since long before 
setuptools.  Just drop uniquely-named .pth files there, and you're good to go.
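
A minimal sketch of that drop-in behaviour, using a temporary directory as a stand-in for site-packages so nothing real is touched. The distribution names and .pth file names are hypothetical; `site.addsitedir()` is the standard library call that processes .pth files the same way the interpreter does for site-packages at startup.

```python
import os
import site
import sys
import tempfile

# Hypothetical stand-in for site-packages.
site_dir = tempfile.mkdtemp()

# Two hypothetical unpacked distributions living next to the .pth files.
for project in ("Foo-1.0.egg", "Bar-2.3.egg"):
    os.mkdir(os.path.join(site_dir, project))
    # One uniquely-named .pth file per distribution.  Each line is a path,
    # relative to the directory holding the .pth file, to add to sys.path.
    pth_name = project.split("-")[0].lower() + ".pth"
    with open(os.path.join(site_dir, pth_name), "w", encoding="utf-8") as f:
        f.write(project + "\n")

# Process every .pth file found in the directory.
site.addsitedir(site_dir)

activated = [p for p in sys.path if p.startswith(site_dir + os.sep)]
print(activated)
```

Deleting one of the .pth files deactivates that distribution the next time the interpreter starts, without touching any shared file.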


>6. There is a need for more extensibility hooks.  People want places to plug
>     in special handling.  For example:
>
>     a) setuptools has a --record option to capture the list of files installed
>        for use by subsequent packaging tools.  Some want that list to be
>        available to a setuptools plugin.
>
>     b) some want hooks for post-build/post-install actions, instead of the
>        current approach of writing a custom build class that handles it all.

Patches welcome!


>7. Many wanted the ability to install files anywhere in the install tree and
>     not just under the Python package.  Under distutils this was possible but
>     it was removed in setuptools for security reasons.

It wasn't security, it was manageability.  Egg-based installation 
means containment (analogous to GNU stow), and therefore portability 
and disposability of plugins.  (Which again is what eggs were really 
developed for in the first place.)


>     Custom code can still be written to do this explicitly but this is not
>     popular.

No kidding.  :)  Current best practice is to include a script or 
module in the package that can install other files to a designated 
location.  Personally, though, I tend to view applications and 
libraries that target specific install locations to be overreaching 
their bounds, and stepping into sysadmin territory.  Give me the 
tools to install the data, don't just dump it somewhere on my system 
where *you* think it should go, in other words.
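
A minimal sketch of what such a bundled helper might look like: a function the package ships, which copies its data files to whatever location the person installing chooses. The function name `install_data` and its arguments are hypothetical, and returning the list of written files is an assumption made so that a packaging tool could record them.

```python
import os
import shutil


def install_data(source_dir, target_dir):
    """Copy every file under `source_dir` into `target_dir`, preserving the
    relative layout, and return the list of files written."""
    written = []
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        dest_root = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(dest_root, exist_ok=True)
        for name in sorted(files):
            dest = os.path.join(dest_root, name)
            shutil.copy2(os.path.join(root, name), dest)
            written.append(dest)
    return written
```

The caller, not the package, decides on `target_dir`, which keeps the choice of location with whoever administers the system.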

On the other hand, I've been puzzling over how to handle legitimate 
post-install features.  On Windows, both wx and pywin32 have a real 
need to do some actual "install" operations.  Some is just copying 
files, but pywin32 also has to do some registry stuff.  I don't know 
how to allow just what's sensible, without opening up a huge can of 
worms, though.

Proposals welcome.


