[Distutils] [Python-Dev] Capsule Summary of Some Packaging/Deployment Technology Concerns
Phillip J. Eby
pje at telecommunity.com
Tue Mar 18 01:37:30 CET 2008
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
>I was in a Packaging BoF yesterday and, although not very relevant to the
>packager bootstrap thread, Guido has asked me to post some of the concerns.
>
>The BoF drew about 15 people, many of whom were packagers for Red Hat, Ubuntu
>and such. Everyone had strong expressions of frustration with the status quo
>and most had tried to resolve their issues but had their patches rejected. I
>am not taking either side and whether those rejections were justified I cannot
>say, but the general feeling of their concerns intentionally not being
>addressed isn't healthy. Several had abandoned setuptools, deeming it a
>failed solution and others called for a fork.
>
>To start, I am not a leader of the group nor do I claim I accurately captured
>and expressed all their concerns. I apologize to those in the BoF for any
>misrepresentations.
I'm actually happy to hear that there's this much energy available --
hopefully some of it can be harnessed towards positive solutions.
When I began developing setuptools, I often asked for the input of
packagers, developers, etc., through the distutils-sig... and was
met with overwhelming silence. So the fact that there is now a group
of people who are ready to work for some solutions seems like a
positive change, to me.
It's hard to make design decisions regarding itches you don't
personally have, and which other people won't help
scratch. Unfortunately, a lot of the proposals from packaging system
people have been of the form of, "fix this for us by breaking things
for other people". Not all of them, though. Many have been very
helpful, contributing troubleshooting help and good patches.
That some of those good patches took nearly a year to get into
setuptools (some from Fedora just got into 0.6c8 that were sent to me
almost a year ago) is because I'm the only person reviewing
setuptools patches, and I've spent only a few days in the last year
doing focused development work on setuptools (as opposed to answering
questions about it on the SIG).
It's never a good thing when people's patches sit around, regardless
of where they come from. But that's not the same thing as
*rejecting* the patches.
>1. Many felt the existing dependency resolver was not correct. They wanted a
> full tree traversal resulting in an intersection of all restrictions,
> instead of the first-acceptable-solution approach taken now, which can
> result in top-level dependencies not being enforced upon lower levels. The
> latter is faster, however. One solution would be to make the resolver
> pluggable.
Patches welcome, on both counts. Personally, Bob and I originally
wanted a full-tree intersection, too, but it turned out to be hairier
to implement than it seems at first. My guess is that none of the
people who want it have actually tried to implement it without a
factorial or exponential O(). But that doesn't mean I'll be unhappy
if somebody succeeds. :)
Intuitively, it seems easy, just gather the requirements and
intersect. In practice, different versions of a package may have
different dependencies, so the intersection is not nearly as simple
as that. We ended up just going for a depth-first version of the
current algorithm (switched to breadth-first later, after field tests
showed some problems with depth-first), being greedy by testing
latest-version-first, on the assumption that more recent versions
would be likely to have the most-restrictive version requirements.
In other words, we attempt to achieve heuristically what's being
proposed to do algorithmically. And my guess is that whatever cases
the heuristic is failing at would probably not be helped by an
algorithmic approach either. But I would welcome some actual data,
either way.
Again, though, patches are welcome. :) (Specifically, for the
trunk; I don't see a resolver overhaul as being suitable for the 0.6
stable branch.)
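To make the trade-off concrete, here is a toy sketch of a greedy, breadth-first, latest-version-first resolver. It is not setuptools' actual code: the INDEX layout, the resolve_greedy name and the local VersionConflict class are all invented for illustration, and requirements are modelled simply as sets of allowed versions.

```python
from collections import deque

# Hypothetical toy index: name -> {version: [(dependency name, allowed versions)]}
INDEX = {
    "app": {(1, 0): [("lib", {(1, 0), (2, 0)}), ("plugin", {(1, 0)})]},
    "lib": {(1, 0): [], (2, 0): []},
    "plugin": {(1, 0): [("lib", {(1, 0)})]},
}


class VersionConflict(Exception):
    """Raised when an already-chosen version violates a later requirement."""


def resolve_greedy(root, index):
    """Breadth-first walk that commits to the newest acceptable version."""
    chosen = {}
    queue = deque([root])
    while queue:
        name, allowed = queue.popleft()
        if name in chosen:
            # Greedy: an earlier choice is never revisited, only checked.
            if chosen[name] not in allowed:
                raise VersionConflict(f"{name} {chosen[name]} not in {sorted(allowed)}")
            continue
        candidates = sorted((v for v in index[name] if v in allowed), reverse=True)
        if not candidates:
            raise LookupError(f"no version of {name} satisfies {sorted(allowed)}")
        chosen[name] = candidates[0]
        # Dependencies depend on the version picked, which is what makes a
        # plain "gather everything and intersect" pass harder than it looks.
        queue.extend(index[name][candidates[0]])
    return chosen
```

With this index, resolving ("app", {(1, 0)}) commits to lib 2.0 before plugin's stricter requirement is seen, so it ends in a VersionConflict even though lib 1.0 would satisfy everyone. That is the kind of case a full intersection is meant to catch.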
>2. People want a solution for the handling of documentation. The distutils
> module has had commented out sections related to this for several years.
As with so many other things, this gets tossed around the
distutils-sig every now and then. A couple of times I've thrown out
some options for how this might be done, but then the conversation
peters out around the time anybody would have to actually do some
work on it. (Me included, since I don't have an itch that needs
scratching in this area.)
In particular, if somebody wants to come up with a metadata standard
for including documentation in eggs, we've got a boatload of hooks by
which it could be done. Nothing's stopping anybody from proposing a
standard and building a tool, here. (e.g. using the setuptools
command hook, .egg-info writer hook, etc.)
>3. A more flexible internal handling of the different types of files is needed.
> Currently the code, data, lib, etc. files are aggregated at build time and
> people would like them to be kept separate until install/packaging time.
I don't know what this means, exactly.
> They also want greater flexibility in the kinds of files identified for
> packaging. There is currently a single plugin entrypoint for file_finding,
> so people have resorted to abusing the setuptools function find_packages()
> again and again with different include/exclude args. A solution is to
> expand the set of entrypoints into finer grained categories. They also
> want a way to expand the set of categories rather than a fixed set, which
> can be easily done with entrypoint groups and names.
>
> People also want a greater variety of file_finders to be included with
> setuptools. Instead of just CVS and SVN, they want it to comprehend
> Mercurial, Bazaar, Git and so forth.
Did you point them to the Cheeseshop? There are plugins already
available for all the systems you mentioned, plus Darcs and
Monotone. If you mean "included" as in "bundled", this doesn't make
a whole lot of sense to me. I'd think that if you're using
setuptools as a developer (the only reason you need the file finders,
since source distributions include a prebuilt manifest), you'd not
have a problem saying "easy_install setuptools-git" or adding a
"setup_requires='setuptools-git'" line to your setup.py. (Although
the latter would only be needed for *development*, not deployment.)
If you mean support for *installing* from these tools, I really
wanted to add a pluggable download/retrieval mechanism for
easy_install in 0.7, and would still love to see it happen.
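For anyone wondering what such a file-finder plugin involves, here is a minimal sketch. The "setuptools.file_finders" entry point group is the real hook; the module name my_finder and the walking logic are assumptions made up for the example (a real plugin would ask the version control tool which files it tracks).

```python
import os


def find_files(dirname=""):
    """Yield file paths under dirname, skipping hidden files and directories.

    File finders are called with a directory name and return an iterable
    of paths; this stand-in simply walks the tree.
    """
    for root, dirs, files in os.walk(dirname or os.curdir):
        # Prune hidden directories (.git, .hg, .svn, ...) in place.
        dirs[:] = sorted(d for d in dirs if not d.startswith("."))
        for name in sorted(files):
            if not name.startswith("."):
                yield os.path.join(root, name)


# In the plugin's own setup.py, registration would look roughly like this
# (hypothetical project and module names):
#
#   entry_points={
#       "setuptools.file_finders": ["walk = my_finder:find_files"],
#   }
```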
>4. They want an uninstall setuptools command. Adding one to remove a specific
> egg isn't difficult but correctly removing those dependencies that came in
> with that egg, without breaking later installs, can be tricky.
Patches, once again, are welcome. :) (Btw, "setup.py develop"
supports uninstallation, although it doesn't do a blessed thing about
dependencies.)
By the way, there are also third-party tools on the Cheeseshop that
show egg dependency graphs (e.g. tl.eggdeps) or dump out information
about installed eggs (e.g. "yolk").
> This is complicated because there isn't a single global package namespace
> to manage, when you factor in virtualenv and buildout sandboxes and
> per-user package areas. This differs from how RPMs and .debs are viewed.
Yep. I really wanted, for 0.7, to study virtualenv and buildout and
try to design a more general mechanism for managing things with the
vaporware I've been calling "nest". Unfortunately, I've lost both
the will and the budget for working on that any time soon.
>5. There was concern over the .pth mechanism used by setuptools re activation.
> First, there is a (perceived) performance issue with increasingly adding
> every ZIP file explicitly onto sys.path. This may or may not be a red
> herring.
It is. My tests a few years back, when MAL first brought this up on
the distutils-sig, showed the startup cost to be positively
minuscule, if actual zipfiles are used. At the same time, I myself
have come to the conclusion that if I had it to do over, I would use
something more like the .egg-info egg style for general package
installation, and add an installation manifest to it. If I ever
write "nest", it will use such, with the ability to also support .egg
files and directories.
.egg files were created for extensible application platforms like
Chandler and Zope and Trac and so on. Plugins usually need
libraries, though, so the rest got added on because it was useful,
and then the whole thing escaped its niche like a foreign organism
added to an ecosystem with no natural predators. :)
> The other is the use of a single .pth file to control the list of activated
> packages. Those who produce distributions would prefer a magic directory
> into which links to distributions could be dropped, similar to the current
> best practices for Linux, with /etc/conf.d/, /etc/profile.d/,
> /etc/xinetd.d/ and so forth.
site-packages is that directory, and has been since long before
setuptools. Just drop uniquely-named .pth files there, and you're good to go.
>6. There is a need for more extensibility hooks. People want places to plug
> in special handling. For example:
>
> a) setuptools has a --record option to capture the list of files installed
> for use by subsequent packaging tools. Some want that list to be
> available to a setuptools plugin.
>
> b) some want hooks for post-build/post-install actions, instead of the
> current approach of writing a custom build class that handles it all.
Patches welcome!
>7. Many wanted the ability to install files anywhere in the install tree and
> not just under the Python package. Under distutils this was possible but
> it was removed in setuptools for security reasons.
It wasn't security, it was manageability. Egg-based installation
means containment (analogous to GNU stow), and therefore portability
and disposability of plugins. (Which again is what eggs were really
developed for in the first place.)
> Custom code can still
> be written to do this explicitly but this is not popular.
No kidding. :) Current best practice is to include a script or
module in the package that can install other files to a designated
location. Personally, though, I tend to view applications and
libraries that target specific install locations to be overreaching
their bounds, and stepping into sysadmin territory. Give me the
tools to install the data, don't just dump it somewhere on my system
where *you* think it should go, in other words.
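As an illustration of that "give me the tools" style, a package could ship a small helper along these lines. This is only a sketch: the function name and directory layout are hypothetical, and the point is that the person running it chooses the target directory.

```python
import os
import shutil


def install_data(source_dir, target_dir):
    """Copy every file under source_dir into target_dir.

    Returns the list of paths written, so the caller (or a packaging
    tool) knows exactly what was placed where.
    """
    written = []
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        dest_root = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(dest_root, exist_ok=True)
        for name in sorted(files):
            dest = os.path.join(dest_root, name)
            shutil.copy2(os.path.join(root, name), dest)
            written.append(dest)
    return written
```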
On the other hand, I've been puzzling over how to handle legitimate
post-install features. On Windows, both wx and pywin32 have a real
need to do some actual "install" operations. Some is just copying
files, but pywin32 also has to do some registry stuff. I don't know
how to allow just what's sensible, without opening up a huge can of
worms, though.
Proposals welcome.