
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
I was in a Packaging BoF yesterday and, although not very relevant to the packager bootstrap thread, Guido has asked me to post some of the concerns.
The BoF drew about 15 people, many of whom were packagers for Red Hat, Ubuntu and such. Everyone had strong expressions of frustration with the status quo and most had tried to resolve their issues but had their patches rejected. I am not taking either side and whether those rejections were justified I cannot say, but the general feeling of their concerns intentionally not being addressed isn't healthy. Several had abandoned setuptools, deeming it a failed solution and others called for a fork.
To start, I am not a leader of the group nor do I claim I accurately captured and expressed all their concerns. I apologize to those in the BoF for any misrepresentations.
I'm actually happy to hear that there's this much energy available -- hopefully some of it can be harnessed towards positive solutions. When I began developing setuptools, I often asked for the input of packagers, developers, etc., through the distutils-sig... and was met with overwhelming silence. So the fact that there is now a group of people who are ready to work for some solutions seems like a positive change, to me. It's hard to make design decisions regarding itches you don't personally have, and which other people won't help scratch. Unfortunately, a lot of the proposals from packaging system people have been of the form "fix this for us by breaking things for other people". Not all of them, though. Many have been very helpful, contributing troubleshooting help and good patches. That some of those good patches took nearly a year to get into setuptools (some from Fedora that were sent to me almost a year ago just got into 0.6c8) is because I'm the only person reviewing setuptools patches, and I've spent only a few days in the last year doing focused development work on setuptools (as opposed to answering questions about it on the SIG). It's never a good thing when people's patches sit around, regardless of where they come from. But that's not the same thing as *rejecting* the patches.
1. Many felt the existing dependency resolver was not correct. They wanted a full tree traversal resulting in an intersection of all restrictions, instead of the first-acceptable-solution approach taken now, which can result in top-level dependencies not being enforced upon lower levels. The latter is faster, however. One solution would be to make the resolver pluggable.
Patches welcome, on both counts. Personally, Bob and I originally wanted a full-tree intersection, too, but it turned out to be hairier to implement than it seems at first. My guess is that none of the people who want it have actually tried to implement it without a factorial or exponential O(). But that doesn't mean I'll be unhappy if somebody succeeds. :) Intuitively, it seems easy: just gather the requirements and intersect. In practice, different versions of a package may have different dependencies, so the intersection is not nearly as simple as that. We ended up just going for a depth-first version of the current algorithm (switched to breadth-first later, after field tests showed some problems with that), being greedy by testing latest-version-first, on the assumption that more recent versions would be likely to have the most-restrictive version requirements. In other words, we attempt to achieve heuristically what's being proposed to do algorithmically. And my guess is that whatever cases the heuristic is failing at would probably not be helped by an algorithmic approach either. But I would welcome some actual data, either way. Again, though, patches are welcome. :) (Specifically, for the trunk; I don't see a resolver overhaul as being suitable for the 0.6 stable branch.)
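To make the trade-off concrete, here is a toy sketch, not the setuptools resolver. It assumes a made-up in-memory index with integer versions and inclusive (min, max) ranges; `INDEX`, `resolve_greedy` and `resolve_backtracking` are all hypothetical names. The first function is breadth-first and greedy, pinning the latest acceptable version it sees. The second searches exhaustively and backtracks on conflict, which is where the exponential worst case comes from.

```python
from collections import deque

# Hypothetical toy index: name -> {version: [(dep_name, min_ver, max_ver), ...]}
INDEX = {
    "app": {1: [("lib", 1, 3), ("plugin", 1, 1)]},
    "plugin": {1: [("lib", 1, 2)]},
    "lib": {1: [], 2: [], 3: []},
}


def resolve_greedy(root, index):
    """Breadth-first, latest-version-first: pin the first acceptable version."""
    chosen = {}
    queue = deque([root])
    while queue:
        name, lo, hi = queue.popleft()
        if name in chosen:
            # An earlier, looser requirement already pinned this package.
            if not lo <= chosen[name] <= hi:
                raise ValueError(
                    f"conflict: {name} {chosen[name]} pinned, need {lo}..{hi}"
                )
            continue
        candidates = [v for v in sorted(index[name], reverse=True) if lo <= v <= hi]
        if not candidates:
            raise ValueError(f"no version of {name} in {lo}..{hi}")
        chosen[name] = candidates[0]
        queue.extend(index[name][candidates[0]])
    return chosen


def resolve_backtracking(pending, index, chosen=None):
    """Exhaustive search: try each acceptable version, backtrack on conflict."""
    chosen = dict(chosen or {})
    if not pending:
        return chosen
    (name, lo, hi), rest = pending[0], pending[1:]
    if name in chosen:
        if lo <= chosen[name] <= hi:
            return resolve_backtracking(rest, index, chosen)
        return None  # dead end; the caller tries its next candidate
    for version in sorted(index[name], reverse=True):
        if lo <= version <= hi:
            trial = dict(chosen)
            trial[name] = version
            # Each version may bring its own, different dependency list.
            result = resolve_backtracking(
                rest + list(index[name][version]), index, trial
            )
            if result is not None:
                return result
    return None
```

With this index, `resolve_greedy(("app", 1, 1), INDEX)` pins lib 3 before it sees plugin's tighter range and raises `ValueError`, while `resolve_backtracking([("app", 1, 1)], INDEX)` settles on lib 2. Whether real-world failures look like this toy case is exactly the data being asked for.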
2. People want a solution for the handling of documentation. The distutils module has had commented out sections related to this for several years.
As with so many other things, this gets tossed around the distutils-sig every now and then. A couple of times I've thrown out some options for how this might be done, but then the conversation peters out around the time anybody would have to actually do some work on it. (Me included, since I don't have an itch that needs scratching in this area.) In particular, if somebody wants to come up with a metadata standard for including documentation in eggs, we've got a boatload of hooks by which it could be done. Nothing's stopping anybody from proposing a standard and building a tool, here. (e.g. using the setuptools command hook, .egg-info writer hook, etc.)
3. A more flexible internal handling of the different types of files is needed. Currently the code, data, lib, etc. files are aggregated at build time and people would like them to be kept separate until install/packaging time.
I don't know what this means, exactly.
They also want greater flexibility in the kinds of files identified for packaging. There is currently a single plugin entrypoint for file_finding, so people have resorted to abusing the setuptools function find_packages() again and again with different include/exclude args. A solution is to expand the set of entrypoints into finer grained categories. They also want a way to expand the set of categories rather than a fixed set, which can be easily done with entrypoint groups and names.
People also want a greater variety of file_finders to be included with setuptools. Instead of just CVS and SVN, they want it to comprehend Mercurial, Bazaar, Git and so forth.
Did you point them to the Cheeseshop? There are plugins already available for all the systems you mentioned, plus Darcs and Monotone. If you mean "included" as in "bundled", this doesn't make a whole lot of sense to me. I'd think that if you're using setuptools as a developer (the only reason you need the file finders, since source distributions include a prebuilt manifest), you'd not have a problem saying "easy_install setuptools-git" or adding a "setup_requires='setuptools-git'" line to your setup.py. (Although the latter would only be needed for *development*, not deployment.) If you mean support for *installing* from these tools, I really wanted to add a pluggable download/retrieval mechanism for easy_install in 0.7, and would still love to see it happen.
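For anyone who wants to write such a plugin, a minimal sketch follows. A file finder is a callable that takes a directory name and yields file paths, registered under the `setuptools.file_finders` entry point group. Everything named here (`find_tracked_files`, `my_finder_plugin`, `my-finder-plugin`) is hypothetical, and a real plugin would ask its revision control system which files are tracked rather than walking the tree.

```python
import os


def find_tracked_files(dirname=""):
    """Hypothetical finder: yield every file under dirname, skipping dot-directories.

    A real plugin would query the version control tool instead of walking.
    """
    for root, dirs, files in os.walk(dirname or os.curdir):
        # Prune directories such as .hg or .git in place so os.walk skips them.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            yield os.path.join(root, name)


# Keyword arguments the plugin's own setup.py might pass to setuptools.setup().
# The project and module names are made up for illustration.
PLUGIN_SETUP_KWARGS = {
    "name": "my-finder-plugin",
    "py_modules": ["my_finder_plugin"],
    "entry_points": {
        "setuptools.file_finders": [
            "my_finder = my_finder_plugin:find_tracked_files",
        ],
    },
}

# A project that relies on the plugin at development time could then declare it.
PROJECT_SETUP_KWARGS = {
    "name": "my-project",  # hypothetical
    "setup_requires": ["my-finder-plugin"],
}
```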
4. They want an uninstall setuptools command. Adding one to remove a specific egg isn't difficult, but correctly removing the dependencies that came in with that egg without breaking later installs can be tricky.
Patches, once again, are welcome. :) (Btw, "setup.py develop" supports uninstallation, although it doesn't do a blessed thing about dependencies.) By the way, there are also third-party tools on the Cheeseshop that show egg dependency graphs (e.g. tl.eggdeps) or dump out information about installed eggs (e.g. "yolk").
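To show why the dependency part is the tricky bit, here is a rough sketch over a plain dictionary rather than real egg metadata. The function name and data layout are assumptions for illustration only. It removes the target plus any of its dependencies that no remaining package can still reach.

```python
def removable_after_uninstall(target, requires, installed):
    """Return the target plus dependencies that no remaining package still needs.

    requires:  hypothetical mapping of name -> list of direct dependency names
    installed: iterable of installed package names
    """

    def closure(roots):
        # Everything reachable from roots through the dependency graph.
        seen = set()
        stack = list(roots)
        while stack:
            name = stack.pop()
            if name in seen:
                continue
            seen.add(name)
            stack.extend(requires.get(name, ()))
        return seen

    candidates = closure([target])
    # Packages outside the target's closure stay installed, along with
    # everything they depend on.
    keepers = [name for name in installed if name not in candidates]
    still_needed = closure(keepers)
    return (candidates - still_needed) & set(installed)
```

For example, with `requires = {"app": ["lib", "util"], "other": ["util"]}` and all four packages installed, uninstalling "app" may take "lib" with it but must leave "util" for "other". The sketch cannot tell whether the user installed "lib" deliberately; that would need a record of explicit installs, which this toy model does not have.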
This is complicated because there isn't a single global package namespace to manage, when you factor in virtualenv and buildout sandboxes and per-user package areas. This differs from how RPMs and .debs are viewed.
Yep. I really wanted, for 0.7, to study virtualenv and buildout and try to design a more general mechanism for managing things with the vaporware I've been calling "nest". Unfortunately, I've lost both the will and the budget for working on that any time soon.
5. There was concern over the .pth mechanism used by setuptools for activation. First, there is a (perceived) performance issue with explicitly adding every ZIP file onto an ever-growing sys.path. This may or may not be a red herring.
It is. My tests a few years back, when MAL first brought this up on the distutils-sig, showed the startup cost to be positively minuscule, if actual zipfiles are used. At the same time, I myself have come to the conclusion that if I had it to do over, I would use something more like the .egg-info egg style for general package installation, and add an installation manifest to it. If I ever write "nest", it will use such, with the ability to also support .egg files and directories. .egg files were created for extensible application platforms like Chandler and Zope and Trac and so on. Plugins usually need libraries, though, so the rest got added on because it was useful, and then the whole thing escaped its niche like a foreign organism added to an ecosystem with no natural predators. :)
The other is the use of a single .pth file to control the list of activated packages. Those who produce distributions would prefer a magic directory into which links to distributions could be dropped, similar to the current best practices for Linux, with /etc/conf.d/, /etc/profile.d/, /etc/xinetd.d/ and so forth.
site-packages is that directory, and has been since long before setuptools. Just drop uniquely-named .pth files there, and you're good to go.
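A small demonstration of that behavior, using a throwaway directory in place of the real site-packages: `site.addsitedir()` is the standard-library call that processes every .pth file in a directory, as Python does for site-packages at startup. The names `dist_a` and `dist_b` are made up.

```python
import os
import site
import sys
import tempfile

# A throwaway directory standing in for site-packages (demo assumption only).
site_dir = tempfile.mkdtemp()

# Two hypothetical distributions, each activated by its own uniquely named .pth file.
for dist in ("dist_a", "dist_b"):
    os.mkdir(os.path.join(site_dir, dist))
    with open(os.path.join(site_dir, dist + ".pth"), "w") as fh:
        # A relative entry is resolved against the directory holding the .pth file.
        fh.write(dist + "\n")

# Process the directory the way Python processes site-packages at startup.
site.addsitedir(site_dir)

# Entries that the .pth files contributed to sys.path.
activated = [p for p in sys.path if os.path.basename(p) in ("dist_a", "dist_b")]
```

Removing a distribution's .pth file deactivates it for future interpreter runs without touching any shared file.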
6. There is a need for more extensibility hooks. People want places to plug in special handling. For example:
a) setuptools has a --record option to capture the list of files installed for use by subsequent packaging tools. Some want that list to be available to a setuptools plugin.
b) some want hooks for post-build/post-install actions, instead of the current approach of writing a custom build class that handles it all.
Patches welcome!
7. Many wanted the ability to install files anywhere in the install tree and not just under the Python package. Under distutils this was possible but it was removed in setuptools for security reasons.
It wasn't security, it was manageability. Egg-based installation means containment (analogous to GNU stow), and therefore portability and disposability of plugins. (Which again is what eggs were really developed for in the first place.)
Custom code can still be written to do this explicitly but this is not popular.
No kidding. :) Current best practice is to include a script or module in the package that can install other files to a designated location. Personally, though, I tend to view applications and libraries that target specific install locations to be overreaching their bounds, and stepping into sysadmin territory. Give me the tools to install the data, don't just dump it somewhere on my system where *you* think it should go, in other words. On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though. Proposals welcome.
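A minimal sketch of the "ship a module that installs the data" practice, assuming the package bundles its extra files in a directory of its own. The function name and arguments are hypothetical; the point is that the caller, not the package, picks the destination.

```python
import os
import shutil


def install_data(source_dir, target_dir):
    """Copy bundled data files to a location chosen by whoever runs this.

    source_dir: directory inside the package holding the files (assumed layout)
    target_dir: destination picked by the administrator, never hard-coded
    """
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        # Recreate the same relative layout under the target directory.
        rel = os.path.relpath(root, source_dir)
        dest_root = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(dest_root, exist_ok=True)
        for name in files:
            dest = os.path.join(dest_root, name)
            shutil.copy2(os.path.join(root, name), dest)
            copied.append(dest)
    # Returning the list lets a packaging tool record what was written.
    return copied
```

A package could expose this through a small command-line script, so the sysadmin runs it explicitly with the target path of their choosing.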