Capsule Summary of Some Packaging/Deployment Technology Concerns

I was in a Packaging BoF yesterday and, although not very relevant to the packager bootstrap thread, Guido has asked me to post some of the concerns.

The BoF drew about 15 people, many of whom were packagers for Red Hat, Ubuntu and such. Everyone had strong expressions of frustration with the status quo and most had tried to resolve their issues but had their patches rejected. I am not taking either side and whether those rejections were justified I cannot say, but the general feeling of their concerns intentionally not being addressed isn't healthy. Several had abandoned setuptools, deeming it a failed solution and others called for a fork.

To start, I am not a leader of the group nor do I claim I accurately captured and expressed all their concerns. I apologize to those in the BoF for any misrepresentations.

1. Many felt the existing dependency resolver was not correct. They wanted a full tree traversal resulting in an intersection of all restrictions, instead of the first-acceptable-solution approach taken now, which can result in top-level dependencies not being enforced upon lower levels. The latter is faster however. One solution would be to make the resolver pluggable.

2. People want a solution for the handling of documentation. The distutils module has had commented out sections related to this for several years.

3. A more flexible internal handling of the different types of files is needed. Currently the code, data, lib, etc. files are aggregated at build time and people would like them to be kept separate until install/packaging time.

   They also want greater flexibility in the kinds of files identified for packaging. There is currently a single plugin entrypoint for file_finding, so people have resorted to abusing the setuptools function find_packages() again and again with different include/exclude args. A solution is to expand the set of entrypoints into finer grained categories. They also want a way to expand the set of categories rather than a fixed set, which can be easily done with entrypoint groups and names.

   People also want a greater variety of file_finders to be included with setuptools. Instead of just CVS and SVN, they want it to comprehend Mercurial, Bazaar, Git and so forth.

4. They want an uninstall setuptools command. Adding one to remove a specific egg isn't difficult but correctly removing those dependencies that came in with that egg, without breaking later installs, can be tricky.

   This is complicated because there isn't a single global package namespace to manage, when you factor in virtualenv and buildout sandboxes and per-user package areas. This differs from how RPMs and .debs are viewed.

5. There was concern over the .pth mechanism used by setuptools re activation. First, there is a (perceived) performance issue with increasingly adding every ZIP file explicitly onto sys.path. This may or may not be a red herring.

   The other is the use of a single .pth file to control the list of activated packages. Those who produce distributions would prefer a magic directory into which links to distributions could be dropped, similar to the current best practices for Linux, with /etc/conf.d/, /etc/profile.d/, /etc/xinetd.d/ and so forth.

6. There is a need for more extensibility hooks. People want places to plug in special handling. For example:

   a) setuptools has a --record option to capture the list of files installed for use by subsequent packaging tools. Some want that list to be available to a setuptools plugin.

   b) some want hooks for post-build/post-install actions, instead of the current approach of writing a custom build class that handles it all.

7. Many wanted the ability to install files anywhere in the install tree and not just under the Python package. Under distutils this was possible but it was removed in setuptools for security reasons. Custom code can still be written to do this explicitly but this is not popular. Neither setuptools nor distutils has the ability to rename files at install time.

A fair question is whether it is the job of setuptools (or any Python packaging solution) to cover all these bases. The risk of not doing the job is that some of those in attendance were rolling their own solutions, which do not see packages installed using other means and so do not play well with them.

Distutils has intentionally tried to -not- be a general replacement packaging solution, with its support of the "bdist" command for various platform-specific distribution formats. We should continue not trying to replace platform-specific packaging technologies but perhaps improve our control of their creation.

As mentioned, some of these concerns can be resolved by adding customization-pressure-release entrypoints to setuptools, and some can be handled with much better documentation of use cases and what to do. And some of it is confusion over packaging libraries versus applications, where setuptools focuses on the former and zc.buildout focuses on the latter. But buildout is very young, maintains isolation from the system Python and was not known to many of the packaging BoF attendees.

Some of this may seem down on eggs, but I think they are really cool and would like to see them adopted as the standard for packaging Python software. There are rough spots on setuptools and buildout that would benefit from opening up the process and bringing in more developers, and communicating what they are and more importantly, what they are not.

I believe the lack of a coherent packaging and deployment story for Python is hurting its uptake in many sectors and would like to work with others to strengthen this area of Python.

-Jeff

At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
I was in a Packaging BoF yesterday and, although not very relevant to the packager bootstrap thread, Guido has asked me to post some of the concerns.
The BoF drew about 15 people, many of whom were packagers for Red Hat, Ubuntu and such. Everyone had strong expressions of frustration with the status quo and most had tried to resolve their issues but had their patches rejected. I am not taking either side and whether those rejections were justified I cannot say, but the general feeling of their concerns intentionally not being addressed isn't healthy. Several had abandoned setuptools, deeming it a failed solution and others called for a fork.
To start, I am not a leader of the group nor do I claim I accurately captured and expressed all their concerns. I apologize to those in the BoF for any misrepresentations.
I'm actually happy to hear that there's this much energy available -- hopefully some of it can be harnessed towards positive solutions. When I began developing setuptools, I often asked for the input of packagers, developers, etc., through the distutils-sig... and was met with overwhelming silence. So the fact that there is now a group of people who are ready to work for some solutions seems like a positive change, to me. It's hard to make design decisions regarding itches you don't personally have, and which other people won't help scratch. Unfortunately, a lot of the proposals from packaging system people have been of the form of, "fix this for us by breaking things for other people". Not all of them, though. Many have been very helpful, contributing troubleshooting help and good patches. That some of those good patches took nearly a year to get into setuptools (some from Fedora just got into 0.6c8 that were sent to me almost a year ago) is because I'm the only person reviewing setuptools patches, and I've spent only a few days in the last year doing focused development work on setuptools (as opposed to answering questions about it on the SIG). It's never a good thing when people's patches sit around, regardless of where they come from. But that's not the same thing as *rejecting* the patches.
1. Many felt the existing dependency resolver was not correct. They wanted a full tree traversal resulting in an intersection of all restrictions, instead of the first-acceptable-solution approach taken now, which can result in top-level dependencies not being enforced upon lower levels. The latter is faster however. One solution would be to make the resolver pluggable.
Patches welcome, on both counts. Personally, Bob and I originally wanted a full-tree intersection, too, but it turned out to be hairier to implement than it seems at first. My guess is that none of the people who want it, have actually tried to implement it without a factorial or exponential O(). But that doesn't mean I'll be unhappy if somebody succeeds. :) Intuitively, it seems easy, just gather the requirements and intersect. In practice, different versions of a package may have different dependencies, so the intersection is not nearly as simple as that. We ended up just going for a depth-first version of the current algorithm (switched to breadth-first later, after field tests showed some problems with that), being greedy by testing latest-version-first, on the assumption that more recent versions would be likely to have the most-restrictive version requirements. In other words, we attempt to achieve heuristically what's being proposed to do algorithmically. And my guess is that whatever cases the heuristic is failing at, would probably not be helped by an algorithmic approach either. But I would welcome some actual data, either way. Again, though, patches are welcome. :) (Specifically, for the trunk; I don't see a resolver overhaul as being suitable for the 0.6 stable branch.)
2. People want a solution for the handling of documentation. The distutils module has had commented out sections related to this for several years.
As with so many other things, this gets tossed around the distutils-sig every now and then. A couple of times I've thrown out some options for how this might be done, but then the conversation peters out around the time anybody would have to actually do some work on it. (Me included, since I don't have an itch that needs scratching in this area.) In particular, if somebody wants to come up with a metadata standard for including documentation in eggs, we've got a boatload of hooks by which it could be done. Nothing's stopping anybody from proposing a standard and building a tool, here. (e.g. using the setuptools command hook, .egg-info writer hook, etc.)
3. A more flexible internal handling of the different types of files is needed. Currently the code, data, lib, etc. files are aggregated at build time and people would like them to be kept separate until install/packaging time.
I don't know what this means, exactly.
They also want greater flexibility in the kinds of files identified for packaging. There is currently a single plugin entrypoint for file_finding, so people have resorted to abusing the setuptools function find_packages() again and again with different include/exclude args. A solution is to expand the set of entrypoints into finer grained categories. They also want a way to expand the set of categories rather than a fixed set, which can be easily done with entrypoint groups and names.
People also want a greater variety of file_finders to be included with setuptools. Instead of just CVS and SVN, they want it to comprehend Mercurial, Bazaar, Git and so forth.
Did you point them to the Cheeseshop? There are plugins already available for all the systems you mentioned, plus Darcs and Monotone. If you mean "included" as in "bundled", this doesn't make a whole lot of sense to me. I'd think that if you're using setuptools as a developer (the only reason you need the file finders, since source distributions include a prebuilt manifest), you'd not have a problem saying "easy_install setuptools-git" or adding a "setup_requires='setuptools-git'" line to your setup.py. (Although the latter would only be needed for *development*, not deployment.) If you mean support for *installing* from these tools, I really wanted to add a pluggable download/retrieval mechanism for easy_install in 0.7, and would still love to see it happen.
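
For illustration, the file-finder plugins installed in a given environment can be listed by querying the entry point group they register under. This is a minimal sketch rather than setuptools' own code: it assumes the group name `setuptools.file_finders`, uses the modern `importlib.metadata` module instead of the pkg_resources API of the time, and `registered_file_finders` is a hypothetical helper name.

```python
from importlib.metadata import entry_points


def registered_file_finders(group="setuptools.file_finders"):
    """Return the sorted names of plugins registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selectable by group.
        found = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group.
        found = eps.get(group, [])
    return sorted(ep.name for ep in found)


# Lists whatever finders happen to be installed; may be empty.
print(registered_file_finders())
```

An empty result here is exactly the situation discussed in this thread: without the relevant plugin installed, files tracked by that version control system are not discovered.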
4. They want an uninstall setuptools command. Adding one to remove a specific egg isn't difficult but correctly removing those dependencies that came in with that egg, without breaking later installs can be tricky.
Patches, once again, are welcome. :) (Btw, "setup.py develop" supports uninstallation, although it doesn't do a blessed thing about dependencies.) By the way, there are also third-party tools on the Cheeseshop that show egg dependency graphs (e.g. tl.eggdeps) or dump out information about installed eggs (e.g. "yolk").
This is complicated because there isn't a single global package namespace to manage, when you factor in virtualenv and buildout sandboxes and per-user package areas. This differs from how RPMs and .debs are viewed.
Yep. I really wanted, for 0.7, to study virtualenv and buildout and try to design a more general mechanism for managing things with the vaporware I've been calling "nest". Unfortunately, I've lost both the will and the budget for working on that any time soon.
5. There was concern over the .pth mechanism used by setuptools re activation. First, there is a (perceived) performance issue with increasingly adding every ZIP file explicitly onto sys.path. This may or may not be a red herring.
It is. My tests a few years back, when MAL first brought this up on the distutils-sig, showed the startup cost to be positively minuscule, if actual zipfiles are used. At the same time, I myself have come to the conclusion that if I had it to do over, I would use something more like the .egg-info egg style for general package installation, and add an installation manifest to it. If I ever write "nest", it will use such, with the ability to also support .egg files and directories. .egg files were created for extensible application platforms like Chandler and Zope and Trac and so on. Plugins usually need libraries, though, so the rest got added on because it was useful, and then the whole thing escaped its niche like a foreign organism added to an ecosystem with no natural predators. :)
The other is the use of a single .pth file to control the list of activated packages. Those who produce distributions would prefer a magic directory into which links to distributions could be dropped, similar to the current best practices for Linux, with /etc/conf.d/, /etc/profile.d/, /etc/xinetd.d/ and so forth.
site-packages is that directory, and has been since long before setuptools. Just drop uniquely-named .pth files there, and you're good to go.
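
A minimal sketch of that mechanism, using the standard `site` module: a uniquely-named .pth file holding one path per line is dropped into a directory, and processing that directory as a site directory appends the listed paths to sys.path. The temporary directories stand in for a real site-packages and an unpacked distribution, and `activate_via_pth` and `myproject.pth` are hypothetical names chosen for the example.

```python
import os
import site
import sys
import tempfile


def activate_via_pth(site_dir, pth_name, dist_path):
    """Write a one-line .pth file into site_dir and let `site` process it."""
    pth_file = os.path.join(site_dir, pth_name)
    with open(pth_file, "w") as f:
        f.write(os.path.abspath(dist_path) + "\n")
    # addsitedir() adds site_dir to sys.path and processes every .pth
    # file in it, appending each listed path that exists on disk.
    site.addsitedir(site_dir)
    return pth_file


# Stand-ins for a real site-packages directory and an unpacked distribution.
demo_site = tempfile.mkdtemp(prefix="demo-site-")
demo_dist = tempfile.mkdtemp(prefix="demo-dist-")
activate_via_pth(demo_site, "myproject.pth", demo_dist)
print(os.path.abspath(demo_dist) in sys.path)
```

In a real installation the interpreter processes site-packages at startup, so a packaging tool would only need to write or remove its own .pth file.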
6. There is a need for more extensibility hooks. People want places to plug in special handling. For example:
a) setuptools has a --record option to capture the list of files installed for use by subsequent packaging tools. Some want that list to be available to a setuptools plugin.
b) some want hooks for post-build/post-install actions, instead of the current approach of writing a custom build class that handles it all.
Patches welcome!
7. Many wanted the ability to install files anywhere in the install tree and not just under the Python package. Under distutils this was possible but it was removed in setuptools for security reasons.
It wasn't security, it was manageability. Egg-based installation means containment, (analogous to GNU stow) and therefore portability and disposability of plugins. (Which again is what eggs were really developed for in the first place.)
Custom code can still be written to do this explicitly but this is not popular.
No kidding. :) Current best practice is to include a script or module in the package that can install other files to a designated location. Personally, though, I tend to view applications and libraries that target specific install locations to be overreaching their bounds, and stepping into sysadmin territory. Give me the tools to install the data, don't just dump it somewhere on my system where *you* think it should go, in other words. On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though. Proposals welcome.

On Mon, Mar 17, 2008 at 08:37:30PM -0400, Phillip J. Eby wrote:
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
People also want a greater variety of file_finders to be included with setuptools. Instead of just CVS and SVN, they want it to comprehend Mercurial, Bazaar, Git and so forth.
Did you point them to the Cheeseshop? There are plugins already available for all the systems you mentioned, plus Darcs and Monotone. If you mean "included" as in "bundled", this doesn't make a whole lot of sense to me.
It does to me. Think "batteries included". Running 'setup.py sdist/bdist_foo' should not silently produce an incorrect package with some files missing just because the person running it did not have some additional plugin installed.
I'd think that if you're using setuptools as a developer (the only reason you need the file finders, since source distributions include a prebuilt manifest), you'd not have a problem saying "easy_install setuptools-git" or adding a "setup_requires='setuptools-git'" line to your setup.py. (Although the latter would only be needed for *development*, not deployment.)
setup_requires looks like a solution, but it requires extra attention from the developers who write the setup.py. Writing a setup.py is already quite complicated -- I usually end up copying an existing one and modifying it.

Marius Gedminas
-- 
Most security experts REALLY believe in firewalls. They expect that, when they die, they arrive at the great firewall in the sky where Saint Peter is running a default policy of REJECT.
        -- Sander Plomp

Marius Gedminas wrote:
On Mon, Mar 17, 2008 at 08:37:30PM -0400, Phillip J. Eby wrote:
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
People also want a greater variety of file_finders to be included with setuptools. Instead of just CVS and SVN, they want it to comprehend Mercurial, Bazaar, Git and so forth.
Did you point them to the Cheeseshop? There are plugins already available for all the systems you mentioned, plus Darcs and Monotone. If you mean "included" as in "bundled", this doesn't make a whole lot of sense to me.
They knew there were plugins out there, of various quality and availability, but wanted them bundled. ;-) It's a pain to track them down. Perhaps it would help if the RPM format were broken out from setuptools, as the inclusion of some formats leads them to believe the set is just incomplete, not intentionally sparse.
I'd think that if you're using setuptools as a developer (the only reason you need the file finders, since source distributions include a prebuilt manifest), you'd not have a problem saying "easy_install setuptools-git" or adding a "setup_requires='setuptools-git'" line to your setup.py. (Although the latter would only be needed for *development*, not deployment.)
setup_requires looks like a solution, but it requires extra attention from the developers who write the setup.py. Writing a setup.py is already quite complicated -- I usually end up copying an existing one and modifying it.
As a compromise, making new formats easily available but not bundled, and not requiring special action within setup.py, setuptools could treat --formats=dpkg as an implicit setup_requires= and pull it from PyPI. And the --list-formats option could query PyPI for the possibilities, just as --list-classifiers does today. It would require a few standards in keywording/classifying those format eggs but we already need those standards for other projects, such as locating recipes for buildout and plugins for trac.

-Jeff

Phillip J. Eby wrote:
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
1. Many felt the existing dependency resolver was not correct. They wanted a full tree traversal resulting in an intersection of all restrictions, instead of the first-acceptable-solution approach taken now, which can result in top-level dependencies not being enforced upon lower levels. The latter is faster however. One solution would be to make the resolver pluggable.
Patches welcome, on both counts. Personally, Bob and I originally wanted a full-tree intersection, too, but it turned out to be hairier to implement than it seems at first. My guess is that none of the people who want it, have actually tried to implement it without a factorial or exponential O(). But that doesn't mean I'll be unhappy if somebody succeeds. :)
I think we'd make significant progress by just intersecting the dependencies we know about as we progress through the dependency tree. For example, if A requires B==2 and C==3, and if B requires C>=2,<=4, then at the time we install A we'd pick C==3 and also at the time we install B we'd pick C==3. As opposed to the current scheme that would choose C==4 for the latter case. This would allow dependent projects (think applications here) to better control the versions of the full set of libraries they use. Things would still fail (like they do now) if you ran across dependencies that had no intersection or if you encountered a new requirement after the target project was already installed. If you really wanted to do a full-tree intersection, it seems to me that the problem is detecting all the dependencies without having to spend significant time downloading/building in order to find them out. This could be solved by simply extending the cheeseshop interface to export the set of requirements outside of the egg / tarball / etc. We've done this for our own egg repository by extracting the appropriate meta-data files out of EGG-INFO and putting it into a separate file. This info is also useful for users as it gives them an idea of how much *new* stuff is going to be installed (a la yum, apt-get, etc.)
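
The A/B/C example can be sketched in a few lines. This is only an illustration of intersecting the restrictions seen so far, not the setuptools resolver: versions are modelled as tuples of integers, and `OPS`, `satisfies` and `pick_latest` are hypothetical names introduced here.

```python
import operator

# Comparison operators allowed in a version restriction.
OPS = {"==": operator.eq, "!=": operator.ne,
       ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}


def satisfies(version, constraints):
    """True if `version` meets every (op, bound) pair in `constraints`."""
    return all(OPS[op](version, bound) for op, bound in constraints)


def pick_latest(available, constraints):
    """Return the newest available version meeting all constraints, or None."""
    candidates = [v for v in available if satisfies(v, constraints)]
    return max(candidates) if candidates else None


# The example above: A requires C==3, and B requires C>=2,<=4.
available_c = [(2,), (3,), (4,)]
from_a = [("==", (3,))]
from_b = [(">=", (2,)), ("<=", (4,))]

# Looking at B's requirement alone picks the newest match.
b_alone = pick_latest(available_c, from_b)
# Carrying A's restriction along and intersecting keeps the pin.
intersected = pick_latest(available_c, from_a + from_b)
print(b_alone, intersected)
```

Restrictions with no common version make `pick_latest` return None, which corresponds to the "no intersection" failure mentioned above. The sketch does not address the harder problem that different versions of a package may carry different dependencies.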
In other words, we attempt to achieve heuristically what's being proposed to do algorithmically. And my guess is that whatever cases the heuristic is failing at, would probably not be helped by an algorithmic approach either. But I would welcome some actual data, either way.
With our ETS projects, we've run into problems with the current heuristic. Perhaps we just don't know how to make it work like we want? We have a set of projects that we want to be individually installable (to the extent that we limit cross-project dependencies) but we also want to make it easy to install the complete set. We use a meta-egg for the latter. Its purpose is only to specify the exact versions of each project that have been explicitly tested to work together -- you could almost think of it as a source control system tag. Whereas on the individual projects, we explicitly want to ensure that people get the latest possible release of each required API so the version requirements are wider here. This setup causes problems whenever we release new versions of projects because it seems easy_install ignores the meta-egg exact versions when it gets down into a project and comes across a wider cross-project dependency. We ended up having to give up on the ranges in the cross-project dependencies and synchronize them to the same values in the meta-egg dependencies. There are numerous side-effects of this that we don't like but we haven't found a way around it.
Again, though, patches are welcome. :) (Specifically, for the trunk; I don't see a resolver overhaul as being suitable for the 0.6 stable branch.)
We're planning to pursue this (for the above mentioned strategy) as soon as we work ourselves out of a bit of a backlog of other things to do.
2. People want a solution for the handling of documentation. The distutils module has had commented out sections related to this for several years.
As with so many other things, this gets tossed around the distutils-sig every now and then. A couple of times I've thrown out some options for how this might be done, but then the conversation peters out around the time anybody would have to actually do some work on it. (Me included, since I don't have an itch that needs scratching in this area.)
In particular, if somebody wants to come up with a metadata standard for including documentation in eggs, we've got a boatload of hooks by which it could be done. Nothing's stopping anybody from proposing a standard and building a tool, here. (e.g. using the setuptools command hook, .egg-info writer hook, etc.)
Enthought has started an effort (it's currently one of two things in our ETSProjectTools project at https://svn.enthought.com/svn/enthought/ETSProjectTools/trunk) and we're experimenting with our solution before proposing it as a patch. We'd love some more help if anyone wants to participate.
3. A more flexible internal handling of the different types of files is needed. Currently the code, data, lib, etc. files are aggregated at build time and people would like them to be kept separate until install/packaging time.
I don't know what this means, exactly.
A number of projects want to provide various types of files besides code in their distributable, and they'd like these to end up in standard locations for that type of file. Think documentation, sample data, web templates, configuration settings, etc. Each of these should be treated differently at installation time depending on platform. On *nix, docs should go in /usr/share/doc whereas we might need to create a C:\Python2.5\docs on Windows. With sample data and templates, you probably just want it accessible outside of the zipped egg so users can easily look at it, add to it, edit it, etc. Configuration settings should be installed with some defaults into a standard configuration directory like /etc on *nix, etc. Basically the issue is that it needs to be easier to include different sets of files into an egg for different actions to be taken during installation or packaging into an OS-specific distribution format.
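
One way to picture the request is a table from file category to a per-platform base directory, consulted at install or packaging time. The sketch below is purely hypothetical: it fills in only the locations named above, the per-project subdirectory is an added assumption, and `INSTALL_BASES` and `destination` are illustrative names, not part of setuptools or distutils.

```python
import ntpath
import posixpath

# Hypothetical category -> base directory table. Only the locations
# mentioned in the message above are filled in.
INSTALL_BASES = {
    "posix": {"docs": "/usr/share/doc", "config": "/etc"},
    "nt": {"docs": r"C:\Python2.5\docs"},
}


def destination(platform, category, project, filename):
    """Return the target path for a file, or None if no location is known."""
    base = INSTALL_BASES.get(platform, {}).get(category)
    if base is None:
        # Caller would fall back to keeping the file inside the egg.
        return None
    # Join with the path rules of the target platform, not the host.
    pathmod = ntpath if platform == "nt" else posixpath
    return pathmod.join(base, project, filename)


print(destination("posix", "docs", "foo", "README.txt"))
```

Because the category table is plain data, extending the set of categories would not require changes to the tool itself, which is the extensibility being asked for.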
The other is the use of a single .pth file to control the list of activated packages. Those who produce distributions would prefer a magic directory into which links to distributions could be dropped, similar to the current best practices for Linux, with /etc/conf.d/, /etc/profile.d/, /etc/xinetd.d/ and so forth.
site-packages is that directory, and has been since long before setuptools. Just drop uniquely-named .pth files there, and you're good to go.
But the docs for easy_install claim that the list of active eggs is maintained in easy-install.pth. Also, if I create my own .pth file, and the user tries to update my version to a new one, will the easy_install tool modify my .pth file to remove the mention of the old version from my sys.path and put the new version in the same .pth file? Or will it now be listed in both places? Or will it only be in easy-install.pth?
7. Many wanted the ability to install files anywhere in the install tree and not just under the Python package. Under distutils this was possible but it was removed in setuptools for security reasons.
It wasn't security, it was manageability. Egg-based installation means containment, (analogous to GNU stow) and therefore portability and disposability of plugins. (Which again is what eggs were really developed for in the first place.)
Yes, but as you've already pointed out, they've escaped into a larger ecosystem and this restriction is a severe limitation -- leading to significant frustration. Especially as projects evolve and want to do something more complex than simply install pure Python code. Here at Enthought, we use and ship a number of projects that have extensions and thus dynamic libraries that need to either be modified during installation to work from the user's installed location, or copied elsewhere on the system to avoid the need to modify (which we also can't do via an egg install) env variables, registries, etc. We'd also love to be able to ship end-user enterprise-scale applications via eggs so that bug fixes and updates don't require downloading a monolithic 100MB+ installer. But doing that requires the ability to update desktop icons, menus, etc. which we also can't do automatically via an egg. If you don't want the burden on setuptools to support, much less track, all these options, then perhaps it could just support automatic execution of a post-install script (and pre-uninstall script if uninstallation ever happens) that allows individual project developers to do what they need to do? Let the burden of describing how those things happen and how to uninstall/relocate/update them fall to the provider of the projects that do them. Also, IIUC, stow only tries to "contain" the hard files. It puts links in multiple standard locations (for man pages, executables, libraries, etc.) If setuptools supported these options, I don't think there'd be any discussion here except for things like "how do I extend the set of things the tool supports so that my foo-type files get linked into the standard /os/path/to/foo for the X os?"
Custom code can still be written to do this explicitly but this is not popular.
No kidding. :) Current best practice is to include a script or module in the package that can install other files to a designated location. Personally, though, I tend to view applications and libraries that target specific install locations to be overreaching their bounds, and stepping into sysadmin territory. Give me the tools to install the data, don't just dump it somewhere on my system where *you* think it should go, in other words.
I should have read ahead. This sounds close to what I've been describing except that this leads me to picture a script that prompts for install locations and allows the user to customize the destinations rather than one that assumes everything goes in a standard place. I'm all for this, and the continuation of the ability to install an egg into a user-environment vs. a system-environment. The only thing missing here is the ability for the installer to automatically run that script so that installation isn't a disjointed, two-step manual process that a user is prone to forget to complete. One of the features of Enthought's Enstaller extension to easy_install was that it looks for a post_install.py script in EGG-INFO and if one is found, runs it. I would think that getting this into setuptools would be a significant step forward but I believe you previously rejected that idea. We'll take a stab at creating a patch for you if you're more receptive to that idea now. Just let me know.
On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though.
I think there are lots of situations that are legitimate (projects with extensions, projects that want to put icons on the desktop or in menus, projects that need to interact with a registry, projects that want to put configuration information somewhere other than in a zip file in a site-packages dir, etc.) I think we should worry less about preventing developers from shooting themselves in the foot and more about ensuring that they can hunt for food for their survival. We can always tighten things down after seeing the use cases that develop, right?

-- Dave
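
The behaviour described for Enstaller, looking for a post_install.py script in EGG-INFO and running it if found, can be sketched as follows. This is not Enstaller's actual code; `run_post_install` is a hypothetical helper, and the demo builds a throwaway directory that merely imitates an unpacked egg.

```python
import os
import runpy
import tempfile


def run_post_install(egg_dir):
    """Run EGG-INFO/post_install.py under egg_dir if present.

    Returns True if a script was found and executed, False otherwise.
    """
    script = os.path.join(egg_dir, "EGG-INFO", "post_install.py")
    if not os.path.isfile(script):
        return False
    # Execute the script as though it had been run directly.
    runpy.run_path(script, run_name="__main__")
    return True


# Demo: a fake unpacked egg whose post-install step drops a marker file.
egg = tempfile.mkdtemp(suffix=".egg")
os.makedirs(os.path.join(egg, "EGG-INFO"))
marker = os.path.join(egg, "installed.marker")
with open(os.path.join(egg, "EGG-INFO", "post_install.py"), "w") as f:
    f.write("open(%r, 'w').close()\n" % marker)

ran = run_post_install(egg)
print(ran, os.path.exists(marker))
```

A real hook would also have to decide what the script is allowed to do and how its effects are undone on uninstall, which is the can of worms raised elsewhere in this thread.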

We should probably move this off of Python-Dev, as we're getting into deep details now...

At 07:27 PM 3/18/2008 -0500, Dave Peterson wrote:
If you really wanted to do a full-tree intersection, it seems to me that the problem is detecting all the dependencies without having to spend significant time downloading/building in order to find them out. This could be solved by simply extending the cheeseshop interface to export the set of requirements outside of the egg / tarball / etc. We've done this for our own egg repository by extracting the appropriate meta-data files out of EGG-INFO and putting it into a separate file. This info is also useful for users as it gives them an idea of how much *new* stuff is going to be installed (a la yum, apt-get, etc.)
...and now we're more directly competing with them, too. The original idea Bob and I had was to do XML files ala Eclipse feature repositories, but then later I realized that for what we were doing, HTML was both adequate and already available. However, I don't see a problem in principle with having "header" files available for this sort of thing.
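As an illustration of the full-tree intersection discussed above, here is a minimal sketch. It assumes the index already exports each project's requirements as plain data, so nothing has to be downloaded or built first. The `INDEX` table, its project names, and the tuple-based versions are made up for the example; this is not the cheeseshop interface or the setuptools resolver.

```python
import operator

# Comparison operators allowed in a requirement.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Hypothetical metadata, as it might be exported alongside each project.
INDEX = {
    "app": {
        "versions": [(1, 0)],
        "requires": [("lib", ">=", (1, 0)), ("util", ">=", (1, 0))],
    },
    "lib": {
        "versions": [(1, 0), (1, 5), (2, 0)],
        "requires": [("util", "<", (2, 0))],
    },
    "util": {"versions": [(1, 0), (1, 2), (2, 0)], "requires": []},
}


def collect_constraints(root):
    """Walk the whole dependency tree and gather every restriction per project."""
    constraints = {}
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep, op, version in INDEX[name]["requires"]:
            constraints.setdefault(dep, []).append((op, version))
            stack.append(dep)
    return constraints


def resolve(root):
    """Pick the newest version of each dependency that satisfies all restrictions."""
    chosen = {}
    for name, rules in collect_constraints(root).items():
        candidates = [
            v for v in INDEX[name]["versions"]
            if all(OPS[op](v, wanted) for op, wanted in rules)
        ]
        if not candidates:
            raise LookupError("no version of %s satisfies %r" % (name, rules))
        chosen[name] = max(candidates)
    return chosen
```

With this data, `resolve("app")` selects util 1.2 rather than 2.0, because the restriction that lib places on util is applied together with the one from app. A first-acceptable approach that only looked at app's own requirement could have settled on util 2.0. The sketch ignores the fact that requirements can differ between versions of the same project.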
With our ETS projects, we've run into problems with the current heuristic. Perhaps we just don't know how to make it work like we want?
We have a set of projects that we want to be individually installable (to the extent that we limit cross-project dependencies) but we also want to make it easy to install the complete set. We use a meta-egg for the latter. Its purpose is only to specify the exact versions of each project that have been explicitly tested to work together -- you could almost think of it as a source control system tag.
I would think that as long as that meta-egg specifies *all* the required versions (right down to recursive dependencies), then there shouldn't be any problem. Maybe it's me who's not understanding something? I would think that you could get the appropriate data by running the tl.eggdeps tool.
A number of projects want to provide various types of files besides code in their distributable, and they'd like these to end up in standard locations for that type of file. Think documentation, sample data, web templates, configuration settings, etc. Each of these should be treated differently at installation time depending on platform. On *nix, docs should go in /usr/share/doc whereas we might need to create a C:\Python2.5\docs on Windows. With sample data and templates, you probably just want it accessible outside of the zipped egg so users can easily look at it, add to it, edit it, etc. Configuration settings should be installed with some defaults into a standard configuration directory like /etc on *nix, etc.
Basically the issue is that it needs to be easier to include different sets of files into an egg for different actions to be taken during installation or packaging into an OS-specific distribution format.
Yes, it would be nice to define a metadata standard for including installable "datasets" either through copying or symlinking, optionally with entry points for running some code, too. When you install an egg, these things could get added to a "post-install to-do" list, that you could then read to find out what steps to do, or invoke a tool on to actually do some of those steps.
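The "post-install to-do" list could be sketched roughly as follows. This is only an illustration under assumptions: the JSON file format, the function names, and the three status values are invented here and are not part of setuptools.

```python
import json
import os

VALID_STATUSES = {"pending", "finished", "rejected"}


def load_todo(path):
    """Return the recorded steps, or an empty list if nothing was recorded yet."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)


def save_todo(path, steps):
    with open(path, "w") as f:
        json.dump(steps, f, indent=2)


def add_steps(path, project, actions):
    """Record the post-install steps a newly installed project asks for."""
    steps = load_todo(path)
    for action in actions:
        steps.append({"project": project, "action": action, "status": "pending"})
    save_todo(path, steps)


def mark(path, project, action, status):
    """Record that a step was carried out or explicitly turned down."""
    if status not in VALID_STATUSES:
        raise ValueError("unknown status: %r" % (status,))
    steps = load_todo(path)
    for step in steps:
        if step["project"] == project and step["action"] == action:
            step["status"] = status
    save_todo(path, steps)


def pending(path):
    """List (project, action) pairs that still await a decision."""
    return [
        (s["project"], s["action"])
        for s in load_todo(path)
        if s["status"] == "pending"
    ]
```

Because the list is kept in one place, a tool could show every pending step from several eggs at once instead of prompting separately for each egg.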
But the docs for easy_install claim that the list of active eggs is maintained in easy-install.pth. Also, if I create my own .pth file, and the user tries to update my version to a new one, will the easy_install tool modify my .pth file to remove the mention of the old version from my sys.path and put the new version in the same .pth file? Or will it now be listed in both places? Or will it only be in easy-install.pth?
My understanding of the context of the question was that it applied to *system* packaging tools, which would be exclusively maintaining the .pth entries for the packages they installed. i.e., a scenario with *no* easy-install.pth. Setuptools will still detect the presence of their eggs, regardless of the means by which they're added to sys.path. But it would not *maintain* those .pth files.
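To make the point about multiple .pth files concrete, here is a small sketch that lists the path entries contributed by every .pth file in a directory, whichever tool wrote them. It mirrors the rules Python's `site` module applies (blank lines and `#` comments are skipped, lines starting with `import` are executable rather than paths), but it only reads the files and does not check that the listed paths exist. The function name is made up for the example.

```python
import os


def pth_entries(site_dir):
    """Return (pth_file_name, path_entry) pairs for all .pth files in site_dir."""
    entries = []
    for name in sorted(os.listdir(site_dir)):
        if not name.endswith(".pth"):
            continue
        with open(os.path.join(site_dir, name)) as f:
            for line in f:
                line = line.rstrip()
                if not line or line.startswith("#"):
                    continue
                if line.startswith(("import ", "import\t")):
                    # site would execute this line; it is not a path entry.
                    continue
                entries.append((name, line))
    return entries
```

A system packaging tool that drops its own `foo.pth` next to `easy-install.pth` would show up here as a second source of entries, which is the situation the question above is about.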
Yes, but as you've already pointed out, they've escaped into a larger ecosystem and this restriction is a severe limitation -- leading to significant frustration. Especially as projects evolve and want to do something more complex than simply install pure Python code. Here at Enthought, we use and ship a number of projects that have extensions and thus dynamic libraries that need to either be modified during installation to work from the user's installed location, or copied elsewhere on the system to avoid the need to modify (which we also can't do via an egg install) env variables, registries, etc.
By the way, there *is* experimental shared library building support in setuptools, and I recently heard from Andi Vajda that he was successful in using it in his JCC project to make available a C++ library for linkage from JCC-built projects. (I'm also sitting on his patch that makes it work...) I'm not sure that it actually fixes the larger problem, in that e.g., if the main project is installed by the system, and then you build or install an egg yourself. But I think those problems are solvable.
We'd also love to be able to ship end-user enterprise-scale applications via eggs so that bug fixes and updates don't require downloading a monolithic 100MB+ installer. But doing that requires the ability to update desktop icons, menus, etc. which we also can't do automatically via an egg.
Yep... a good post-install mechanism would be handy for wx and pywin32 as well.
If you don't want the burden on setuptools to support, much less track, all these options, then perhaps it could just support automatic execution of a post-install script (and pre-uninstall script if uninstallation ever happens) that allows individual project developers to do what they need to do? Let the burden of describing how those things happen and how to uninstall/relocate/update them fall to the provider of the projects that do them.
Yeah, that's what I really *don't* want. I'd like to enable a more trustable mechanism than a blindly-executed script. I'd rather see a standard that makes a developer document more, and have to at least *convince* the user that their post-install is worthwhile, even if the tool then makes it easy to run. Better still, I'd rather have those post-install parts done in such a way that things like icons, menus, manifests, registry stuff, etc., have to get explicitly listed instead of being done programmatically.
Also, IIUC, stow only tries to "contain" the hard files. It puts links in multiple standard locations (for man pages, executables, libraries, etc.) If setuptools supported these options, I don't think there'd be any discussion here except for things like "how do I extend the set of things the tool supports so that my foo-type files get linked into the standard /os/path/to/foo for the X os?"
Yep. Having that would be a worthwhile thing, I think. Discussion leading to specs is most welcome.
I should have read ahead. This sounds close to what I've been describing except that this leads me to picture a script that prompts for install locations and allows the user to customize the destinations rather than one that assumes everything goes in a standard place. I'm all for this, and the continuation of the ability to install an egg into a user-environment vs. a system-environment.
+1.
The only thing missing here is the ability for the installer to automatically run that script so that installation isn't a disjointed, two-step manual process that a user is prone to forget to complete.
I don't see a problem with a prompting process, backed by a log file that records what post-install steps are pending, finished, or explicitly rejected by the user. One possibility, by the way, is that we could overload "extras" for this purpose. Entry points (such as those for scripts) can require extras; if extras could mean post-install components like docs or icons or what-have-you, then trying to run the script could result in an error message telling you you need to "easy_install foo_package[icons]" or whatever.
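The "extras" idea can be illustrated with the keyword arguments a setup.py would pass to `setup()`. `extras_require` and the trailing `[extra]` on an entry point are the setuptools features being referred to; the project names `foo_icon_tool` and `foo_doc_tool`, and the helper `extras_needed`, are hypothetical and exist only for this sketch.

```python
# Keyword arguments as they might appear in a setup.py (setup() is not called here).
SETUP_KWARGS = {
    "name": "foo_package",
    "extras_require": {
        # Each extra pulls in a (hypothetical) separately installed tool.
        "icons": ["foo_icon_tool"],
        "docs": ["foo_doc_tool"],
    },
    "entry_points": {
        # The script declares that it needs the "icons" extra.
        "console_scripts": ["foo-gui = foo_package.gui:main [icons]"],
    },
}


def extras_needed(entry_point_spec):
    """Return the extras named in the trailing [...] of an entry point spec."""
    if "[" not in entry_point_spec:
        return []
    inside = entry_point_spec.split("[", 1)[1].rstrip().rstrip("]")
    return [extra.strip() for extra in inside.split(",") if extra.strip()]
```

Under this reading, running `foo-gui` without the extra installed would be the point at which the user is told to run `easy_install foo_package[icons]`.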
One of the features of Enthought's Enstaller extension to easy_install was that it looks for a post_install.py script in EGG-INFO and if one is found, runs it. I would think that getting this into setuptools would be a significant step forward but I believe you previously rejected that idea. We'll take a stab at creating a patch for you if you're more receptive to that idea now. Just let me know.
No -- I'm not happy with a straight-up executable hook for post-install steps. My evaluation of the state of PyPI is that I don't trust the community to write non-hazardous setup.py files, let alone post-install scripts. There should be a high technical and social barrier to including post-install hooks with arbitrary code.
For example, if there was a required separation between installer tools and the things they install, such that any post-install operation had to be performed strictly by providing some human-readable data that will be passed to a separately-installed tool, and there was a high social stigma associated with writing your own post-install tool, then that might work.
So, for example, if the community creates an icons and menus installer tool for the various platforms, and then anybody can use it in their project by adding the right data, then the user doesn't have to fully trust arbitrary package authors, only the authors of the post-install tools.
I'm not saying that model is perfect; in fact I can see some potential pitfalls. But once an automatic post-install hole is opened it will be *very* hard to close, because it will always be *easier* to roll your own crappy post-installer instead of contributing to a set of robust cross-project/cross-platform tools. So I'd rather keep this particular "itch" in play and try to build up the scratching pressure until some people get together and pay attention long enough to solve the problem in a less hacky way. :)
On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though.
I think there are lots of situations that are legitimate (projects with extensions, projects that want to put icons on the desktop or in menus, projects that need to interact with a registry, projects that want to put configuration information somewhere other than in a zip file in a site-packages dir, etc.) I think we should worry less about preventing developers from shooting themselves in the foot
It's the users' feet that I'm concerned with. Some people are already paranoid about the fact that PyPI doesn't use SSL and code signing, or that easy_install uses the intarwebs at all. I can just see the witch hunt when we start executing arbitrary code. Unh unh. No way am I letting that happen. Nope.
and more about ensuring that they can hunt for food for their survival.
Right now, if you have a post-install script that's essential, you'll just have to convince your users to run it. Which nicely keeps easy_install out of what should be a conversation between developer and user.
Enstaller is a different case - you are presumably installing an application, and the user is trusting your installer. easy_install is something else altogether, and is used by other programs such as buildout.
Actually, I wonder if instead of trying to enhance setuptools for post-install, maybe we should be looking at buildout recipes and maybe having a way for setuptools dependencies to point to buildout specs. IIRC, buildout specs can be remotely retrieved from a single URL, too.
We can always tighten things down after seeing the usecases that develop, right?
Actually, no, we can't, since backward compatibility would keep us from removing the hook, once people rely on it.
I really feel your (and others') pain on this issue, but it's one place where the users have to come first, and they need protection from the wilds of PyPI. Distribution and installation issues are not first on most developers' minds, so the fact that someone writes a great library on PyPI doesn't mean they can write installers worth a crap. Frankly, I wouldn't trust myself to write a correct post-installer on the first go -- perhaps *because* I have seen so many "simple" things go wrong.

Phillip J. Eby wrote:
We should probably move this off of Python-Dev, as we're getting into deep details now...
Done. Only in distutils-sig now.
At 07:27 PM 3/18/2008 -0500, Dave Peterson wrote:
If you really wanted to do a full-tree intersection, it seems to me that the problem is detecting all the dependencies without having to spend significant time downloading/building in order to find them out. This could be solved by simply extending the cheeseshop interface to export the set of requirements outside of the egg / tarball / etc. We've done this for our own egg repository by extracting the appropriate meta-data files out of EGG-INFO and putting it into a separate file. This info is also useful for users as it gives them an idea of how much *new* stuff is going to be installed (a la yum, apt-get, etc.)
...and now we're more directly competing with them, too. The original idea Bob and I had was to do XML files ala Eclipse feature repositories, but then later I realized that for what we were doing, HTML was both adequate and already available. However, I don't see a problem in principle with having "header" files available for this sort of thing.
It seems from later discussions that Martin v. Löwis agrees that this is a reasonable thing to do. I'll see if I can find time to work on a patch to PyPI. Not having looked at that code before at all, it might take me a while.
With our ETS projects, we've run into problems with the current heuristic. Perhaps we just don't know how to make it work like we want?
We have a set of projects that we want to be individually installable (to the extent that we limit cross-project dependencies) but we also want to make it easy to install the complete set. We use a meta-egg for the latter. Its purpose is only to specify the exact versions of each project that have been explicitly tested to work together -- you could almost think of it as a source control system tag.
I would think that as long as that meta-egg specifies *all* the required versions (right down to recursive dependencies), then there shouldn't be any problem. Maybe it's me who's not understanding something?
It actually does specify all the required versions, including those recursive dependencies, but we were still getting breakages when new versions were released. :-( I think I explained what we were seeing in my original e-mail though it sounds like you're saying that shouldn't be possible, right?
I would think that you could get the appropriate data by running the tl.eggdeps tool.
Getting the version data isn't a problem at all for us, but thanks for the pointer to an interesting project. (We have an internal project that actually analyzes the import statements in a project's code to ensure that the dependencies declared in its setup.py match the ones the code actually imports, and this solves the problem for us.)
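The kind of import analysis mentioned above could look roughly like this. It is a sketch using the standard `ast` module, not Enthought's internal tool, and it assumes that import names and distribution names coincide, which is not always true. The function names and the `ignore` parameter are invented for the example.

```python
import ast


def imported_top_level(source):
    """Return the top-level package names that a piece of source code imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import inside the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def compare(source, declared, ignore=frozenset()):
    """Report imports missing from the declared set, and declarations never imported."""
    used = imported_top_level(source) - set(ignore)
    declared = set(declared)
    return {
        "undeclared": sorted(used - declared),
        "unused": sorted(declared - used),
    }
```

Standard-library modules and the project's own package would be passed in through `ignore` so that they are not reported as undeclared.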
A number of projects want to provide various types of files besides code in their distributable, and they'd like these to end up in standard locations for that type of file. Think documentation, sample data, web templates, configuration settings, etc. Each of these should be treated differently at installation time depending on platform. On *nix, docs should go in /usr/share/doc whereas we might need to create a C:\Python2.5\docs on Windows. With sample data and templates, you probably just want it accessible outside of the zipped egg so users can easily look at it, add to it, edit it, etc. Configuration settings should be installed with some defaults into a standard configuration directory like /etc on *nix, etc.
Basically the issue is that it needs to be easier to include different sets of files into an egg for different actions to be taken during installation or packaging into an OS-specific distribution format.
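One way to picture per-category, per-platform destinations is a lookup table that can be extended with new categories. The sketch below uses the example locations from the message for docs and configuration; the per-project subdirectory, the table layout, and the function names are assumptions made for illustration.

```python
import os

# platform name (as in os.name) -> category -> destination template.
LAYOUT = {
    "posix": {
        "docs": "/usr/share/doc/{project}",
        "config": "/etc/{project}",  # per-project subdirectory is an assumption
    },
    "nt": {
        "docs": "C:\\Python2.5\\docs\\{project}",
    },
}


def register_category(platform, category, template):
    """Let a project or plugin add a new kind of file and where it belongs."""
    LAYOUT.setdefault(platform, {})[category] = template


def destination(category, project, platform=None):
    """Return the install location for one category of files of a project."""
    platform = platform or os.name
    try:
        template = LAYOUT[platform][category]
    except KeyError:
        raise LookupError(
            "no location known for %r files on %r" % (category, platform)
        )
    return template.format(project=project)
```

For example, `register_category("posix", "templates", "/usr/share/{project}/templates")` would teach the table about a file type it did not ship with, which is the extensibility question raised later in the thread.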
Yes, it would be nice to define a metadata standard for including installable "datasets" either through copying or symlinking, optionally with entry points for running some code, too. When you install an egg, these things could get added to a "post-install to-do" list, that you could then read to find out what steps to do, or invoke a tool on to actually do some of those steps.
I agree. Let's get that setuptools wiki started and start documenting some of these ideas as a roadmap so that anyone who wants to help out has an idea of what to work on, or factor into what they're currently working on.
But the docs for easy_install claim that the list of active eggs is maintained in easy-install.pth. Also, if I create my own .pth file, and the user tries to update my version to a new one, will the easy_install tool modify my .pth file to remove the mention of the old version from my sys.path and put the new version in the same .pth file? Or will it now be listed in both places? Or will it only be in easy-install.pth?
My understanding of the context of the question was that it applied to *system* packaging tools, which would be exclusively maintaining the .pth entries for the packages they installed. i.e., a scenario with *no* easy-install.pth. Setuptools will still detect the presence of their eggs, regardless of the means by which they're added to sys.path. But it would not *maintain* those .pth files.
I may be confusing the issue then. I was under the impression that system packaging tools would want to install things such that anyone used to using setuptools would be able to see the effects of that installation in the same way as if it was done via easy_install. i.e. if I wanted to temporarily remove it for testing something or other, I could de-activate it; or if I wanted to install a second optional version of it, I could use easy_install to do so without worrying about tracking down the right .pth file.
Yes, but as you've already pointed out, they've escaped into a larger ecosystem and this restriction is a severe limitation -- leading to significant frustration. Especially as projects evolve and want to do something more complex than simply install pure Python code. Here at Enthought, we use and ship a number of projects that have extensions and thus dynamic libraries that need to either be modified during installation to work from the user's installed location, or copied elsewhere on the system to avoid the need to modify (which we also can't do via an egg install) env variables, registries, etc.
By the way, there *is* experimental shared library building support in setuptools, and I recently heard from Andi Vajda that he was successful in using it in his JCC project to make available a C++ library for linkage from JCC-built projects. (I'm also sitting on his patch that makes it work...) I'm not sure that it actually fixes the larger problem, in that e.g., if the main project is installed by the system, and then you build or install an egg yourself. But I think those problems are solvable.
I'm not sure your description matches what we're trying to do here, but I can figure that out better from looking at the code. Is this in the 0.6 versions or 0.7a? And where should I start looking, module-wise?
We'd also love to be able to ship end-user enterprise-scale applications via eggs so that bug fixes and updates don't require downloading a monolithic 100MB+ installer. But doing that requires the ability to update desktop icons, menus, etc. which we also can't do automatically via an egg.
Yep... a good post-install mechanism would be handy for wx and pywin32 as well.
Enthought has started a project to provide an API abstraction of doing the desktop icon / menu setup on multiple platforms (Windows, Gnome, KDE, and hopefully soon OSX) for both system and user installs. We use it in our EPD product (Enthought Python Distribution.) We could probably work on getting this into a more public form with some hints as to whether it should be done as a plugin, patch, separate project, etc.
If you don't want the burden on setuptools to support, much less track, all these options, then perhaps it could just support automatic execution of a post-install script (and pre-uninstall script if uninstallation ever happens) that allows individual project developers to do what they need to do? Let the burden of describing how those things happen and how to uninstall/relocate/update them fall to the provider of the projects that do them.
Yeah, that's what I really *don't* want. I'd like to enable a more trustable mechanism than a blindly-executed script. I'd rather see a standard that makes a developer document more, and have to at least *convince* the user that their post-install is worthwhile, even if the tool then makes it easy to run.
I'm not sure what you mean by "convince". If you simply mean that the post-install has to default to not doing anything unless the user responds in the affirmative to some prompt, then I guess I could live with that. If you mean that other documentation has to convince them to run a command, then I think that leads to the issue I was directly worried about, which is people complaining because something isn't working because they forgot to run the post-install. My other concern here is how this chains through dependencies. If I install the ETS meta-egg mentioned above, and that causes 4 other eggs to install that all have post-install requirements, I'd hate to have the user have to step through the same sort of prompts 4 times. I guess this is what you were referring to above by a list of post-install tasks, but I just want to be sure.
Better still, I'd rather have those post-install parts done in such a way that things like icons, menus, manifests, registry stuff, etc., have to get explicitly listed instead of being done programmatically.
I assume you mean by declaring lists or dictionaries within the setup.py that then get stored as meta-data within a file in EGG-INFO and then get acted on during install. If so, then yes, I'm all for that idea too.
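A rough sketch of that reading: the project declares plain data, the build step writes it into a file in EGG-INFO, and the installer reads it back later. The file name `post_install.json`, the set of allowed kinds, and both function names are hypothetical; setuptools defines no such file.

```python
import json
import os

# Kinds of post-install data a tool would agree to understand (assumed set).
ALLOWED_KINDS = {"icons", "menus", "registry", "docs"}
METADATA_NAME = "post_install.json"  # hypothetical file inside EGG-INFO


def write_post_install_metadata(egg_info_dir, declared):
    """Store the declared data; refuse anything outside the agreed kinds."""
    unknown = set(declared) - ALLOWED_KINDS
    if unknown:
        raise ValueError("unsupported post-install kinds: %s" % sorted(unknown))
    path = os.path.join(egg_info_dir, METADATA_NAME)
    with open(path, "w") as f:
        json.dump(declared, f, indent=2, sort_keys=True)
    return path


def read_post_install_metadata(egg_info_dir):
    """Return the declared data, or an empty dict if the egg declares none."""
    path = os.path.join(egg_info_dir, METADATA_NAME)
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)
```

Since the egg carries only data, an installer or a user can inspect exactly what would be done before anything runs.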
Also, IIUC, stow only tries to "contain" the hard files. It puts links in multiple standard locations (for man pages, executables, libraries, etc.) If setuptools supported these options, I don't think there'd be any discussion here except for things like "how do I extend the set of things the tool supports so that my foo-type files get linked into the standard /os/path/to/foo for the X os?"
Yep. Having that would be a worthwhile thing, I think. Discussion leading to specs is most welcome.
I thought I was starting that already. :-) Or were you saying that it needed to happen somewhere else?
I should have read ahead. This sounds close to what I've been describing except that this leads me to picture a script that prompts for install locations and allows the user to customize the destinations rather than one that assumes everything goes in a standard place. I'm all for this, and the continuation of the ability to install an egg into a user-environment vs. a system-environment.
+1.
The only thing missing here is the ability for the installer to automatically run that script so that installation isn't a disjointed, two-step manual process that a user is prone to forget to complete.
I don't see a problem with a prompting process, backed by a log file that records what post-install steps are pending, finished, or explicitly rejected by the user.
One possibility, by the way, is that we could overload "extras" for this purpose. Entry points (such as those for scripts) can require extras; if extras could mean post-install components like docs or icons or what-have-you, then trying to run the script could result in an error message telling you you need to "easy_install foo_package[icons]" or whatever.
While I can see many nice things about using extras for delivery of docs, icons, etc. (including reduced size for those who don't want or need them,) I'm not thrilled with the idea of a user getting a message saying to run "easy_install ..." anything for them to be installed. Couldn't we just have the post-install actually run the easy_install command once they accepted the installation of the icons, etc?
One of the features of Enthought's Enstaller extension to easy_install was that it looks for a post_install.py script in EGG-INFO and if one is found, runs it. I would think that getting this into setuptools would be a significant step forward but I believe you previously rejected that idea. We'll take a stab at creating a patch for you if you're more receptive to that idea now. Just let me know.
No -- I'm not happy with a straight-up executable hook for post-install steps. My evaluation of the state of PyPI is that I don't trust the community to write non-hazardous setup.py files, let alone post-install scripts. There should be a high technical and social barrier to including post-install hooks with arbitrary code.
Ouch. That seems a pretty harsh indictment. :-)
For example, if there was a required separation between installer tools and the things they install, such that any post-install operation had to be performed strictly by providing some human-readable data that will be passed to a separately-installed tool, and there was a high social stigma associated with writing your own post-install tool, then that might work.
So, for example, if the community creates an icons and menus installer tool for the various platforms, and then anybody can use it in their project by adding the right data, then the user doesn't have to fully trust arbitrary package authors, only the authors of the post-install tools.
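The separation described here, where packages supply data and only separately installed tools act on it, might be sketched as a small dispatcher. Everything named below (`HANDLERS`, `register_tool`, `run_post_install`, the `approve` callback) is invented for illustration and does not exist in setuptools.

```python
# kind of post-install data -> callable supplied by a separately installed tool
HANDLERS = {}


def register_tool(kind, handler):
    """Called by an installed post-install tool to claim one kind of data."""
    HANDLERS[kind] = handler


def run_post_install(project, declared, approve):
    """Hand each kind of declared data to its tool, if the user approves.

    No code from the package itself is executed: kinds without an installed
    tool are only reported.
    """
    report = {}
    for kind, items in sorted(declared.items()):
        handler = HANDLERS.get(kind)
        if handler is None:
            report[kind] = "no tool installed"
        elif not approve(project, kind, items):
            report[kind] = "rejected"
        else:
            handler(project, items)
            report[kind] = "finished"
    return report
```

The `approve` callback is where a prompt would go, so the user only has to trust the authors of the registered tools and can still decline any individual step.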
I'm not saying that model is perfect; in fact I can see some potential pitfalls. But once an automatic post-install hole is opened it will be *very* hard to close, because it will always be *easier* to roll your own crappy post-installer instead of contributing to a set of robust cross-project/cross-platform tools. So I'd rather keep this particular "itch" in play and try to build up the scratching pressure until some people get together and pay attention long enough to solve the problem in a less hacky way. :)
I can see what you're saying though I think it cuts off those who need to prove the usecase before writing a tool to support it. Perhaps we'd get more scratching pressure for standardizing (safely) some of these things if people were free to experiment. :-) Anyway, since Enthought is already scratching, I'm fine with the idea of building a standard way to do it that is driven by human-readable data. We just need to set up the process to allow that to happen. So far I haven't seen any responses from you in regards to the setup of an issue/patch tracker, wiki, process to open up the number of committers, etc. that gives me any confidence I'm not heading off down the wrong path somehow. Perhaps I'm too cautious?
On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though.
I think there are lots of situations that are legitimate (projects with extensions, projects that want to put icons on the desktop or in menus, projects that need to interact with a registry, projects that want to put configuration information somewhere other than in a zip file in a site-packages dir, etc.) I think we should worry less about preventing developers from shooting themselves in the foot
It's the users' feet that I'm concerned with. Some people are already paranoid about the fact that PyPI doesn't use SSL and code signing, or that easy_install uses the intarwebs at all. I can just see the witch hunt when we start executing arbitrary code. Unh unh. No way am I letting that happen. Nope.
Though if we had https, code signing, et al, then they'd be trusting the signers of the source anyway and not just "arbitrary code". That doesn't seem bad to me. "If I trust their code to run on my system, why not trust the post-install code as well?"
and more about ensuring that they can hunt for food for their survival.
Right now, if you have a post-install script that's essential, you'll just have to convince your users to run it. Which nicely keeps easy_install out of what should be a conversation between developer and user.
And how do I do that? So few users read the documentation to begin with, or our wiki, or anything else. Is there some meta-data we can put into our eggs / setup.py that is displayed when the user installs them, and which doesn't scroll by or get buried in an avalanche of cascading dependencies?
Enstaller is a different case - you are presumably installing an application, and the user is trusting your installer. easy_install is something else altogether, and is used by other programs such as buildout.
I think there may be some misunderstanding here. Enstaller is how we are distributing third-party libraries as binaries for a community of users, as well as our own code libraries, and only finally for applications. Yes, you could view it as doing this primarily for larger applications but we have a number of people who use it just to get wx, VTK, etc. on platforms like Windows and OSX, as well as those who use it to get user-space installs of wx, VTK, etc. on Linux. I'm failing to see how trusting Enstaller is different than trusting easy_install. I wouldn't hold either responsible for what happened if I installed a package built by someone else that mis-used the features. Just like I'm careful to not always blame MS if some other application, installed via an .msi, messes up my copy of Windows. :-)
Actually, I wonder if instead of trying to enhance setuptools for post-install, maybe we should be looking at buildout recipes and maybe having a way for setuptools dependencies to point to buildout specs. IIRC, buildout specs can be remotely retrieved from a single URL, too.
I'll need to read up more on buildout to understand this, but my understanding was that buildout was not something a user ran to install an app, but rather something the developer ran to build and publish an app. The end result of a 'production' buildout is to generate a large tarball or rpm that included everything, right? If so, this goes directly against what Enthought was aiming for, which was to allow delivery of bug-fixes and minor updates in a large app by downloading only smaller units instead of a huge monolithic re-install of everything. Having typed that up though, I'm thinking we're probably abusing eggs for something that rightly ought to be delivered as an application directory scoped patch.
We can always tighten things down after seeing the usecases that develop, right?
Actually, no, we can't, since backward compatibility would keep us from removing the hook, once people rely on it.
I really feel your (and others') pain on this issue, but it's one place where the users have to come first, and they need protection from the wilds of PyPI. Distribution and installation issues are not first on most developers' minds, so the fact that someone writes a great library on PyPI doesn't mean they can write installers worth a crap. Frankly, I wouldn't trust myself to write a correct post-installer on the first go -- perhaps *because* I have seen so many "simple" things go wrong.
Hell, people can't even write correct code on the first go, otherwise we wouldn't have bugs in every app, OS, and driver. However, people do fix things over time and eventually get it right or else their project dies because no one wants to deal with the pain. Why is Python installation any different? :-) -- Dave

Dave Peterson wrote:
I agree. Let's get that setuptools wiki started and start documenting some of these ideas as a roadmap so that anyone who wants to help out has an idea of what to work on, or factor into what they're currently working on.
Anyway, since Enthought is already scratching, I'm fine with the idea of building a standard way to do it that is driven by human-readable data. We just need to set up the process to allow that to happen. So far I haven't seen any responses from you in regards to the setup of an issue/patch tracker, wiki, process to open up the number of committers, etc. that gives me any confidence I'm not heading off down the wrong path somehow. Perhaps I'm too cautious?
Dave, I'm in the process of getting a tracker for setuptools, and I'll work on the wiki shortly, although we have the PackagingBOF wiki for idea collection at the moment. Give me a couple of days, including travel from PyCon. I'm fired up to make this happen.
Actually, I wonder whether, instead of trying to enhance setuptools for post-install, we should be looking at buildout recipes, and maybe at having a way for setuptools dependencies to point to buildout specs. IIRC, buildout specs can be remotely retrieved from a single URL, too.
I'll need to read up more on buildout to understand this, but my understanding was that buildout was not something a user ran to install an app, but rather something the developer ran to build and publish an app. The end result of a 'production' buildout is to generate a large tarball or rpm that included everything, right? If so, this goes directly against what Enthought was aiming for, which was to allow delivery of bug-fixes and minor updates in a large app by downloading only smaller units instead of a huge monolithic re-install of everything.
Your view of a fine-grained application bundle with the ability to dynamically download updated eggs without re-pulling the entire thing is an interesting contrast to Paul's view of a more monolithic application for easier add/remove/uninstall completeness. Supporting both usage models is going to be a challenge but I think is feasible with some thought. -Jeff

Jeff Rush wrote:
Dave Peterson wrote:
I agree. Let's get that setuptools wiki started and start documenting some of these ideas as a roadmap so that anyone who wants to help out has an idea of what to work on, or factor into what they're currently working on.
Anyway, since Enthought is already scratching, I'm fine with the idea of building a standard way to do it that is driven by human-readable data. We just need to set up the process to allow that to happen. So far I haven't seen any responses from you regarding the setup of an issue/patch tracker, wiki, process to open up the number of committers, etc. that give me any confidence I'm not heading off down the wrong path somehow. Perhaps I'm too cautious?
Dave, I'm in the process of getting a tracker for setuptools, and I'll work on the wiki shortly, although we have the PackagingBOF wiki for idea collection at the moment. Give me a couple of days, including travel from PyCon. I'm fired up to make this happen.
Awesome. Travis O came back into the office from being at PyCon today and I've finally got the full set of e-mail addys from the people who attended the PackagingBOF meetings. I'd like to e-mail everyone with info about the tracker and wiki, and possibly a new dev type mailing list seeded with these names. Anything I can send out now? :-)
Actually, I wonder whether, instead of trying to enhance setuptools for post-install, we should be looking at buildout recipes, and maybe at having a way for setuptools dependencies to point to buildout specs. IIRC, buildout specs can be remotely retrieved from a single URL, too.
I'll need to read up more on buildout to understand this, but my understanding was that buildout was not something a user ran to install an app, but rather something the developer ran to build and publish an app. The end result of a 'production' buildout is to generate a large tarball or rpm that included everything, right? If so, this goes directly against what Enthought was aiming for, which was to allow delivery of bug-fixes and minor updates in a large app by downloading only smaller units instead of a huge monolithic re-install of everything.
Your view of a fine-grained application bundle with the ability to dynamically download updated eggs without re-pulling the entire thing is an interesting contrast to Paul's view of a more monolithic application for easier add/remove/uninstall completeness. Supporting both usage models is going to be a challenge but I think is feasible with some thought.
Yes, that will be interesting indeed. I'm not sure we need anything more to support what we'd like to do than to resolve some dependency lookup issues that I've already talked about on this list and at the PackagingBOF. -- Dave
-Jeff

At 05:31 PM 3/20/2008 -0500, Jeff Rush wrote:
Dave, I'm in the process of getting a tracker for setuptools,
Can't we just use the Python tracker? I'm already registered in that, and emailing back on issues works. If we can just put me on the nosy list for all setuptools issues, I'll see everything that goes by and comment without having to pop into a browser. If we can't use the Python tracker, let's please use something that lets me comment via email. (And of course, I'll want to subscribe to the wiki via email too, but I assume we'll be using a wiki that allows that.)

Phillip J. Eby wrote:
I'm actually happy to hear that there's this much energy available -- hopefully some of it can be harnessed towards positive solutions.
When I began developing setuptools, I often asked for the input of packagers, developers, etc., through the distutils-sig... and was met with overwhelming silence. So the fact that there is now a group of people who are ready to work for some solutions seems like a positive change, to me.
I can appreciate how frustrating silence is when you call for input. Let's see if we can keep the volunteer energy going this time around.
It's hard to make design decisions regarding itches you don't personally have, and which other people won't help scratch. Unfortunately, a lot of the proposals from packaging system people have been of the form of, "fix this for us by breaking things for other people". Not all of them, though. Many have been very helpful, contributing troubleshooting help and good patches.
That some of those good patches took nearly a year to get into setuptools (some from Fedora just got into 0.6c8 that were sent to me almost a year ago) is because I'm the only person reviewing setuptools patches, and I've spent only a few days in the last year doing focused development work on setuptools (as opposed to answering questions about it on the SIG).
It's never a good thing when people's patches sit around, regardless of where they come from. But that's not the same thing as *rejecting* the patches.
I and others appreciate your call for more patches on various topics. However a long delay in applying them will discourage contribution. Are you open to giving certain others patch view/commit privileges to setuptools? I'd be willing to help out, and keep a carefully balanced hand in what is accepted. -Jeff

On Mar 19, 2008, at 3:57 AM, Jeff Rush wrote:
I and others appreciate your call for more patches on various topics. However a long delay in applying them will discourage contribution. Are you open to giving certain others patch view/commit privileges to setuptools? I'd be willing to help out, and keep a carefully balanced hand in what is accepted.
The Python sandbox has a setuptools directory. Is this the canonical location for the code? If so, then anybody who has Python commit privileges can commit to it and help further develop setuptools. If not, why not and what is the sandbox setuptools used for? -Barry

The Python sandbox has a setuptools directory. Is this the canonical location for the code?
Yes, it is.
If so, then anybody who has Python commit privileges can commit to it and help further develop setuptools.
They can, but they shouldn't. Nothing should be committed there without pje's approval (in whatever form he chooses to give such approval).
If not, why not and what is the sandbox setuptools used for?
I think it shouldn't be in sandbox, but toplevel, but that's a minor detail. Maybe I misunderstand the English word "sandbox". Regards, Martin

At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
Are you open to giving certain others patch view/commit privileges to setuptools?
Jim Fulton has such already. I'm open to extending that to others who have a good grasp of the subtleties involved. Truthfully, if we can just get 0.6 put to bed, I could probably open up the trunk a lot wider. One of the things that slows me down is that patches usually don't come with tests, so I usually have to manually smoke-test them for scenarios I think they'll affect. There isn't really any automated procedure. Probably the most frustrating thing (or "chief amongst the most frustrating things") about setuptools development is that it's a black hole. By which I mean that backward compatibility and cruft accretion make it difficult to get out of. In the beginning, there was the distutils. Distutils begat setuptools, and setuptools begat virtualenv and zc.buildout and source control plugins. Etc., etc. What I think is really needed in the long run is to keep eggs, but get rid of setuptools and the distutils in their current form. There's a lot of brokenness there, and also a lot of accumulated cruft. We really need a distutils 3000, and it needs to be built on a better approach. In truth, my *real* motivation for PEP 365's bootstrap tool isn't so much to support the package management tools we have today, as it is to support a new one tomorrow. I have a few ideas for ways to shift the paradigm of how individual projects get built, to incorporate many scenarios that don't work well now. But to implement those things in such a next-generation tool, I will not want to be restricted to just what's in the stdlib or what can be bundled in the tool. (Btw, by "real" motivation, I don't mean I've been deceptive about my intentions, I mean that my strong intuition that such a bootstrap facility is needed, is probably being fueled by the long term desire to replace the entire distutils-based infrastructure with something better.)
I'd be willing to help out, and keep a carefully balanced hand in what is accepted.
And I think it's probably getting close to time I stepped down from day-to-day management of the codebase (which is more like month-to-month or quarter-to-quarter for me lately). It will probably be a lot easier for me to step back and critique stuff that goes in, after the fact, than to go over the stuff beforehand. :) I'm not sure exactly how to go about such a handoff though. My guess is that we need a bug/patch tracker, and a few people to review, test, and apply. Maybe a transitional period during which I just say yea or nay and let others do the test and apply, before opening it up entirely. That way, we can perhaps solidify a few principles that I'd like to have stay in place. (Like no arbitrary post-install code hooks.) btw, offtopic question: are you by any chance the same Jeff Rush who invented EchoMail?

Phillip J. Eby wrote:
At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
I'd be willing to help out, and keep a carefully balanced hand in what is accepted.
I'm not sure exactly how to go about such a handoff though. My guess is that we need a bug/patch tracker, and a few people to review, test, and apply. Maybe a transitional period during which I just say yea or nay and let others do the test and apply, before opening it up entirely. That way, we can perhaps solidify a few principles that I'd like to have stay in place. (Like no arbitrary post-install code hooks.)
+1 to blessing more people to commit. +1 to the transition period idea. These two ought to enable things to move a bit quicker than taking a year to accept a patch. :-) In addition to a bug tracker and patch manager, seems like perhaps a wiki to help document some of these solidified principles and other notes would be a good thing. (Like a patch should almost always include at least one test, possibly more.) Given that the source for setuptools is in the python.org svn, couldn't we just use the python.org roundup and wiki for these facilities? Though looking at the list of components, it seems that things in the sandbox generally aren't tracked in this infrastructure. In which case, I'm sure we could use sf, launchpad, or some such external provider. Enthought could even host this stuff. Like Jeff Rush, I'm also willing to help out as both a writer and reviewer of patches. As you can see from my earlier posts there are a number of things (besides running an arbitrary post-install script) that we'd like to be able to get into the codebase. -- Dave

Phillip J. Eby wrote:
At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
Are you open to giving certain others patch view/commit privileges to setuptools?
Jim Fulton has such already. I'm open to extending that to others who have a good grasp of the subtleties involved.
Truthfully, if we can just get 0.6 put to bed, I could probably open up the trunk a lot wider.
What is needed to put 0.6 to bed? How can we help accelerate this?
Probably the most frustrating thing (or "chief amongst the most frustrating things") about setuptools development is that it's a black hole. By which I mean that backward compatibility and cruft accretion make it difficult to get out of.
In the beginning, there was the distutils. Distutils begat setuptools, and setuptools begat virtualenv and zc.buildout and source control plugins. Etc., etc.
I've found in the past a revisiting of basic principles and objectives, communicated in enhanced documentation, can help to clear out such black holes. ;-) I'm pulling something together, from the recent emails and some archived threads -- it definitely is tangled though, I'll agree.
What I think is really needed in the long run is to keep eggs, but get rid of setuptools and the distutils in their current form. There's a lot of brokenness there, and also a lot of accumulated cruft. We really need a distutils 3000, and it needs to be built on a better approach.
That will require a lot of consensus building as well as collection of use cases so that the architecture team can encompass aspects they are not personally aware of. As you've said, it's hard to address itches that are not your own. It certainly is possible for someone to create a parallel packaging moduleset that uses the existing eggs format and PyPI but without the current codebase, and then, once proven to work, lobby for it as distutils 3000. Frankly I'd like to see setuptools exploded, with those parts of general use folded back into the standard library, the creation of a set of non-implementation-specific documents of the distribution formats and behavior, leaving a small core of one implementation of how to do it and the door open for others to compete with their own implementation.
In truth, my *real* motivation for PEP 365's bootstrap tool isn't so much to support the package management tools we have today, as it is to support a new one tomorrow. I have a few ideas for ways to shift the paradigm of how individual projects get built, to incorporate many scenarios that don't work well now. But to implement those things in such a next-generation tool, I will not want to be restricted to just what's in the stdlib or what can be bundled in the tool.
You should document those ideas someplace and start getting community input. There are a lot of diverse opinions on the right way to do this and the way ahead is quite unclear.
And I think it's probably getting close to time I stepped down from day-to-day management of the codebase (which is more like month-to-month or quarter-to-quarter for me lately). It will probably be a lot easier for me to step back and critique stuff that goes in, after the fact, than to go over the stuff beforehand. :)
I'm not sure exactly how to go about such a handoff though. My guess is that we need a bug/patch tracker, and a few people to review, test, and apply. Maybe a transitional period during which I just say yea or nay and let others do the test and apply, before opening it up entirely. That way, we can perhaps solidify a few principles that I'd like to have stay in place. (Like no arbitrary post-install code hooks.)
I'll see about a tracker and identify some people to help out.
btw, offtopic question: are you by any chance the same Jeff Rush who invented EchoMail?
Yep, that's me. Not many remember the Fidonet days. I designed EchoMail on a napkin during a DFW Sysop pizza party during a conversation on what to do with the unused capability of inter-BBS private file transfers. It too escaped its ecosystem and spread like wildfire, almost getting banned from Fidonet. ;-) -Jeff

At 05:15 PM 3/19/2008 -0500, Jeff Rush wrote:
Phillip J. Eby wrote:
At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
Are you open to giving certain others patch view/commit privileges to setuptools?
Jim Fulton has such already. I'm open to extending that to others who have a good grasp of the subtleties involved.
Truthfully, if we can just get 0.6 put to bed, I could probably open up the trunk a lot wider.
What is needed to put 0.6 to bed? How can we help accelerate this?
Get a tracker set up. I'm already in the main Python one, might as well use that.
It certainly is possible for someone to create a parallel packaging moduleset that uses the existing eggs format and PyPI but without the current codebase, and then, once proven to work, lobby for it as distutils 3000.
Yep. And I believe that something will look rather more like zc.buildout than setuptools, actually. Specifically in being data-driven rather than script-driven, and in the flexibility of what sort of parts get built and by what methods. Setuptools is still too rooted in distutils' world, the world where you can't depend on any other components being around to build things with.
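To make "data-driven rather than script-driven" concrete, here is a minimal sketch. It assumes a made-up INI-style spec; the section names, the `parts` and `builder` keys, and the `BUILDERS` table are all hypothetical and do not match zc.buildout's or setuptools' actual formats. The point is only that the project supplies data and the tool owns the code that acts on it.

```python
import configparser

# Hypothetical declarative build spec; section and key names are
# illustrative only and are not taken from zc.buildout or setuptools.
SPEC = """
[project]
name = example
parts = docs ext

[docs]
builder = copy
source = doc/

[ext]
builder = compile
source = src/fast.c
"""


def build_copy(part, options):
    # Stand-in for a real builder: just describe the step.
    return f"copy {options['source']} for part {part}"


def build_compile(part, options):
    return f"compile {options['source']} for part {part}"


# The tool, not the project, owns the mapping from builder names to code.
BUILDERS = {"copy": build_copy, "compile": build_compile}


def plan(spec_text):
    """Turn the declarative spec into an ordered list of build steps."""
    parser = configparser.ConfigParser()
    parser.read_string(spec_text)
    steps = []
    for part in parser["project"]["parts"].split():
        options = parser[part]
        builder = BUILDERS[options["builder"]]
        steps.append(builder(part, options))
    return steps


if __name__ == "__main__":
    for step in plan(SPEC):
        print(step)
```

Because the project ships no executable setup code in this sketch, new kinds of parts are added by registering another builder rather than by subclassing a build command.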
Frankly I'd like to see setuptools exploded, with those parts of general use folded back into the standard library, the creation of a set of non-implementation-specific documents of the distribution formats and behavior, leaving a small core of one implementation of how to do it and the door open for others to compete with their own implementation.
Apart from the exploding part, there are already documents. The only thing that makes them implementation-specific is that they haven't passed through any magic blessing process to make them standards.
You should document those ideas someplace and start getting community input. There are a lot of diverse opinions on the right way to do this and the way ahead is quite unclear.
We might be talking about different things, as I'm more concerned with replacing setuptools and distutils on the build-and-distribute side. What's needed there is more the weeding out of too many ways to do simple things, and fixing the complete absence of ways to do complex things. :) For simple things the distutils are too hard, and for slightly-more-complex things, the entry barrier encourages people to abandon and replace them. On the package management side, I'm somewhat more inclined to agree with the need for a community approach, though.
btw, offtopic question: are you by any chance the same Jeff Rush who invented EchoMail?
Yep, that's me. Not many remember the Fidonet days. I designed EchoMail on a napkin during a DFW Sysop pizza party during a conversation on what to do with the unused capability of inter-BBS private file transfers. It too escaped its ecosystem and spread like wildfire, almost getting banned from Fidonet. ;-)
Ah, so you *do* know what it's like to develop setuptools, then. I might even have met you at the one DFW sysop pizza party I ever attended. Back then, I ran the FreeZone, and before that, "Ferris Bueller's Fine Arts Forum", back in the late 80's and early 90's. My wife met me through the D/FW BBS list in the back of Computer Shopper, with a modem she bought at Software, Etc., up in Allen or wherever that place was. Not the chain store, the little consignment shop. Those were the days. But now we're *really* getting off-topic. :)

On Wed, Mar 19, 2008 at 6:15 PM, Jeff Rush <jeff@taupro.com> wrote:
Frankly I'd like to see setuptools exploded, with those parts of general use folded back into the standard library, the creation of a set of non-implementation-specific documents of the distribution formats and behavior, leaving a small core of one implementation of how to do it and the door open for others to compete with their own implementation.
If I may hazard an opinion seconding this sentiment: in my use of setuptools, it definitely feels like it wants to be three (mostly) independent projects:
1) The project that standardizes the concept now embodied by eggs and provides the basic machinery to work with them (find them, introspect metadata, "import" them, etc.), but not install them per se. This is generally useful as a common plug-in framework, if nothing else. Currently, this "run-time support" functionality is in pkg_resources.
2) The tool you can use to build eggs (but not install them per se). Currently this is the setuptools extension to distutils.
3) The tool for installing eggs (or their equivalent) and (optionally) their dependencies (optionally using remote hosts) as well as uninstalling. Currently this is easy_install (well, except for uninstalling, which is understandably quite difficult).
Finally, there is the fourth and already separate project of PyPI:
4) The hosted repository of publicly available eggs (or their equivalent). This should export any metadata required to resolve dependencies relatively cheaply.
Breaking them apart will make it easier to have two separate projects for building eggs (or their equivalents) -- one based on distutils and the other replacing it. Even more importantly, it will make it possible for multiple installers to be developed that scratch particular itches. Hopefully one would eventually emerge as the de-facto standard, but this will ultimately be decided by community adoption. Alex
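As a rough illustration of the first item (run-time support that finds distributions and introspects their metadata without installing anything), here is a small sketch. It uses the standard library's importlib.metadata as a stand-in, which is an assumption on my part: the thread is about pkg_resources, and importlib.metadata is a later module. The helper names are hypothetical.

```python
from importlib import metadata


def installed_summary():
    """Return sorted (name, version) pairs for distributions found on sys.path."""
    found = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name is None:
            continue  # skip entries whose metadata has no name
        found.append((name, dist.metadata.get("Version") or "unknown"))
    return sorted(found)


def declared_requirements(name):
    """Return the requirement strings a distribution declares; installs nothing."""
    try:
        return list(metadata.requires(name) or [])
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    for dist_name, version in installed_summary():
        print(dist_name, version, declared_requirements(dist_name))
```

Nothing here builds or installs; a separate build tool (item 2) and installer (item 3) could sit on top of the same read-only metadata layer.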

Phillip J. Eby writes:
7. Many wanted the ability to install files anywhere in the install tree and not just under the Python package. Under distutils this was possible but it was removed in setuptools for security reasons.
It wasn't security, it was manageability. Egg-based installation means containment (analogous to GNU stow), and therefore portability and disposability of plugins. (Which again is what eggs were really developed for in the first place.)
Defining containment this way doesn't help when preparing eggs for inclusion in a Linux distribution. E.g. users on these distributions are used to finding log files in /var/log (maybe in a subdir) and documentation in /usr/share/doc/<package name>. You probably will get different views about manageability depending on your background (used to Linux distribution standards or used to standards set by setuptools/cheeseshop). Packagers currently move these files manually to the standard locations and often have to keep symlinks in the egg dirs to these locations. Installation on Linux distributions is handled by existing package tools, which is unlikely to change. So it would be nice to find a common layer which can be used for both distribution methods, optionally enabling this with some kind of option like --install-files-in-places-not-handled-by-setuptools ;) Matthias
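One way to picture such a common layer is a sketch in which files are tagged with a category and an optional layout table decides where each category lands. Everything here is hypothetical: the `DISTRO_LAYOUT` table, the category names and the `install_path` helper are not an existing setuptools feature, and the directories are simply the ones named in the message above.

```python
from pathlib import PurePosixPath

# Hypothetical category -> directory templates, using the locations
# mentioned in the message; this is not an existing setuptools option.
DISTRO_LAYOUT = {
    "doc": "/usr/share/doc/{name}",
    "log": "/var/log/{name}",
}


def install_path(name, category, filename, egg_dir, layout=DISTRO_LAYOUT):
    """Return where a file would land.

    Categories present in the layout go to the distribution's standard
    location; everything else stays contained inside the egg directory.
    """
    template = layout.get(category)
    base = template.format(name=name) if template else egg_dir
    return str(PurePosixPath(base) / filename)


if __name__ == "__main__":
    egg = "/usr/lib/python2.5/site-packages/foo-1.0.egg"
    print(install_path("foo", "doc", "README.txt", egg))
    print(install_path("foo", "data", "table.csv", egg))
    # With an empty layout, every file stays inside the egg (contained mode).
    print(install_path("foo", "doc", "README.txt", egg, layout={}))
```

Passing an empty layout reproduces the contained egg behaviour, so the same metadata could serve both distribution methods.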

I've added your comments to a wiki page (http://wiki.python.org/moin/PackagingBOF) I was working on to summarize some of what went on during these BoF meetings, at least from the Enthought point-of-view. Unfortunately, I wasn't at the first night's event and don't yet have Travis Oliphant's notes on it here in front of me (he's still sprinting) so I only added some more detail to your comments, and also noted some previous issues we'd briefly discussed via e-mail with Phillip. It was great to see so many people interested in sharing their experiences and wanting to help things get better! As you can probably guess as a result of this being a two-night meeting, there wasn't enough time to discuss everything that needed to be brought up. It was suggested that a wiki page be created (see above) and that a new mailing list be set up for those who wanted to discuss further. (Some didn't feel the existing distutils-sig was appropriate.) I'll try to get the latter done shortly. -- Dave
Jeff Rush wrote:
I was in a Packaging BoF yesterday and, although not very relevant to the packager bootstrap thread, Guido has asked me to post some of the concerns.
The BoF drew about 15 people...
<snipped>

Jeff Rush writes:
I was in a Packaging BoF yesterday and, although not very relevant to the packager bootstrap thread, Guido has asked me to post some of the concerns.
We did address many topics on both days. I added the following topics, which were addressed at the Friday BoF only; see http://wiki.python.org/moin/PackagingBOF
- Linux distributions try to ship only one version of a package/egg/module in one release, shipping more than one version only if necessary. Eggs (at least as shipped with Debian, Fedora, Ubuntu) are all built using --single-version-externally-managed.
- import foo should work whether installed as an egg or installed with distutils, and without using pkg_resources.require
- pkg_resources should handle the situation of one egg version installed as --single-version-externally-managed (default version) and one or more eggs installed not using --single-version-externally-managed. Currently these additional versions cannot be imported.
- It would be useful if setuptools could handle separate build and install steps like most configure/make/make install systems do. Access to external resources should optionally be disabled during a build.
- The idea was brought up to use a to-be-defined api-version to describe dependencies between eggs. Version numbers are generally used for more than API changes; the idea follows existing practice for shared object names, only changing when the API is changed.
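The api-version idea in the last item can be sketched roughly as follows. Since the api-version is explicitly "to be defined", the `api_version` field, the `Release` type and `pick_release` are all hypothetical. The sketch only shows the intended effect: a dependency names an API level, and any release at that level qualifies regardless of its release number.

```python
from typing import NamedTuple, Optional


class Release(NamedTuple):
    version: tuple      # ordinary release version, e.g. (1, 4, 2)
    api_version: int    # hypothetical field, bumped only on API changes


def pick_release(releases, required_api) -> Optional[Release]:
    """Pick the newest release whose API level matches the requirement."""
    compatible = [r for r in releases if r.api_version == required_api]
    return max(compatible, key=lambda r: r.version, default=None)


if __name__ == "__main__":
    available = [
        Release((1, 0, 0), 1),
        Release((1, 4, 2), 1),   # bug-fix releases keep api_version 1
        Release((2, 0, 0), 2),   # incompatible change bumps it
    ]
    print(pick_release(available, 1))
    print(pick_release(available, 3))
```

This mirrors shared-object naming, where the soname changes only when compatibility breaks, so a distribution shipping a single version can satisfy every dependent that names the same API level.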

On Wed, Mar 19, 2008 at 1:05 PM, Matthias Klose <doko@cs.tu-berlin.de> wrote:
- It would be useful if setuptools could handle separate build and install steps like most configure/make/make install systems do. Access to external resources should optionally be disabled during a build.
What's wrong with "python setup.py bdist_egg"?
participants (8)
- "Martin v. Löwis"
- Alexander Michael
- Barry Warsaw
- Dave Peterson
- Jeff Rush
- Marius Gedminas
- Matthias Klose
- Phillip J. Eby