
At 01:17 PM 6/15/2005 -0500, Ian Bicking wrote:
Hi -- I'm getting ready to go out of town, so I haven't been able to track everything that's gone on in 0.4 (or 0.5 now?), and I might not be able to follow up on this conversation very thoroughly, but anyway...
I feel like we need several suggested ways to use easy_install and eggs, given different cases. This is probably a documentation task; I'm more interested in how I *should* use this, rather than how I *can*. I guess that's the same as best practices. In turn I'd like to echo these best practices in any documentation or examples that use easy_install/eggs.
Anyway, here's some cases:
Development
* I am developing an application. I don't have a distutils script for the application.
So your application's startup code needs to either require() all the libraries you need, or require an egg representing the application, which then lists all its dependencies. To create an egg representing the application, just create a MyApp.egg-info directory containing a PKG-INFO with a non-empty version number. Then you can create a depends.txt with your dependencies.
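As a rough sketch of the second option, the metadata directory can be written by hand or with a few lines of Python. The function name, the "MyApp" and "SomeLibrary" names, and the one-requirement-per-line layout of depends.txt are assumptions made for illustration; the only hard requirement stated above is a PKG-INFO with a non-empty version number.

```python
import os
import tempfile


def write_app_egg_info(project_dir, name, version, dependencies):
    """Create <name>.egg-info holding a PKG-INFO and a depends.txt."""
    egg_info = os.path.join(project_dir, name + ".egg-info")
    os.makedirs(egg_info, exist_ok=True)

    # PKG-INFO: the version must be non-empty; the other two fields are
    # the usual header of the PKG-INFO format.
    with open(os.path.join(egg_info, "PKG-INFO"), "w") as f:
        f.write("Metadata-Version: 1.0\n")
        f.write("Name: %s\n" % name)
        f.write("Version: %s\n" % version)

    # depends.txt: assumed here to hold one requirement per line.
    with open(os.path.join(egg_info, "depends.txt"), "w") as f:
        for requirement in dependencies:
            f.write(requirement + "\n")
    return egg_info


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(write_app_egg_info(demo_dir, "MyApp", "0.1", ["SomeLibrary>=1.0"]))
```

The startup script would then require "MyApp" itself rather than listing every library.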
* I am simultaneously developing distutils-based libraries. I want to edit these during my development process.
Check out their source, run bdist_egg on them, and add their source directories to PYTHONPATH or a .pth file.
* I also have some distutils libraries that I'm not developing, but I need to install.
Run 'easy_install whatever' to get them.
* I also have some libraries lying around that I already installed (before easy_install existed). And some that are from OS packages.
If you have conflicting packages, you'll have to uninstall them. If they're not conflicting, you can create .egg-info directories alongside the packages, each with a PKG-INFO indicating the package's version number, so that you can safely include them in your depends.txt.
* I have other Python stuff on my computer that I don't want to mess up because of my ongoing development.
Okay, then you are going to have to use PYTHONPATH or have some kind of hook that your application uses to add locations to sys.path. The built eggs, however, can at least be dumped in the same directory as your app's startup script. Also if you're very careful, you can probably do subversion or CVS checkout setups such that all your development libraries can live in the same directory, with a bunch of .egg-info directories in there as well. This can be a bit tricky to set up, but if you're doing simultaneous development of a bunch of libraries, it might be worth the trouble.
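The "hook" can be very small. Here is a minimal sketch, assuming the built eggs sit in the same directory as the startup script; the function name is made up for this example, and it only puts the eggs on the path (it does no dependency resolution).

```python
import glob
import os
import sys


def add_project_paths(project_dir, path=None):
    """Prepend project_dir and any *.egg entries inside it to a path list."""
    if path is None:
        path = sys.path
    project_dir = os.path.abspath(project_dir)
    entries = [project_dir] + sorted(glob.glob(os.path.join(project_dir, "*.egg")))
    # Insert in reverse so the final order on the path matches `entries`.
    for entry in reversed(entries):
        if entry not in path:
            path.insert(0, entry)
    return entries


if __name__ == "__main__":
    # Typical use at the top of the application's startup script.
    print(add_project_paths(os.path.dirname(os.path.abspath(sys.argv[0]))))
```

Because only the running process's sys.path changes, other Python software on the machine is left alone.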
So, given this, what commands do I run? Where do I put files?
My (evolving) thinking on this is the notion of "project directories", where a project directory contains:
* A setup.py (optional but recommended)
* A setup.cfg (optional, but useful for configuring easy_install)
* .egg-info directory and package directory checkouts for all libraries I'm developing as part of the project
* All scripts being developed or used by the project
* All eggs installed solely for the project's use
What if I don't have permission to put files in global locations (site-packages)?
Use '-d.' and run EasyInstall in the project directory.
How do I clean up after myself later?
Wipe out the project directory. In one of the next phases of EasyInstall development, I plan to add a 'package' script with various subcommands like 'list', 'upgrade', etc. I was already thinking it would be useful to have a 'package create' command that would create a basic project directory for you. Now that I've thought about your scenario here a bit, I think I'd like to also have a 'package develop' that would let you check out packages and their .egg-infos (given some kind of configuration of their CVS/SVN locations) and add them to your depends.txt if they weren't already mentioned. I can also envision a variety of tools springing up in package.py for manipulating the project environment.
Later, this becomes...
Deployment
* I have developed a web application. Maybe it also doesn't have a distutils script...?
Note that with setuptools, a setup script doesn't have to say much besides metadata like name, version, author, etc. You can use the 'find_packages()' function to automatically include all packages (perhaps excepting those that have been somehow marked as being checked out by 'package.py develop'), so it's mostly a matter of listing scripts and such. Note, by the way, that information like author, license, and a number of other things could easily be defaulted by a configuration file, too, if set in e.g. the ~/.pydistutils.cfg file on a per-user basis.
It could, though currently I don't develop one for my web applications. Also, I sometimes make hot fixes, especially when the application is deployed but not yet live.
Surely this could be done by deploying a project directory?
* How and where should libraries be installed? How should application dependencies be expressed?
Same as before: install into any directory on sys.path, and express dependencies via depends.txt.
* Some libraries are internal, and so aren't available from a public location. Maybe on the web with HTTP auth, though I'm more inclined to simply keep them in a branch in the private repository. Or fetch over scp.
Sure; EasyInstall also supports "find_links" pages that list links to source archives or eggs, so you can use this technique to access them more easily by putting the download pages in your configuration file(s).
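For illustration, a project's setup.cfg could carry both the private download page and the '-d.' setting suggested earlier. The sketch below generates such a file with the standard configparser module; the URL is a placeholder, and treating "install_dir" and "find_links" as the config-file spellings of the command-line options is an assumption to check against the EasyInstall docs.

```python
import configparser
import io


def easy_install_config(find_links, install_dir="."):
    """Return setup.cfg text containing an [easy_install] section."""
    config = configparser.ConfigParser()
    config["easy_install"] = {
        "install_dir": install_dir,  # assumed equivalent of '-d.'
        "find_links": find_links,    # page listing links to archives or eggs
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder URL; substitute the real private download page.
    print(easy_install_config("http://example.com/private/downloads.html"))
```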
* Should I change my require()s to use a specific version of the libraries, so that I don't accidentally upgrade (/break) the application when a later application is installed? How do I manage that process?
That won't prevent breakage, if you end up with a conflict between the versions required by multiple components. Sadly, I have no silver bullet for you regarding management.
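To make the failure mode concrete, here is a toy model (plain dictionaries, not pkg_resources) of two components that each pin an exact version of the same library. All component and library names are invented.

```python
def find_pin_conflicts(pins_by_component):
    """Return libraries that different components pin to different versions."""
    pins_by_library = {}
    for component, pins in pins_by_component.items():
        for library, version in pins.items():
            pins_by_library.setdefault(library, {})[component] = version
    # A library conflicts when its pins name more than one distinct version.
    return {
        library: pins
        for library, pins in pins_by_library.items()
        if len(set(pins.values())) > 1
    }


if __name__ == "__main__":
    print(find_pin_conflicts({
        "AppA": {"SomeLibrary": "1.0"},
        "AppB": {"SomeLibrary": "1.1", "OtherLibrary": "2.0"},
    }))
```

Each pin is fine on its own; the trouble appears only when both components have to be active in the same process.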
* Later a library might be broken and I need to fix it. Is there anything I should do on installation so I can later track who uses what library? Also so I can collect unused versions of libraries.
Commands to do these things should probably be added to the 'package.py' script, once it exists. The information is certainly there.
Non-code
* I'd like to distribute some data that doesn't have much (or maybe any) code. This might be a Javascript library, or a skin for the application (a bunch of templates and images), or something like that. Can I facilitate that with easy_install?
You need at least a zero-length __init__.py that the data lives alongside, but yes. (i.e., using an otherwise-empty package as a data carrier is fine.)
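A small sketch of the data-carrier idea: a package whose only "code" is an empty __init__.py, with the data files read back by package name. The helper and package names are invented, and pkgutil.get_data() from the standard library is used here as a stand-in for whatever resource API the application actually uses.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_data_package(parent_dir, package_name, files):
    """Create a package that is just an empty __init__.py plus data files."""
    package_dir = os.path.join(parent_dir, package_name)
    os.makedirs(package_dir)
    # The zero-length __init__.py is what makes the directory a package.
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    for filename, content in files.items():
        with open(os.path.join(package_dir, filename), "wb") as f:
            f.write(content)
    return package_dir


if __name__ == "__main__":
    parent = tempfile.mkdtemp()
    make_data_package(parent, "myapp_skin", {"style.css": b"body { margin: 0 }\n"})
    sys.path.insert(0, parent)
    importlib.invalidate_caches()
    # Data is located relative to the package, not to a filesystem path.
    print(pkgutil.get_data("myapp_skin", "style.css"))
```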
Enabling plugins
* I have library A, and library B. Library B optionally provides a "plugin" to library A, but both are usable in isolation. Library B needs to inject stuff into library A -- i.e., at runtime some code in library B enhances library A. How do I make this work? How do I make library A aware of library B?
I plan to add an 'add_activation_listener()' API to pkg_resources that will let library A get callbacks when new eggs are activated (and will also get callbacks for the eggs that are already activated when the listener is added). This will let library A ask for files in library B's metadata (EGG-INFO/.egg-info) directory, and do any desired automatic registration. Library A will have to define a metadata standard for what filename it looks for and what the contents should be. I expect Paste, Twisted, PEAK, and Zope will probably all want to define such standards. Of course, performance for this is O(nm) where n is the number of listeners and m is the number of eggs, but n will usually be small. However, zipped distributions can check for the existence of files with a simple dictionary lookup, so performance there will be good. At some point we'll need to start pushing people to make their eggs "compression-ready, for enhanced performance", so EasyInstall can be '--zip-ok' by default instead of only doing it for packages distributed as .egg files.
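Since that API doesn't exist yet, here is only a toy model of the behavior described: listeners are called for eggs that are already active and for each egg activated later. The class, the dict-based "eggs", and the "liba_plugins.txt" metadata filename are all invented for illustration.

```python
class ToyWorkingSet:
    """Toy stand-in for the set of activated eggs (not pkg_resources code)."""

    def __init__(self):
        self.activated = []  # the m eggs
        self.listeners = []  # the n listeners

    def add_activation_listener(self, callback):
        self.listeners.append(callback)
        # Replay eggs that were activated before this listener registered.
        for egg in self.activated:
            callback(egg)

    def activate(self, egg):
        self.activated.append(egg)
        for callback in self.listeners:
            callback(egg)


def make_plugin_registrar(metadata_filename, registry):
    """Library A's listener: register eggs that ship the agreed metadata file."""
    def on_activate(egg):
        # Metadata is modeled as a dict, so the check is one dictionary lookup.
        if metadata_filename in egg["metadata"]:
            registry[egg["name"]] = egg["metadata"][metadata_filename]
    return on_activate


if __name__ == "__main__":
    working_set = ToyWorkingSet()
    working_set.activate({"name": "LibraryB", "metadata": {"liba_plugins.txt": "libb.plugin"}})
    plugins = {}
    working_set.add_activation_listener(make_plugin_registrar("liba_plugins.txt", plugins))
    working_set.activate({"name": "LibraryC", "metadata": {}})
    print(plugins)
```

Every activation calls every listener, which is where the O(nm) cost comes from.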
Other People's Code
* Someone wrote some code I'd like to use. But it's poorly packaged -- maybe no setup.py, or maybe a bad one. For instance, I've decided that zpt.sf.net's setup.py is just broken -- you can't use extra_path, no package, and provide an __init__.py all at once. I'd like to write my own setup.py, but use that package. And it's on SF, so I'd like to use easy_install to download the package.
Make your setup.py use 'ez_setup' to bootstrap setuptools. Then create a setuptools.package_index.PackageIndex instance, then call its 'download()' method to retrieve the desired package, and use setuptools.archive_util.extract_archive() to unpack it. Then, write a flag file so that if setup.py is rerun you know you already did all that! Use appropriate directory information in the setup() call so that stuff is built from the extracted package, and of course supply correct metadata in your parent setup script. Voila.
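The flag-file part of that recipe might look roughly like this. To keep the sketch self-contained it uses shutil.unpack_archive() and a caller-supplied fetch function as stand-ins for the extract_archive() and PackageIndex download steps; the ".unpacked" flag name and the function name are invented.

```python
import os
import shutil

FLAG_NAME = ".unpacked"  # hypothetical flag-file name


def ensure_unpacked(archive_path, extract_dir, fetch=None):
    """Unpack archive_path into extract_dir once; return True if work was done."""
    flag = os.path.join(extract_dir, FLAG_NAME)
    if os.path.exists(flag):
        return False  # an earlier setup.py run already did all this

    if fetch is not None and not os.path.exists(archive_path):
        fetch(archive_path)  # stand-in for the download step

    os.makedirs(extract_dir, exist_ok=True)
    shutil.unpack_archive(archive_path, extract_dir)  # stand-in for extract_archive()

    # Write the flag last, so an interrupted run is retried next time.
    with open(flag, "w") as f:
        f.write(os.path.basename(archive_path) + "\n")
    return True
```

The parent setup.py would call this before setup() and then point its directory options at the extracted tree.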
Those are some of the things I'd like to do now -- easy_install doesn't have to magically make all of them work wonderfully; if I have to do things by hand, keep separate records, write custom code, or whatever, that's fine; I just want to know what I should be doing right now for each of these cases. Also, I'm interested in conventions we can define so that we all start doing the same thing.
Luckily, it's pretty magically wonderful for most of what you describe, and for the rest there mostly already exist plans to take care of 'em. Mostly, the support is still a bit weak on the development side of things, but I think we can fix those right up with some "wizards" in package.py to perform common operations like creating a package, setting your default author name and other data, doing CVS/SVN checkouts, etc. (And perhaps the folks doing Python IDEs will start putting menu items in their systems to run these commands, too.) Also, documentation is really lacking right now, so if folks can contribute How-Tos, tutorials, and similar material to complement my overview and reference docs, it'll help other people start using all this.