At 10:27 AM 10/5/2005 +0200, M.-A. Lemburg wrote:
[Some comments on your strategy...]
Phillip J. Eby wrote:
The new setuptools is all nice and easy for end users, but as a package maintainer, I'd like to have the option of building a binary package without all the dependencies.
In the long run, this should be done by packaging the result of bdist_egg, and bdist_rpm now does this by default. In the short term, unless you're switching to an all-egg distribution, you'll probably want to use legacy/unmanaged mode.
I think you are missing his point here:
As a package maintainer you *have* to be able to build a distribution package without all the dependency checks being applied - how else would you be able to bootstrap the package in case you have circular dependencies?
In legacy/unmanaged mode, setuptools' "install" command behaves the way the standard distutils "install" does today, without creating an egg or searching for dependencies.
I don't think that eggs are the solution to everything, so you should at least extend the dependency checking code so that it detects already installed packages (by trying an import and looking at __version__ strings), or add an option to tell the system: "this dependency is satisfied, trust me".
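The "try an import and look at __version__" check suggested above could be sketched roughly as follows. This is only an illustration: the function names, the `trusted` parameter (standing in for the "trust me" option), and the naive numeric version comparison are all assumptions, not anything setuptools provides.

```python
import importlib


def detect_installed_version(module_name):
    """Return the module's __version__ as a string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    version = getattr(module, "__version__", None)
    return str(version) if version is not None else None


def _numeric_prefix(version):
    """Turn '1.10.2b1' into (1, 10); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_satisfied(module_name, minimum, trusted=()):
    """Hypothetical check: names listed in `trusted` are accepted unseen."""
    if module_name in trusted:
        return True
    found = detect_installed_version(module_name)
    if found is None:
        return False
    # Naive comparison on the leading numeric components only.
    return _numeric_prefix(found) >= _numeric_prefix(minimum)
```

Note that this sketch really does import the package, with whatever side effects that has, and it silently treats a package without a __version__ attribute as missing.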
There are plans to have a feature like that, and in fact setuptools already has code to hunt down __version__ strings and the like, without even needing to import the packages. It isn't integrated with the rest of the system yet, though. One reason for that is that early feedback suggests that package developers and users would rather have the assurance of having the exact version required by something, as long as the installation process doesn't impose any additional burden on them. Local detection hacks have been primarily requested by packagers, who (quite reasonably) do not want to have to repackage everything as eggs.

There is a simple trick that packagers can use to make their legacy packages work as eggs: build .egg-info directories for them in the sys.path directory where the package resides, so that the necessary metadata is present. This does not require the use of .pth files, but it does slow down the process of package discovery for things that do use pkg_resources to locate their dependencies. It also still requires them to repackage existing packages, but doesn't require changing the layout.

Also, such packages will currently cause easy_install to warn about conflicting packages if you try to install a different version of the same package, but this will be alleviated soon, as I'm working on a better conflict management mechanism that will allow egg directories on PYTHONPATH to override things in the standard directories. (Currently, eggs are only ever added to the end of sys.path, so if the local packaging system puts .egg-info directories in site-packages, there would be no way to locally override that for an individual user's packages. A future version of setuptools will resolve that issue soon, hopefully in the next few weeks.)

As for eggs being the "solution to everything", I would like to point out that what precisely constitutes an egg is an extensible concept.
See e.g.:

http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html

which shows that there are actually three formats that are "eggs" at the moment:

1. .egg zipfiles
2. .egg directories
3. .egg-info marker directories

The key requirements for a format to be a pluggable distribution or "egg" are:

* Adding it to sys.path must make it importable
* It must be possible to discover its PyPI project name (and preferably version and platform) from the filename
* It must allow arbitrary data files and directories to be included within packages, and allow arbitrary metadata files and directories to be included for the project as a whole
* It must include the standard PKG-INFO metadata

These are the absolute minimums, but there are additional specific metadata files and directories that easy_install requires in order to detect possible conflicts, create scripts, etc. Anyway, the point is that what constitutes an "egg" is flexible, but the "add to sys.path and make it importable" requirement certainly limits what formats are practically meaningful. Nonetheless, further extensibility is certainly possible if there's need.
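To illustrate the requirement that the project name, version and platform be discoverable from the filename, here is a minimal sketch of a parser for .egg names. It assumes a name-version[-pyX.Y[-platform]].egg layout; the regular expression and the parse_egg_filename() helper are written for this example and are not the code pkg_resources itself uses.

```python
import re

# Layout assumed for illustration: name-version[-pyX.Y[-platform]]
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+)(?:-(?P<platform>.+))?)?$"
)


def parse_egg_filename(filename):
    """Return a dict of name/version/pyver/platform, or None if not an egg."""
    if not filename.endswith(".egg"):
        return None
    match = EGG_NAME.match(filename[: -len(".egg")])
    return match.groupdict() if match else None


if __name__ == "__main__":
    for example in ("FooBar-1.2-py2.4-win32.egg", "FooBar-1.2.egg"):
        print(example, "->", parse_egg_filename(example))
```

Because the platform part is matched last and greedily, platform strings that themselves contain hyphens are kept whole; parts that are absent from the filename come back as None.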
Please make sure that your eggs catch all possible Python binary build dimensions:
* Python version
* Python Unicode variant (UCS2, UCS4)
* OS name
* OS version
* Platform architecture (e.g. 32-bit vs. 64-bit)
As far as I know, all of this except the Unicode variant is captured in distutils' get_platform(). And if it's not, it should be, since it affects any other kind of bdist mechanism.
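As a rough illustration of which of these dimensions a running interpreter can report, the sketch below collects them into a dict. It uses sysconfig.get_platform() from the standard library rather than the distutils function discussed here, since distutils is no longer shipped with recent Python releases; the build_dimensions() helper and its key names are made up for this example.

```python
import sys
import sysconfig


def build_dimensions():
    """Collect the binary build dimensions of the running interpreter."""
    return {
        "python_version": "%d.%d" % sys.version_info[:2],
        # Narrow (UCS2) builds report 0xFFFF; wide builds report 0x10FFFF.
        "unicode_variant": "UCS2" if sys.maxunicode == 0xFFFF else "UCS4",
        # One string covering OS name plus, depending on the OS,
        # OS version and/or machine architecture.
        "platform": sysconfig.get_platform(),
        "pointer_bits": 64 if sys.maxsize > 2**32 else 32,
    }


if __name__ == "__main__":
    for key, value in sorted(build_dimensions().items()):
        print("%-16s %s" % (key, value))
```

The Unicode variant is the one dimension the platform string does not carry, which is why it is probed separately through sys.maxunicode.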
and please also make this scheme extendable, so that it is easy to add more dimensions should they become necessary in the future.
It's extensible by changing the get_platform() and compatible_platforms() functions in pkg_resources. By the way, I've issued requests on this list at least twice over the past year for people to provide input about how the platform strings should work; I got no response to either call, so I gave up. Later, when an OS X upgrade created a compatibility problem, somebody finally chipped in with info about what good OS X platform strings might be. I suspect that basically we'll get good platform strings once there are enough people encountering problems with the current ones to suggest a better scheme. :( If you have suggestions, please make them known, and let's get them into the distutils in general, not just our own offshoots thereof.
To make things easier for the user, the install system should be capable of detecting all these dimensions and use appropriate defaults when looking for an egg.
That's done for those dimensions currently handled by get_platform(), and can be changed by changes to get_platform() and compatible_platforms() in pkg_resources.
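The general shape of such a compatibility hook can be sketched as below. This is a simplified stand-in written for illustration, not the pkg_resources implementation: platforms_compatible() and pick_eggs() are hypothetical names, and the only rules applied are "no platform string means it runs anywhere" and exact string equality.

```python
def platforms_compatible(provided, required):
    """True if an egg built for `provided` can be used where `required` runs."""
    # A missing platform string is treated as "pure Python, runs anywhere".
    if provided is None or required is None:
        return True
    # Any extra dimension or looser matching rule would be added here.
    return provided == required


def pick_eggs(candidates, running_platform):
    """Filter (egg_name, platform) pairs down to those usable here."""
    return [
        name
        for name, plat in candidates
        if platforms_compatible(plat, running_platform)
    ]
```

With a single function deciding compatibility, adding a new dimension means changing how the platform string is produced and how two such strings are compared, and nothing else in the lookup code.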
Please reconsider your use of .pth files - these cause the Python interpreter startup time to increase significantly. If you just have one of those files pointing to your managed installation path used for eggs, that should be fine (although adding that path to PYTHONPATH still beats having a .pth to parse every time the interpreter fires up).
EasyInstall uses at most one .pth file, to allow packages to be on the path at runtime without needing an explicit 'require()'. However, a vendor creating packages probably doesn't want to have to edit that .pth file, so a trivial alternative is to install a .pth for each package. The tradeoff is startup time versus packager convenience in that case. Having a tool to edit a single .pth file would be good, but not all packaging systems have the ability to run a program at install or uninstall time. If they do, then editing easy-install.pth to add or remove eggs is a better option. Eggs can of course be installed in multi-version mode, in which case no .pth is necessary, but then an explicit require() or a dependency declaration in a setup script is necessary in order to use the package.
If, however, you install a .pth file for every egg, you'll soon end up with an unreasonable startup time which slows down your whole Python installation - including applications that don't use setuptools or any of the eggs.
A single .pth file is certainly an option, and it's what easy_install itself uses.