At 02:08 PM 10/4/2006 -0400, Alexander Michael wrote:
In the past I've managed a shared library of Python packages by using distutils to install them in secondary Library and Scripts directories on a shared network drive. This worked fine, even in our multi-platform environment. With the advent of eggs, however, the secondary Library directory must be a formal "Site Directory" and not merely on sys.path. The extra delay introduced by the network layer means that simply getting --help for a simple script now takes almost three seconds, when it previously took only a tenth of a second. Some scripts that use many packages installed as eggs on the network drive can take as long as 8 seconds just to display the help message.
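(For reference, a "site directory" is one registered with the site module so that its .pth files are processed; merely appending it to sys.path is not enough for eggs to be activated. A minimal sketch, using a hypothetical share path:)

    import site

    # Hypothetical mount point for the shared network library directory.
    SHARED_LIB = r"\\fileserver\python\site-packages"

    # addsitedir() registers the directory and processes any *.pth files
    # in it (including easy-install.pth), which is what egg activation
    # needs; a bare sys.path.append(SHARED_LIB) would not do this.
    site.addsitedir(SHARED_LIB)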
I would like to install architecture-independent Python packages in a single shared location so that everyone using that location is automatically upgraded. The in-house packages are modified about five times a day on average. I would like to take advantage of setuptools versioning (and thus the pkg_resources mechanisms) so deprecated portions of the system can be kept intact in some frozen state of development without having to include the version number in the package name explicitly (e.g. mymod, mymod2, ..., mymod42).
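(For reference, that versioning works roughly like this: the packages are installed as multi-version eggs, and each script declares the release it needs at runtime via pkg_resources. A hedged sketch, with invented version numbers:)

    import pkg_resources

    # An old script pins a frozen, deprecated release of the in-house
    # package; a newer script elsewhere might require "mymod>=2.0" instead.
    pkg_resources.require("mymod==1.4")

    import mymod  # resolves to whichever egg was activated above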
What is the recommended way of using eggs in such an environment?
I'm not sure I understand your question. If you want to avoid the overhead due to network latency, you'd have to put the packages on a local drive. If you want to avoid the additional network round trips caused by setuptools looking for packages, you'll need to do away with the eggs (e.g. by installing everything with a package manager like RPM, or with bdist_wininst, etc.).
I don't think there is any obvious way of accomplishing what you want without some way to "notice" that a newer version of something is available, yet without using the network. That seems to be a contradiction in terms.
The closest thing I know of to what you're doing here is using "setup.py develop" on local revision-control checkouts of shared packages, but that requires that somebody explicitly update changed packages, or at least periodically run a script to do so.
If I were in a situation like yours, I would arrange a revision control setup that allows all the subproject trees to be checked out under a common root, plus a script that updates each tree and reruns "setup.py develop" whenever anything changed, and then leave it to the devs to decide when they want to sync. They could also skip running "develop" on (or skip syncing) packages they didn't want.
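A rough sketch of such a sync script, assuming Subversion checkouts under a common root (the root path and the changed-or-not check are illustrative; adapt them to your layout and VCS):

    import os
    import subprocess
    import sys

    PROJECTS_ROOT = os.path.expanduser("~/projects")  # hypothetical root

    for name in sorted(os.listdir(PROJECTS_ROOT)):
        tree = os.path.join(PROJECTS_ROOT, name)
        if not os.path.isdir(os.path.join(tree, ".svn")):
            continue  # not a checkout; skip it

        # Update the working copy; svn reports "Updated to revision N."
        # only when the update actually pulled in changes.
        output = subprocess.check_output(["svn", "update", tree], text=True)
        changed = "Updated to revision" in output

        # Re-run "setup.py develop" only for trees that changed.
        if changed and os.path.exists(os.path.join(tree, "setup.py")):
            subprocess.check_call([sys.executable, "setup.py", "develop"],
                                  cwd=tree)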
Even without eggs being involved, however, I find it hard to imagine importing Python code over a networked filesystem, although I've heard rumors that Google does this.
If you have to have a networked filesystem, however, I think you'll have to do without versioning, because it adds too many additional network round trips. The only thing I can think of that could work around this would be some sort of client-side caching of egg contents, so that startups can happen faster.