zc.buildout and site-packages

Hi,
Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the mangled sys.path when it generates scripts?
We have people doing horrid things to their global python, and we need the buildout to be safe and isolated in these environments.
Martin

On Thu, Oct 22, 2009 at 10:36 AM, Martin Aspeli <optilude@gmail.com> wrote:
Hi,
Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the mangled sys.path when it generates scripts?
Not now, but soon. Gary Poster has implemented this on a branch. I need to review and merge his changes.
Jim
--
Jim Fulton

Martin Aspeli wrote:
Hi,
Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the mangled sys.path when it generates scripts?
We have people doing horrid things to their global python, and we need the buildout to be safe and isolated in these environments.
Using a --no-site-packages virtualenv to drive the buildout is a pretty lightweight solution, and easier than the old standby of compiling your own Python to get isolation from the global one, which is still highly recommended: I build my own Python, and then use a separate virtualenv for each project.
Tres.
--
Tres Seaver, Palladion Software, http://palladion.com

On Oct 22, 2009, at 10:43 AM, Tres Seaver wrote:
Martin Aspeli wrote:
Hi,
Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the mangled sys.path when it generates scripts?
We have people doing horrid things to their global python, and we need the buildout to be safe and isolated in these environments.
Using a --no-site-packages virtualenv to drive the buildout is a pretty lightweight solution, and easier than the old standby of compiling your own Python to get isolation from the global one, which is still highly recommended: I build my own Python, and then use a separate virtualenv for each project.
The idea behind Gary's branch (http://svn.zope.org/zc.buildout/branches/gary-4-include-site-packages) is that, unlike the --no-site-packages option of virtualenv, which is an all-or-nothing proposition, you would be able to include site-package locations in Buildout's script generation. Care would be taken, if distributions are selected from a site-package location, that the site-package locations included on sys.path don't overshadow any other paths pointing explicitly to already picked versions of distributions.
For example, if I was using Apple's System Python on Leopard (10.5), then site-packages includes zope.interface 3.3.0 and bdist_mpkg 0.4.3. If I wanted to pick 'zope.interface == 3.3.0' and 'bdist_mpkg == 0.4.4', then currently Buildout could generate a path modification that looks like:
    sys.path[0:0] = [
        '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python',
        '/Users/kteague/buildouts/shared/eggs/bdist_mpkg-0.4.4-py2.5.egg',
        ]
Where that System path contains bdist_mpkg 0.4.3. Whether the site-package location is put before or after version-specific paths currently depends upon the ordering in the install_requires field (so you get the correct versions importable if those distributions which are picked from site-packages are listed after the non-site-package picked versions!). Obviously this is just a side-effect of the current path manipulation implementation.
One would assume that making this change is fairly easy. Just do a diff between normal sys.path and the site-package free sys.path when Python is launched with the -S flag.
Which Gary's code does, but the script generation in Gary's branch right now also accounts for the fact that *.pth files have been processed, and that you are allowed to have import statements executed when *.pth files are processed, so he is generating scripts which also clean sys.modules, and then re-add site-packages locations with site.addsitedir(location) so that .pth files are properly re-processed. Which is pretty fancy, and probably "Does the Right Thing (TM)", but also greatly clutters up the generated scripts.
I quite like having script generation generate scripts which are still reasonably compact (I often open generated scripts to see what Buildout is doing, or sometimes edit them to hand-pick a different egg if I want to quickly try out a different working set) and I also wonder how much overhead this additional processing adds (I guess this depends upon how much you have in site-packages).
So perhaps if there was some option to still generate scripts using the existing style of script generation - maybe a "i-keep-my-site-packages-clean=true" option ... i dunno, perhaps the other way 'work-around-site-package-madness-in-script-generation=true' ... or just merge Buildout and VirtualEnv into one monolithic project so that you don't need to install two tools just to be able to use Buildout with a dirty Python! (rawr!)
Anyways, for those distributions which are tough to install, I think some people will find this branch quite handy in that they can apt-get the tough to install distributions, and then safely include those distributions in working sets composed by Buildout.
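The "diff between normal sys.path and the site-package free sys.path" idea can be tried out in isolation. The following is only a minimal sketch of that comparison, not the code on Gary's branch; the helper names `interpreter_sys_path` and `site_added_paths` are made up for illustration, and it assumes a Python 3.7+ interpreter is available as `sys.executable`.

```python
import ast
import subprocess
import sys


def interpreter_sys_path(*flags):
    """Return sys.path as seen by a fresh interpreter started with the given flags."""
    out = subprocess.check_output(
        [sys.executable, *flags, "-c", "import sys; print(repr(sys.path))"],
        text=True,
    )
    # The child prints a list literal, so literal_eval is enough to read it back.
    return ast.literal_eval(out.strip())


def site_added_paths():
    """Entries that are on sys.path only when the site module has been processed."""
    bare = set(interpreter_sys_path("-S"))  # -S skips the site module entirely
    full = interpreter_sys_path()
    return [path for path in full if path not in bare]


if __name__ == "__main__":
    for path in site_added_paths():
        print(path)
```

The printed entries are the locations that the site module contributes (site-packages directories plus anything added by .pth files), which is the set a script generator would need to treat specially.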

On Oct 22, 2009, at 11:08 PM, Kevin Teague wrote:
On Oct 22, 2009, at 10:43 AM, Tres Seaver wrote:
Martin Aspeli wrote:
Hi,
Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the mangled sys.path when it generates scripts?
We have people doing horrid things to their global python, and we need the buildout to be safe and isolated in these environments.
Using a --no-site-packages virtualenv to drive the buildout is a pretty lightweight solution, and easier than the old standby of compiling your own Python to get isolation from the global one, which is still highly recommended: I build my own Python, and then use a separate virtualenv for each project.
The idea behind Gary's branch
To be clear, *an* idea. You can also just make a "don't give me what is in site-packages" gesture. (When you do that, in the current branch, the generated scripts still have the complexities you describe below, though.)
(http://svn.zope.org/zc.buildout/branches/gary-4-include-site-packages) is that unlike the --no-site-packages option of virtualenv, which is an all-or-nothing proposition, you would be able to include site-package locations in Buildout's script generation, but care would be taken that if distributions are selected from a site-package location to make sure that when site-package locations are included on sys.path, those locations don't overshadow any other paths pointing explicitly to already picked versions of distributions. e.g. If I was using Apple's System Python on Leopard (10.5), then site-packages includes zope.interface 3.3.0 and bdist_mpkg 0.4.3. If I wanted to pick 'zope.interface == 3.3.0' and 'bdist_mpkg == 0.4.4', then currently Buildout could generate a path modification that looks like:
    sys.path[0:0] = [
        '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python',
        '/Users/kteague/buildouts/shared/eggs/bdist_mpkg-0.4.4-py2.5.egg',
        ]
Where that System path contains bdist_mpkg 0.4.3. The ordering of whether the site-package location is put before or after version-specific paths is currently dependent upon the ordering in the install_requires field (so you get the correct versions importable if those distributions which are picked from site-packages are listed after the non-site-package picked versions!) - obviously this is just a side-effect of the current path manipulation implementation.
Not exactly. I was going to go for that, but it was too hard/insane. (Do I need to update some docs on the branch?) If you use this feature, then eggs from site-packages can be inserted cleanly along with other eggs. They can be chosen individually, without masking other eggs. Site-packages-like directories themselves--the directories that are not eggs, but collections of standard directory packages--always go at the end of the sys.path. Otherwise their contents might mask the eggs you chose. What we actually ended up using ourselves (Launchpad) is "don't use any eggs from site-packages, but let site-packages through at the end so we can get some of the non-egg things from it that our system is providing, like Postgres-Python bindings."
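The masking problem, and the "site-packages-like directories always go at the end" rule, can be reproduced with throwaway directories. This is a small sketch under stated assumptions: `demo_dist`, `order_paths`, `write_module`, `imported_version` and `demo` are hypothetical names invented here, the two directories only stand in for a site-packages directory and an egg, and none of it is taken from the branch itself.

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def order_paths(egg_paths, site_dirs):
    """Explicitly picked eggs first, site-packages-like directories last."""
    return list(egg_paths) + [d for d in site_dirs if d not in egg_paths]


def write_module(directory, version):
    """Create a one-line stand-in distribution called demo_dist in directory."""
    with open(os.path.join(directory, "demo_dist.py"), "w") as handle:
        handle.write("__version__ = %r\n" % version)


def imported_version(paths):
    """Report which demo_dist version a fresh interpreter imports for this ordering."""
    code = textwrap.dedent("""
        import sys
        sys.path[0:0] = %r
        import demo_dist
        print(demo_dist.__version__)
    """) % (paths,)
    # -S keeps the real site-packages out of the experiment.
    return subprocess.check_output([sys.executable, "-S", "-c", code], text=True).strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        site_dir = os.path.join(tmp, "site-packages")
        egg_dir = os.path.join(tmp, "demo_dist-0.4.4.egg")
        os.mkdir(site_dir)
        os.mkdir(egg_dir)
        write_module(site_dir, "0.4.3")  # the version the system happens to ship
        write_module(egg_dir, "0.4.4")   # the version that was explicitly picked
        masked = imported_version([site_dir, egg_dir])
        chosen = imported_version(order_paths([egg_dir], [site_dir]))
        return masked, chosen


if __name__ == "__main__":
    print(demo())
```

With the site-packages stand-in listed first, the older copy wins; with it moved to the end, the picked egg is the one that gets imported.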
One would assume that making this change is fairly easy. Just do a diff between normal sys.path and the site-package free sys.path when Python is launched with the -S flag. Which Gary's code does, but the script generation in Gary's branch right now also accounts for the fact that *.pth files have been processed, and that you are allowed to have import statements executed when *.pth files are processed, so he is generating scripts which also clean sys.modules, and then re-add site-packages locations with site.addsitedir(location) so that .pth files are properly re-processed. Which is pretty fancy, and probably "Does the Right Thing (TM)", but also greatly clutters up the generated scripts.
Mostly right, and granted that the scripts are bigger and more annoying than they are in trunk. FWIW, the "fancy" bits are not primarily because .pth files might import. It's more because the setuptools approach to creating namespace packages in site-packages--that is, the approach that OS distributions typically use--creates fake modules for the namespace packages. These mask any sys.path eggs in the same namespace packages, at least as of c9. We have to clean the fake modules out, set up the sys.path, import pkg_resources because that magically does the right thing for any eggs on the sys.path, and *then* process .pth files. (I hope that PEP 382 is accepted and helps.)
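The sequence described above (clean the fake modules out, set up sys.path, import pkg_resources, then process .pth files) might look roughly like the sketch below. It is not the generated script from the branch. It assumes that the fake namespace modules are module objects with a `__path__` but no `__file__`, which is my understanding of what setuptools' `-nspkg.pth` files create; on Python 3 the same test also matches PEP 420 namespace packages. The function names are hypothetical.

```python
import os
import site
import sys
import types


def is_fake_namespace_module(mod, site_dirs):
    """True for a module with no __file__ whose __path__ points into site_dirs."""
    if getattr(mod, "__file__", None):
        return False
    try:
        paths = [p for p in (getattr(mod, "__path__", None) or []) if isinstance(p, str)]
    except TypeError:
        return False
    roots = [os.path.abspath(d) + os.sep for d in site_dirs]
    return any(os.path.abspath(p).startswith(root) for p in paths for root in roots)


def drop_fake_namespace_modules(site_dirs, modules=None):
    """Remove matching entries from a sys.modules-like mapping; return their names."""
    modules = sys.modules if modules is None else modules
    removed = [
        name for name, mod in list(modules.items())
        if isinstance(mod, types.ModuleType) and is_fake_namespace_module(mod, site_dirs)
    ]
    for name in removed:
        del modules[name]
    return removed


def reset_import_state(egg_paths, site_dirs):
    """Apply the four steps in order (assumption: this order is what matters)."""
    drop_fake_namespace_modules(site_dirs)
    sys.path[0:0] = list(egg_paths)
    try:
        import pkg_resources  # noqa: F401  (ships with setuptools, may be absent)
    except ImportError:
        pass
    for directory in site_dirs:
        site.addsitedir(directory)  # appends the directory and processes its .pth files
```

Passing a plain dict as `modules` makes it possible to try `drop_fake_namespace_modules` without touching the running interpreter's real `sys.modules`.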
I quite like having script generation generate scripts which are still reasonably compact (I often open generated scripts to see what Buildout is doing, or sometimes edit them to hand-pick a different egg if I want to quickly try out a different working set)
Granted.
and I also wonder how much overhead this additional processing adds (I guess this depends upon how much you have in site-packages).
Any overhead is lost in the cost of importing pkg_resources. Launchpad has a whole bunch of dependencies (~170 eggs last I checked). It's trivial to generate both a ``PYTHONPATH= [...dependencies...] python`` and a faux Python interpreter generated by buildout that does the tricks that you describe. To make the PYTHONPATH approach work with the namespace package problem I described above, you have to hack site.py to import pkg_resources before it processes the .pth files. I compared the two approaches with Launchpad's ~170 dependencies using ``time ${INTERPRETER_CHOICE} -c ''``. They were equivalent in my tests. (FWIW, they were both about 20 times slower than ``time python -c ''``).
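A comparison along the lines of ``time ${INTERPRETER_CHOICE} -c ''`` can also be scripted. The sketch below is only an illustration: `startup_seconds` is a made-up helper, and the two command lines compared are the plain interpreter with and without the site module, standing in for the PYTHONPATH-based and buildout-generated interpreters, which are not reproduced here.

```python
import subprocess
import sys
import time


def startup_seconds(argv, repeats=5):
    """Best wall-clock time, over several runs, for argv to start and exit."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.check_call(argv)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    candidates = {
        "python -c ''": [sys.executable, "-c", ""],
        "python -S -c ''": [sys.executable, "-S", "-c", ""],
    }
    for label, argv in candidates.items():
        print("%-20s %.4f s" % (label, startup_seconds(argv)))
```

To compare other interpreters, replace the entries in `candidates` with the command lines of interest; taking the best of several runs reduces noise from disk caching.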
So perhaps if there was some option to still generate scripts using the existing style of script generation - maybe a "i-keep-my-site-packages-clean=true" option ... i dunno, perhaps the other way 'work-around-site-package-madness-in-script-generation=true' ... or just merge Buildout and VirtualEnv into one monolithic project so that you don't need to install two tools just to be able to use Buildout with a dirty Python! (rawr!)
I can understand the desire to make it possible to have a simpler script if you want to promise that you are going to have a clean site-packages. I'm not super-excited to add this feature and the related tests, but if that made it possible for my work to not be consigned to a branch forever, I suppose I'd sign up. Jim will be the arbiter there. And, as usual and of course, there are other approaches possible than the one I chose.
Anyways, for those distributions which are tough to install, I think some people will find this branch quite handy in that they can apt-get the tough to install distributions, and then safely include those distributions in working sets composed by Buildout.
Seems to be working for us. Thanks for looking at the branch, and for writing about it. Gary
participants (5)
- Gary Poster
- Jim Fulton
- Kevin Teague
- Martin Aspeli
- Tres Seaver