[Python-Dev] Integrate BeautifulSoup into stdlib?

Tres Seaver tseaver at palladion.com
Tue Mar 24 19:15:28 CET 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Cournapeau wrote:
> On Wed, Mar 25, 2009 at 2:20 AM, Tres Seaver <tseaver at palladion.com> wrote:
> 
>> Many of us using setuptools extensively tend to adopt an "isolated
>> environment" strategy (e.g., pip, virtualenv, zc.buildout).  We don't
>> install the packages used by different applications into shared
>> directories at all.  Instead, each environment uses a restricted subset
>> of packages known to work together.
> 
> Is that a working solution when you want to enable easy installation
> on a large number of "customers" ? In those discussions, I often see
> different solutions depending on the kind of projects people do. I
> don't know anything about plone, but I can imagine the deployment
> issues are quite different from the projects I am involved in (numpy
> and co).

Plone is downloaded and installed on a great many systems, across all the
"mainline" platforms.  In each case (since Plone 3.2), the installer is
based on (and includes) zc.buildout, and documents[1] how to add new
bits to the installed Plone by modifying the buildout.cfg file.
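
As a rough illustration of what "modifying the buildout.cfg file" amounts
to, here is a minimal Python sketch which appends an add-on to the 'eggs'
list of a buildout-style INI file.  The sample file contents, the
helper name 'add_egg', and the add-on name 'my.addon' are assumptions made
up for the example, not taken from the Plone documentation.  In practice
one would normally edit the file by hand and re-run the buildout.

```python
import configparser
import io

# Hypothetical, heavily trimmed buildout.cfg; a real Plone one is longer.
SAMPLE_CFG = """\
[buildout]
parts = instance
eggs =
    Plone

[instance]
recipe = plone.recipe.zope2instance
eggs = ${buildout:eggs}
"""


def add_egg(cfg_text, egg, section="buildout"):
    """Return cfg_text with `egg` appended to the section's 'eggs' list."""
    # interpolation=None leaves buildout's ${part:option} references alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    eggs = parser.get(section, "eggs", fallback="").split()
    if egg not in eggs:
        eggs.append(egg)

    # One egg per line, matching the usual buildout.cfg layout.
    parser.set(section, "eggs", "\n" + "\n".join(eggs))

    out = io.StringIO()
    parser.write(out)  # note: comments in the original file are not kept
    return out.getvalue()


if __name__ == "__main__":
    print(add_egg(SAMPLE_CFG, "my.addon"))
```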

> Every time I tried to understand what buildout was about, I was not
> even sure it could help for my own problems at all. It seems very
> specific to web development - I may completely miss the point ?

I think so:  it is not specific to web development, but is largely a way
to get repeatable / scripted deployment of software to disk.  It uses
setuptools to install Python package distributions, but can also use
other means (e.g., "configure; make; make install" to install a C
library such as libxml2).  The end result is a self-contained directory
tree:

- Scripts in the 'bin' directory are configured to have the specific
  Python packages (and versions) they require on the PYTHONPATH.

- By convention, released package distributions are installed into the
  'eggs' subdirectory, which is *not* on the PYTHONPATH, nor is it a
  'site' directory for Python.

- Other bits are typically in their own subdirectories, often under
  'parts'.
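
To make the first point concrete, the sketch below renders the general
shape of such a generated 'bin' script:  the required egg paths are
spliced onto the front of sys.path before the entry point is imported.
The template text, the helper name 'render_script', and the example paths
are illustrative assumptions, not the exact output of zc.buildout.

```python
# Approximate shape of a generated console script (an assumption for
# illustration; the real generator's output may differ in detail).
SCRIPT_TEMPLATE = '''\
#!{python}
import sys
sys.path[0:0] = [
{paths}
    ]

import {module}

if __name__ == '__main__':
    {module}.{func}()
'''


def render_script(python, egg_paths, module, func):
    """Return script text which pins `egg_paths` ahead of everything else."""
    # Each path becomes one quoted list entry, e.g. '/opt/app/eggs/foo.egg',
    paths = "\n".join("    %r," % p for p in egg_paths)
    return SCRIPT_TEMPLATE.format(
        python=python, paths=paths, module=module, func=func
    )


if __name__ == "__main__":
    # Hypothetical interpreter, egg, and entry point names.
    print(render_script(
        "/usr/bin/python",
        ["/opt/app/eggs/foo-1.0-py2.5.egg"],
        "foo.main",
        "main",
    ))
```

Because the path list is baked into each script, two scripts in the same
tree can pin different versions of the same package without either one
touching a shared 'site-packages'.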

> virtualenv, pip, yolk, those are useful tools for development/testing,
> but I don't see how they can help me to make the installation of a
> numpy environment easier on many different kind of platforms.

When not doing Plone / Zope-specific work (where zc.buildout is a de
facto standard), I use 'virtualenv' to create isolated environments into
which I install the libraries for a given application.  If your
application ships as Python package distributions, then documenting the
use of 'virtualenv' as a "supported" way to install it might reduce your
support burden.  You can even ship a virtualenv-derived script which
pre-installs your own packages into such an environment, isolated from
the other packages installed on the machine.
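
A minimal sketch of creating such an isolated environment from Python
follows.  It swaps in the standard library's 'venv' module (available in
Python 3.3 and later, and a descendant of the virtualenv idea) rather than
virtualenv itself, purely so the example is self-contained.  The helper
name 'make_isolated_env' and the target directory are assumptions.

```python
import os
import tempfile
import venv


def make_isolated_env(env_dir):
    """Create an environment which ignores the system 'site-packages'.

    Returns the path of the environment's pyvenv.cfg file.
    """
    builder = venv.EnvBuilder(
        system_site_packages=False,  # the isolation being discussed
        with_pip=False,              # skip bootstrapping pip to keep this quick
    )
    builder.create(env_dir)
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "env")
    cfg_path = make_isolated_env(target)
    with open(cfg_path) as f:
        print(f.read())
```

A script shipped to users would normally pass with_pip=True and then use
the environment's own pip to install the application's packages.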

>>> If the problem is to get a recent enough version of the library, then
>>> the library would better be installed "locally", for the application.
>>> If it is too much a problem because the application depends on
>>> billions of libraries which are 6 months old, the problem is to allow
>>> such a dependency in the first place. What kind of nightmare would it
>>> be if programs developed in C would require a C library which is 6
>>> months old ? That's exactly what multiple-versions installations
>>> inflict on us. That's great for testing, development. But for
>>> deployment on end-user machines, the whole thing is a failure IMO.
>> It is wildly successful, even on platforms such as Windows, when you
>> abandon the notion that separate applications should be sharing the
>> libraries they need.
> 
> Well, I may not have been clear: I meant that in my experience,
> deploying something with several dependencies was easier with bundling
> than with a mechanism ala setuptools with *system-wide* installation
> of multiple versions of the same library. So I think we agree here:
> depending on something stable (python stdlib + a few well known
> things) system-wide is OK, for anything else, not sharing is easier
> and more robust in the current state of things, at least when one
> needs to stay cross platform.

You can think of zc.buildout or the virtualenv-based script as a form of
bundling, which bootstraps from another already-installed Python, but
remains isolated from anything in its 'site-packages'.

> Almost every deployment problem I got from people using my own
> softwares was related to setuptools, and in particular the multiple
> version thing.

I never even use the multi-version switch manually.  zc.buildout does,
but that is because it wants to control the PYTHONPATH by generating the
script code:  it doesn't ask users to tweak that.

> For end-users who know nothing about python package
> mechanism, and do not care about it, that's really a PITA to debug,
> and give bad mouth taste. The fact that those problems happen when my
> software was not even *using* setuptools/etc... was a real deal
> breaker for me, and I am strongly biased against setuptools ever
> since.

I don't know why anybody who was not writing a packaging tool, or
packaging a library for something like .deb / .rpm, would even use the
multi-version option for setuptools:  I don't see any sane way to
install conflicting requirements into a shared 'site-packages'.

>> FHS is something which packagers / distributors care about:  I strongly
>> doubt that the "end users" will ever notice, particularly for silliness
>> like 'bin' vs. 'sbin', or architecture-specific vs. 'noarch' rules.
> 
> That's not silly, and that's a bit of a fallacy. Of course end users
> do not care about the FHS in itself, but following the FHS enables
> good integration in the system, which end users do care about. I like
> finding my doc in /usr/share/doc whatever thing I install - as I am
> sure every window user like to find his installed software in the
> panel control.


>> As a counter-example, I offer the current Plone installer[1], which lays
>> down a bunch of egg-based packages in a cross-platform way (Windows,
>> MacOSX, Linux, BSDs).  It uses zc.buildout, which makes
>> configuration-driven (repeatable) installation of add-ons easy.
> 
> But zc.buildout is not a solution to deploy applications, right ? In
> my understanding, it is a tool to deploy plone instances on
> server/test machines, but that's quite a different problem from
> installing "applications" for users who may not even know what python

In this case, Plone installs as an "application," in the sense that I
think you mean.  It happens to be one which can be extended after
installation via a set of add-ons, which are now mostly expected to be
installed *into* the Plone application environment using zc.buildout.


Tres.
--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJySNA+gerLs4ltQ4RAuX8AJ4pzev40jq9aQcCFM6P3a5+lUyungCghw1p
vSPEudj3quo+mQkiv+QhxCo=
=qjze
-----END PGP SIGNATURE-----


