[Python-Dev] Integrate BeautifulSoup into stdlib?

Steve Holden steve at holdenweb.com
Fri Mar 13 14:14:34 CET 2009


R. David Murray wrote:
> On Fri, 13 Mar 2009 at 09:58, Chris Withers wrote:
>> Martin v. Löwis wrote:
>>> > In light of this, what I'd love to see (but sadly can't really help
>>> > with, and am not optimistic about happening) is for:
>>> >
>>> > - python to grow a decent, cross platform, package management system
>>> > - the standard library to actually shrink to a point where only
>>> >   libraries that are not released elsewhere are included
>>> >
>>> > I'd be interested to know how many users of python also felt this
>>> > way ;-)
>>>
>>>  I don't like the standard library to shrink. It's good that batteries
>>>  are included.
>>
>> If a decent package management system *was* included, this wouldn't be
>> an issue..
> 
> I disagree.  One of the jobs I've had is release management for
> internal software projects that depend on various external pieces.
> Release integration tested against specific versions of those external
> packages, and those were the packages that needed to wind up on the system
> when the release was installed.  I've done systems depending on both perl
> and python, and let me tell you, python is way, _way_ easier to manage.
> With python, I have a dependency on a particular python version, and then
> maybe one or two add on packages.  With perl, I have perl, and then I
> have a gadzillion cpan modules.  I don't know how many a gadzillion is,
> because what I wound up doing was making a local copy of the cpan archive,
> checking that in to the repository, and writing up some scripts that made
> sure I pulled the actual install from my cpan snapshot and supported the
> developers in updating that snapshot when we were building a new version.
> (Nor was that the only problem with perl....what idiot decided it was
> OK to interactively prompt for things during a batch install process?!
> And without providing any way to script the answers, at least that I
> could find!)
> 
> So I'm +1 for keeping the Python stdlib as comprehensive as sensible.
> (Please note that last word...I've no objection to pruning things
> that are no longer serving a useful purpose, or that are better
> managed outside the core.)
> 
Just for clarity, when I said a "jumbo distribution" I meant one with
all the libraries necessary to run and support a specified set of Python
functionality. The point is *not* to add other libraries (which
invites version creep and undermines stability) but to have everything
you need for a given (core plus) set of functionality.

I believe my original message acknowledged that this wouldn't be
everyone's cup of tea, but if a "good"* set of applications were
analyzed and a distribution built to support just those (presumably
Python) subsystems, you would have a good core that you can augment with
the system-installed Python if you are so minded.

A jumbo shouldn't try to replace the system-installed Python because
hopefully different Python (jumbo) distributions would coexist on the
same system, supporting those Python elements for which their
configuration is acceptable. I would not expect core developers to have
to pay the jumbos much mind, except in so far as they might require
support for (slightly?) different versions of Python.

A 1.5.2 process can talk to a 3.1 process without any problems at all as
long as the application protocol is unambiguous. Why this insistence on
trying to do everything with a single interpreter? Sure, it might use
more resources to have three different versions of PIL in use from three
different jumbos, but let's cross that bridge when we come to it.
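
As a rough illustration of the "unambiguous application protocol" point,
here is a minimal sketch in which a parent process hands a request to a
separate interpreter and reads back the reply. The protocol (one JSON
object per line, with "id", "args" and "result" keys) and the names
CHILD_SOURCE and call_child are invented for this example, and it assumes
both interpreters are recent enough to ship the json module.

```python
import json
import subprocess
import sys

# Child program: reads one JSON request per line on stdin and writes one
# JSON reply per line on stdout. It only relies on the wire format, not on
# sharing an interpreter with the caller.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": sum(request["args"])}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call_child(interpreter, request):
    """Send one request to a child interpreter and return its decoded reply."""
    proc = subprocess.run(
        [interpreter, "-c", CHILD_SOURCE],
        input=json.dumps(request) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # sys.executable stands in for "some other Python"; any interpreter
    # path that can run CHILD_SOURCE could be substituted here.
    print(call_child(sys.executable, {"id": 1, "args": [1, 2, 3]}))
```

Neither side imports anything from the other, so each could live in its
own jumbo with its own library versions.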

Naturally, in Python there are already several environments that will
compute the library subset required to support an application,
though at present they do it only across a single Python version and
application. However, writing a simple GUI or command-line program to
call the other Python modules will give you a single analyzable module
tree. You don't even have to distribute the GUI if you don't want ...
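
For what it's worth, the standard library's modulefinder module is one
way to get at such a module tree. The sketch below assumes a throwaway
driver script that does nothing but import what the application needs;
the helper name modules_needed and the driver's contents are made up for
illustration, and modulefinder only sees imports it can find statically.

```python
import os
import tempfile
from modulefinder import ModuleFinder


def modules_needed(script_path):
    """Return the sorted names of modules the script imports, directly or not."""
    finder = ModuleFinder()
    finder.run_script(script_path)
    # The script itself is recorded as "__main__"; leave it out of the result.
    return sorted(name for name in finder.modules if name != "__main__")


if __name__ == "__main__":
    # Hypothetical driver that just imports the modules the application uses.
    with tempfile.TemporaryDirectory() as tmp:
        driver = os.path.join(tmp, "driver.py")
        with open(driver, "w") as handle:
            handle.write("import json\n")
        for name in modules_needed(driver):
            print(name)
```

The resulting list is the candidate library subset for that one
interpreter version; it says nothing about non-Python dependencies.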

So I don't see "jumbo" as replacing "batteries included". More like
"comes with 14V 300Ah accumulator and support for domain name and email
services" or "suitable for GeoDjango virtuals under VirtualBox with NAT
addressing". For non-Python stuff (e.g. PostgreSQL, though SQLite is
plenty good enough for experiments) I think the API has to be stable
enough to accommodate some version variations.

regards
 Steve

* This is not a democracy: use your own prejudices to decide.
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
Holden Web LLC                 http://www.holdenweb.com/
Want to know? Come to PyCon - soon! http://us.pycon.org/


