[Python-Dev] Integrate BeautifulSoup into stdlib?

Toshio Kuratomi a.badger at gmail.com
Tue Mar 24 20:11:29 CET 2009


David Cournapeau wrote:
> On Wed, Mar 25, 2009 at 1:45 AM, Toshio Kuratomi <a.badger at gmail.com> wrote:
>> David Cournapeau wrote:
>>> 2009/3/24 Toshio Kuratomi <a.badger at gmail.com>:
>>>> Steve Holden wrote:
>>>>
>>>>> Seems to me that while all this is fine for developers and Python users
>>>>> it's completely unsatisfactory for people who just want to use Python
>>>>> applications. For them it's much easier if each application comes with
>>>>> all dependencies including the interpreter.
>>>>>
>>>>> This may seem wasteful, but it removes many of the version compatibility
>>>>> issues that otherwise bog things down.
>>>>>
>>>> The upfront cost of bundling is lower but the maintenance cost is
>>>> higher.  For instance, OS vendors have developed many ways of being
>>>> notified of and dealing with security issues.  If there's a security
>>>> issue with gtkmozdev and the python bindings to it have to be
>>>> recompiled, OS vendors will be alerted to it and have the opportunity to
>>>> release updates on zero day, the day that the security announcement goes
>>>> out.
>>> I don't think bundling should be compared to depending on the system
>>> libraries, but as a lesser evil compared to requiring multiple,
>>> system-wide installed libraries.
>>>
>> Well... I'm not so sure it's even a win there.  If the libraries are
>> installed system-wide, at least the consumer of the application knows:
>>
>> 1) Where to find all the libraries to audit the versions when a security
>> issue is announced.
>> 2) That the library is unforked from upstream.
>> 3) That all the consumers of the library version have a central location
>> to collaborate on announcing fixes to the library.
>
> Yes, those are problems, but installing multiple libraries has a lot of
> problems too:
>  - by allowing multiple installed versions, people quickly become very
> sloppy about handling versions of their dependencies, and this greatly
> increases the number of libraries installed - so the advantages listed
> above for system-wide installation are lost quite quickly

This is somewhat true.  Sloppiness and a growing number of libraries are
bad.  But there are checks on this sloppiness.  Distributions, for
instance, are quite active about porting software to use only a subset of
versions.
So in the open source world, there's a large number of players
interested in keeping the number of versions down.  Using multiple
libraries will point people at where work needs to be done whereas
bundling hides it behind the monolithic bundle.

>  - bundling also supports a real use case which cannot be solved by
> rpm/deb AFAIK: installation without administrator privileges.

This is only sort of true.  You can install rpms into a local directory
without root privileges with a command-line switch.  But rpm/deb are
optimized for system administrators so the documentation on doing this
is not well done.  There can also be code issues with doing things this
way but those issues can affect bundled apps as well. And finally, since
rpm's primary use is installing systems, the toolset around it builds
systems.  So it's a lot easier to build a private root filesystem than
it is to cherrypick a single package.  It should be possible to create a
tool that merges a system rpmdb and a user's local rpmdb using the
existing API but I'm not aware of any applications built to do that yet.
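
A minimal sketch of the merged view described above.  It does not use the
rpm API at all: each rpmdb is modelled, purely as an assumption for
illustration, as a plain mapping from package name to version, and the
package names, versions, and the merged_view helper are hypothetical.  The
point is only that a user's local database can shadow the system one
without modifying it.

```python
from collections import ChainMap

# Hypothetical stand-ins for the two package databases.  A real tool would
# read these from the system rpmdb and the user's local rpmdb.
system_db = {"python": "2.5.2", "gtk2": "2.14.7"}
local_db = {"python": "2.6.1", "myapp": "0.3"}


def merged_view(local, system):
    """Return a read-through view in which local entries shadow system ones.

    Neither input mapping is copied or modified; lookups consult the local
    mapping first and fall back to the system mapping.
    """
    return ChainMap(local, system)


view = merged_view(local_db, system_db)
for name in sorted(view):
    print(name, view[name])
```

Using a layered lookup rather than copying both databases into one keeps
the system database authoritative for everything the user has not
overridden.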

>  - multi-version installation gives very fragile systems. That's
> actually my number one complaint in python: setuptools has caused me
> numerous headaches, and I got many bug reports because you often do not
> know why one version was loaded instead of another one.
>
I won't argue for setuptools' implementation of multi-version.  It
sucks.  But multi-version can be done well.  Sonames in C libraries are
a simple system that does this better.
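
To illustrate the soname idea, here is a simplified model, not how any
real dynamic linker is implemented.  It assumes the soname encodes only
the ABI-incompatible major version (e.g. "libfoo.so.1") and that any
installed file whose name extends that soname with further numeric
components is compatible.  Real systems record the soname in the binary
and rely on symlinks maintained by tools such as ldconfig; the
resolve_soname function and the library names below are hypothetical.

```python
import re


def resolve_soname(soname, installed):
    """Return the newest installed file that satisfies soname, or None.

    soname    -- the requested name, e.g. "libfoo.so.1"
    installed -- iterable of installed file names, e.g. "libfoo.so.1.2.0"
    """
    prefix = soname + "."
    candidates = []
    for filename in installed:
        if not filename.startswith(prefix):
            continue
        tail = filename[len(prefix):]
        # Accept only purely numeric minor/patch components.
        if re.fullmatch(r"\d+(\.\d+)*", tail):
            version = tuple(int(part) for part in tail.split("."))
            candidates.append((version, filename))
    if not candidates:
        return None
    # Highest minor/patch version within the same major (ABI) version wins.
    return max(candidates)[1]


installed = ["libfoo.so.1.0.2", "libfoo.so.1.2.0", "libfoo.so.2.0.0"]
print(resolve_soname("libfoo.so.1", installed))
print(resolve_soname("libfoo.so.2", installed))
```

An application linked against major version 1 never sees version 2, so
both can be installed side by side while the number of versions in play
stays visible and bounded.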

> So I am not so convinced multiple-version is better than bundling - I
> can see how it sometimes can be, but I am not sure those are that
> important in practice.
>
Bundling is always harmful.  Whether multiple versioning is any better
is certainly debatable :-)

>> No.  This is a social problem.  Good source control only helps if I am
>> tracking upstream's trunk so I'm aware of the direction that their
>> changes are headed.  But there's a wide range of reasons that
>> application developers that bundle libraries don't do that:
>>
>> 1) not enough time in a day.  I'm working full-time on making my
>> application better.  Plus I have to update all these bundled libraries
>> from time to time, testing that the updates don't break anything.  I
>> don't have time to track trunk for all these libraries -- I barely have
>> time to track releases.
>
> Yes, but in that case, there is nothing you can do. Putting everything
> in one project is always easier than splitting into modules, coding
> and deployment-wise. That's just one side of the speed of development
> vs maintenance issue IMHO.
>
>> 3) This doesn't help with the fact that my bundled version of the
>> library and your bundled version of the library are being developed in
>> isolation from each other.  This needs central coordination which people
>> who believe in bundling libraries are very unlikely to pursue.
>
> As above, I think that in that case, it will happen whatever tools are
> available, so it is not a case worth pursuing.
>
I'm confused -- if it will happen whatever tools are available, how does
"good source control" solve the issue?  I'm saying that this is not an
issue that can be solved by having good source control... it's a social
issue that has to be solved by people learning to avoid bad practices.

-Toshio
