[Distutils] Library instability on PyPI and impact on OpenStack

Thu Feb 28 16:39:41 CET 2013

Hey,

(I sent this to the wrong list and was directed here by Nick. I wasn't
aware of the very promising-sounding PEP 426 and haven't read it yet, so
apologies on that. Just wanted to resend my mail here ASAP to prevent
the discussion happening on the wrong list. Thanks!)

Generally speaking, when a project has a large list of dependencies on
libraries outside of its control, it can take one of two approaches to
those dependencies:

 1) specify the minimum required version of each library and assume new
    releases of all your dependencies will be backwards compatible with 
    previous versions. You feel safe that if an incompatible version of 
    the library is released, it will be a completely new stream and you 
    can adopt the new stream at your leisure.

    I'm much more familiar with C libraries than with libraries in any
    other language. Somehow, C library maintainers seem to understand
    the need for this approach and so you've got mechanisms like
    libtool-managed sonames and parallel-installable libraries like gtk,
    gtk2, gtk3.

 2) specify exactly what version of each library to use, because you 
    assume all of your dependencies are constantly changing their APIs 
    and breaking your app.

    This is what you see in the Ruby (bundler) and Java (maven) worlds. 
    For distribution packagers, it presents horrific problems - you
    have to accept that you're going to be packaging (potentially many)
    different versions of the same library and that any time a security
    issue comes up you need to patch each version.
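
To make the difference between the two approaches concrete, here is a
minimal sketch in Python. It is a simplified model only, not how pip or
any other installer actually resolves versions. The library name
"somelib", the version numbers and the helper function names are all
hypothetical and chosen purely for illustration.

```python
# Approach 1 would be written in a requirements file as:  somelib>=1.2.0
# Approach 2 would be written in a requirements file as:  somelib==1.2.0
# ("somelib" and the versions are made up for this sketch.)


def parse_version(text):
    """Turn a dotted version such as '1.10.0' into a tuple of ints.

    Simplification: only plain numeric versions are handled, and
    '1.2' and '1.2.0' are treated as different versions.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Approach 1: any release at or above the minimum is accepted."""
    return parse_version(installed) >= parse_version(minimum)


def satisfies_pin(installed, pinned):
    """Approach 2: only the exact pinned release is accepted."""
    return parse_version(installed) == parse_version(pinned)


if __name__ == "__main__":
    for installed in ("1.1.9", "1.2.0", "1.10.0"):
        print(
            installed,
            "minimum ok:", satisfies_minimum(installed, "1.2.0"),
            "pin ok:", satisfies_pin(installed, "1.2.0"),
        )
```

Under the first approach a new upstream release is picked up
automatically, which is only safe if that release really is backwards
compatible. Under the second approach nothing changes until the pin is
edited, which is why every consumer can end up pinned to a different
version of the same library.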

Personally, I tend to pour scorn on Ruby and Java folks for the chaotic
nature that pushes app developers down this "I need to control the whole
stack of dependencies because they're so unstable" path.

OTOH, I hear the Perl community are pretty good about taking the first
approach.

I always felt that the Python community tended more towards the former
approach, but there are always exceptions to the rule - to unfairly pick
on one project, sqlalchemy seems to have an API that often changes
incompatibly.

However, OpenStack is starting to get burned more often and some are
advocating taking the second approach to managing our dependencies:

  http://lists.openstack.org/pipermail/openstack-dev/2013-February/thread.html#6014
  http://lists.openstack.org/pipermail/openstack-dev/2013-February/thread.html#6041
  http://lists.openstack.org/pipermail/openstack-dev/2012-November/002075.html

It's probably not worthwhile for everyone to try and read the nuances of
those threads. The tl;dr is we're hurting and hurting bad. Is this a
problem the OpenStack and Python communities want to solve together? Or
does the Python community fundamentally see itself as taking the
same approach as the Ruby and Java communities?

Maybe it sounds like I'm trolling, but this is an honest question. What
I'd really like to hear back is "please, please OpenStack keep using the
first approach and we can work through the issues you're seeing and
together make Python better". We can totally do that IMHO.

If people want an example of the kind of things we need to work
on:

  http://lists.openstack.org/pipermail/openstack-dev/2013-February/006048.html

Thanks,
Mark.