[Distutils] People want CPAN :-)

Georg Brandl g.brandl at gmx.net
Sat Nov 7 14:54:09 CET 2009

David Cournapeau wrote:
> Hi Guido,
> Guido van Rossum wrote:
>> On Fri, Nov 6, 2009 at 2:52 PM, David Lyon <david.lyon at preisshare.net> wrote:
>>> So the packages on CPAN are typically of a higher quality, simply
>>> because they've been machine checked. I like that.
>> Speaking purely on hearsay, I don't believe that. In fact, I've heard
>> plenty of laments about the complete lack of quality control on CPAN.

One thing about CPAN (and Haskell's libraries on Hackage) that I think
many people see favorably, even though it's only superficial, is the
more-or-less consistent hierarchical naming for the individual packages
(or the contained modules in Haskell).  Compared with that, the Python
package namespace looks untidy.

> I cannot speak for CPAN, as I have never used it. But CRAN (which is
> CPAN for R) works much better than PyPI today in practice. I am not sure
> what exactly makes it work better, but it has the following properties,
> both technical and more 'social':
>     - R is a niche language, and targets mostly scientists. It is a
> smaller community, more focused. They can push solutions more easily.
>     - There is an extensive doc on how to develop R extensions (you can
> download a 130-page PDF).

Note that the downloadable distutils manual has 94 pages and *should* be
enough to explain the basics of packaging.  It has to be updated, of
course, once the more advanced mechanisms are part of the core.

>     - R packages are much more constrained: there is a standard source
> organization, which makes for a more consistent experience.
>     - There are regular checks of the packages (all the packages are
> checked daily on a build farm running Fedora and Debian). There is also
> a machine that checks them on Windows.
> http://cran.r-project.org/web/checks/check_summary.html
> http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html
> I am obviously quite excited by Snakebite potential here.

Me too.  Though it would be Snakebite + serious sandboxing.

> Concerning distutils, I think it is important to improve it, but I think
> it is inherently flawed for serious and repeatable packaging. I have
> written a quite extensive article on it from my point of view as a
> numpy/scipy core developer and release manager
> (http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/),

What you're saying there about Cabal is exactly my experience.  It is very
nice to work with, and I've not yet seen a conceptual failure.

But we're not far from that, with a static metadata file.

> I won't rehearse it here, but basically:
>     - distutils is too complex for simple packages, and too inflexible
> for complex ones. Adding new features to distutils is a painful
> experience. Even autotools, with its mix of 100 000 lines of autogenerated
> shell code, Perl, and m4, is more pleasant.

Really?  I would have assumed that even writing a whole new distutils
command/build step can't be more painful than adding the equivalent to
an autotools-style build system, being Python after all.  However, I've
never done such a thing, so I have to believe you.

Coming back to Cabal, do you know how easy it is to customize its build

>     - Most simple packages could be "buildable" from purely declarative
> description. This is important IMHO because it means they are simple to
> package by OS vendors, and you can more easily automate building and
> testing.

That's what we're heading towards, I think.
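
To make "purely declarative description" concrete, here is a minimal
sketch of how a tool could read package metadata from a static file
without executing any setup code. The file layout, the section names
([metadata], [files]) and the load_metadata helper are all hypothetical,
chosen only for illustration; they are not an existing distutils format.

```python
import configparser

# Hypothetical static metadata; the section and field names are
# illustrative assumptions, not an actual distutils file format.
EXAMPLE_METADATA = """\
[metadata]
name = example-package
version = 0.1
summary = A package described without executable setup code

[files]
packages =
    example
    example.utils
"""


def load_metadata(text):
    """Parse a static description into a plain dict without running any code."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    meta = dict(parser["metadata"])
    # A multi-line value becomes a list, one entry per non-empty line.
    meta["packages"] = [
        line.strip()
        for line in parser["files"]["packages"].splitlines()
        if line.strip()
    ]
    return meta


if __name__ == "__main__":
    print(load_metadata(EXAMPLE_METADATA))
```

Because nothing in the file is executed, an OS packager or a build farm
could inspect the name, version and package list of many projects with
the same few lines of parsing code.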


Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
