[Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals

zooko zooko at zooko.com
Sun Oct 5 18:04:20 CEST 2008

Thanks for the synthesis, Tarek.  I have some experience using  
current Python packaging in the field -- the Tahoe project [1] -- and  
so I would like to throw in what I know of what is currently working  
and what is currently needed and what isn't a big deal to me.

This doesn't, of course, mean that other people might not value
things that I don't, but at least the following opinions of mine are
won from hard experience.

On Oct 1, 2008, at 6:10 AM, Tarek Ziadé wrote:

> 1/ the dependencies of a package are not expressed in the Require
> metadata of the package most of the time.

+2 -- This is the biggest problem.  The dependencies are not  
expressed *anywhere* in the metadata of the package most of the  
time.  We need a de jure and de facto way to express dependencies so  
that developers will actually write them down.

>    Furthermore, developers tend to use setuptools "install_requires"
> and "tests_require" arguments to express dependencies.
>    So basically, you have to run an egg_info to get them, because the
> info files are generated by commands.

+0 -- I can see how this could be done better, but it isn't a
pressing problem for us.  The current mechanism to get that dependency
information at build/develop/install time works okay.

> 2/ the existence of PyPI had a side-effect: people tend to push the
> entire doc of the package in one single field (long_description)
>    to display them at PyPI. The documentation of the package is not
> clearly pointed to others.

+0 -- I would like more structured docs because then I could patch  
stdeb [2] to put docs into /usr/share/docs/$PACKAGE on Debian.  But  
it isn't a pressing problem for us (we currently kludge around that  

> 3/ the metadata infos cannot be expressed in a static file by the
> developer, because sometimes they are calculated by code.
>     while this is very permissive, that is how it works, but they are
> tied to argument passing to setup().

+0 -- I totally agree that a static, separate, declarative file  
containing just data and no code would be a nicer way to do this.   
But the current way is working for us.

> 4/ PyPI lacks direct information about dependencies.

+? -- I don't know.  It sounds like it would be a big improvement,  
but the current mechanism of discovering dependencies by downloading  
distributions and executing their setup.py's seems to be working.

> 5/ ideally, there should be one and only one version of a given package
> in an OS-based installation

-1 -- This is the strong preference of the folks who package software  
for OSes -- Debian, Fedora, etc. -- but it is not necessarily the  
choice of the users who use their OSes.  It is best for the Python  
packaging standards to be agnostic towards this, or at least to  
support both this desideratum and its opposite.

> 6/ packagers would like to have more compatibility information to work
> out on security upgrades or version conflicts
> 7/ developers should be able to have more options when they define
> version dependencies in their packages, things like:
>       A depends on B>=1.2 and B<=2.0  but with a preference to B 1.4
> or "avoid B 1.7"
>    they give tips to packagers !

+0 -- If we try to do better than Debian and Fedora already do then  
this risks being a science project -- i.e. something that will take a  
few years and might or might not pan out.  If we try to just ape them  
and learn from their decade's worth of mistakes then this is probably  

>   The developer dependencies infos is a tip and a help for a packager,
> not an enforcement. see [7]

+1 -- In around 95% of the cases that I've seen, the developer's  
dependencies info was good enough.  But, people have to be able to do  
something about the other 5%, so they have to be able to override  
developer-provided dependency information with their own.  Obviously  
they can do this by patching or runtime-patching or maintaining their  
own branch, but we should specify a standard, principled way to do it  

>  11/ people should always upload the sdist version at PyPI,  they
> don't do it always. otherwise it is a pain for packagers.

+1 -- sdist format should be encouraged.

> 1/ let's change the Python Metadata , in order to introduce a better
> dependency system, by
>  - officially introduce "install requires" and "test requires"
> metadata in there
>  - mark "requires" as deprecated


> 2/ Let's move part of setuptools code in distutils, to respect  
> those changes.


> 3/ let's create a simple convention : the metadata should be expressed
> in a python module called 'pkginfo.py'
>    where each metadata is a variable.
>    that can be used by setup.py and therefore by any tool that works
> with it, even if it does not run
>    a setup.py command.
>    This is simpler, this is cleaner. you don't have to run some setup
> magic to read them.
>    at least some magic introduced by commands

Uh...  I thought the idea was to *not* have arbitrary Python code
executed in this part?  How about a flat file that people can
reliably parse with, say, "grep", to learn about metadata?
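
To illustrate the flat-file idea, here is a minimal sketch, assuming a
hypothetical format of one "Key: value" pair per line.  The field names
used here (Name, Version, Install-Requires) and the function name are
assumptions for illustration only, not an agreed format or an existing
tool.  The point is that such a file can be read without executing any
code from the package:

```python
def parse_flat_metadata(text):
    """Parse 'Key: value' lines into a dict mapping each key to a list
    of values (a key such as a dependency field may repeat)."""
    metadata = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line; ignore it
        metadata.setdefault(key.strip(), []).append(value.strip())
    return metadata


# Hypothetical file contents; the field names are illustrative only.
EXAMPLE = """\
Name: example-package
Version: 1.0
Install-Requires: foo >= 1.2
Install-Requires: bar
"""

info = parse_flat_metadata(EXAMPLE)
print(info["Install-Requires"])
```

Because every entry sits on its own line, the same information is also
reachable with a plain text search for the field name, which is the
property asked for above.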

>     - a binary distribution cannot be uploaded if a source distribution
> has not been previously provided for the version
>     - the requires-python needs to be present: come on, you know what
> python versions your package works with !


>    - we should be able to download the metadata of a package without
> downloading the package
>    - PyPI should display the install and test dependencies in the UI
>    - The XML-RPC should provide this new metadata as well.
>     - a commenting system should allow developers and packagers to
> give more infos on a package at PyPI
>      to make the work easier




[1] http://allmydata.org/trac/tahoe
[2] http://stdeb.python-hosting.com/

http://allmydata.org -- Tahoe, the Least-Authority Filesystem
http://allmydata.com -- back up all your files for $5/month
