[Distutils] "Python Package Management Sucks"

Phillip J. Eby pje at telecommunity.com
Thu Oct 2 06:10:01 CEST 2008


At 07:14 PM 10/1/2008 -0700, Toshio Kuratomi wrote:
>In terms of implementation I'd much rather see something less centered
>on the egg being the right way and the filesystem being a secondary
>concern.

Eggs don't have anything to do with it; in Python, it's simply common 
sense to put static resources next to the code that uses them, if you 
want to "write once, run anywhere".  And given Python's strength as 
an interactive development language with no "build" step, having to 
*install* your data files somewhere else on the system to use them 
isn't a *feature* -- not for a developer, anyway.
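
To make the "resources next to the code" convention concrete, here is a minimal sketch using only the standard library. The helper names resource_path and read_resource are hypothetical, chosen for illustration; this is not the pkg_resources API.

```python
import os


def resource_path(module_file, name):
    """Return the path of a data file stored beside the given module file."""
    # Resolve against the module's own directory rather than the current
    # working directory, so the lookup works wherever the code is unpacked.
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, name)


def read_resource(module_file, name):
    """Read a text resource that lives next to the module."""
    with open(resource_path(module_file, name), encoding="utf-8") as f:
        return f.read()
```

Inside a module, a call such as read_resource(__file__, "template.txt") would then find the file without any install step, provided the data file sits in the same directory as the module.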

And our hypothetical de-jure standard won't replace the de-facto 
standard unless it's adopted by developers...  and it won't be 
adopted if it makes their lives harder without a compensating 
benefit.  For the developer, FHS support is a cost, not a benefit, 
and only relevant to a subset of platforms, so the spec should make 
it as transparent for them as possible, if they don't have an 
interest in explicit support for it.  By the STASCTAP principle 
(Simple Things Are Simple, Complex Things Are Possible), it should be 
possible for distros to relocate, and simple for developers not to 
care about it.


>   We should have metadata that tells us where the types of
>resources come from.  When a package is installed on Linux the metadata
>could point locales at file:///usr/share/locale.  When on Windows
>egg:locale (Perhaps the uninstalled case would use this too... that
>depends on how the egg structure and metadata evolves.)
>
>A question we'd have to decide is whether this particular metadata is
>something that should be defined globally or per package.  Or globally
>with a chance for packages to override it.

I think install tools should handle it and keep it out of developers' 
hair.  We should of course distinguish configuration and other 
writable data from static data, not to mention documentation.  Any 
other file-related info is going to have to be optional, if that.  I 
don't really think it's a good idea to ask developers to fill in 
information they don't understand.  A developer who works entirely on 
Windows, for example, is not going to have a clue what to specify for 
FHS stuff, and they absolutely shouldn't have to if all they're doing 
is including some static data.
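
As a rough sketch of that division of labor: the developer only labels a file with a category, and the install tool picks the destination. Everything here is an assumption made for illustration (the LAYOUTS table, the category names, the paths, and the install_location function); no existing tool is being described.

```python
# Hypothetical layouts; a distro or install tool would supply its own.
LAYOUTS = {
    # Default: keep everything beside the package itself.
    "bundled": {
        "static": "{pkg}/data",
        "config": "{pkg}/etc",
        "docs": "{pkg}/docs",
    },
    # FHS-style relocation, chosen by the installer, not the developer.
    "fhs": {
        "static": "/usr/share/{name}",
        "config": "/etc/{name}",
        "docs": "/usr/share/doc/{name}",
    },
}


def install_location(layout, category, name, pkg_dir):
    """Return the directory where files of one category would be installed."""
    template = LAYOUTS[layout][category]
    return template.format(name=name, pkg=pkg_dir)
```

The point of the sketch is that the developer's input is the same under either layout: a category such as "static" or "config", with no FHS knowledge required.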

Even today, there are Python developers who don't use the distutils 
to distribute their packages, so anything that makes distribution even 
more difficult than it is today isn't going to be a viable standard.  The 
closer we can get in ease of use to just tarring up a directory, the 
more viable it'll be.  (That's one reason, btw, why setuptools offers 
revision control support and find_packages() for automating discovery 
of what to include.)
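
To illustrate the kind of discovery that find_packages() automates, here is a simplified stand-in written with the standard library only. It is not setuptools' implementation, and the name discover_packages is hypothetical; in a real setup script one would simply call setuptools.find_packages().

```python
import os


def discover_packages(root):
    """Return dotted names of the package directories found under root.

    A directory counts as a package only if it contains __init__.py, and
    subdirectories are searched only inside directories that are packages.
    """
    found = []

    def walk(directory, prefix):
        for entry in sorted(os.listdir(directory)):
            path = os.path.join(directory, entry)
            init_file = os.path.join(path, "__init__.py")
            if os.path.isdir(path) and os.path.isfile(init_file):
                name = prefix + entry
                found.append(name)
                # Recurse so that nested packages get dotted names.
                walk(path, name + ".")

    walk(root, "")
    return found
```

With discovery like this, the developer never maintains a list of packages by hand, which is about as close to "tar up the directory" as a setup script gets.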


> > I'd have preferred to avoid that complexity, but if the two of us can't
> > agree then there's no way on earth to get a community consensus.
> >
> > Btw, pkg_resources' concept of "metadata" would also need to be
> > relocatable, since e.g. the "EggTranslations" package uses that metadata
> > to store localizations of image resources and message catalogs.  (Other
> > uses of the metadata files also include scripts, dependencies, version
> > info, etc.)
> >
>Actually, we should decide whether we want to support that kind of thing
>within the egg metadata at all.  The other things we've been talking
>about belonging in the metadata are simple key value pairs.
>EggTranslations uses the metadata area as a data store.  (Or in your
>definition, a resource store).  This breaks with the definition of what
>metadata is.  Translations don't store information about a package, they
>store alternate views of data within the package.

I was actually somewhat incorrect in my statement about the 
distinction between pkg_resources "metadata" and "resources"; 
"metadata" is really "data that goes with the distribution, not with 
a specific package within the distribution".  Only some of this data 
is "about" the distribution; the rest is data "with" or "of" the 
distribution.  (This is a slight API wart, but the use case exists 
nonetheless.)

Meanwhile, regarding the proposed key-value pairs system, I don't see 
how that works; "extras" dependency information and entry points are 
a bit more structured than just key-value pairs; both are currently 
represented as .ini-like files with arbitrary section names.  I 
suppose you could squash those entire files into values in some sort 
of key-value system, but that seems a bit hairy to me.  In 
particular, the reason setuptools uses separate metadata files is 
that many of these things don't need to be loaded at the same 
time.  Also, PKG-INFO-style metadata can contain rather large blobs 
of text that aren't needed or useful at runtime.  Entry points and 
extras are mostly runtime metadata, with the occasional bit of build 
or install usage.
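
To show why entry points don't reduce to flat key-value pairs, here is a small sketch that reads an .ini-like entry points file with the standard library's configparser. The section names, entry names, and the parse_entry_points helper are made up for the example, and this is not how pkg_resources itself parses the file.

```python
import configparser

# Hypothetical entry points file: arbitrary section names, each holding
# its own set of name = target lines.
ENTRY_POINTS_TEXT = """\
[console_scripts]
hello = hello.cli:main

[hello.plugins]
upper = hello.plugins:upper
lower = hello.plugins:lower
"""


def parse_entry_points(text):
    """Parse entry point text into {section: {name: target}}."""
    # Only "=" separates names from targets, because targets contain ":".
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(text)
    return {s: dict(parser.items(s)) for s in parser.sections()}
```

The result is a two-level mapping (section, then name, then target), so a single flat key would have to encode both the section and the name to carry the same information.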


