[Distutils] thoughts on distutils 1 & 2

Mark W. Alexander slash at dotnetslash.net
Wed May 19 10:26:22 EDT 2004

On Tue, May 18, 2004 at 11:29:27PM +0100, has wrote:
> - Regarding standard package managers on *nix systems (which I've not 
> used): does DU intend to wrap these, or plug into them? Current 
> impression is it's the former; in which case, what's the reasoning 
> for this? I'd have thought it'd make more sense for a system's native 
> package manager to "be in charge", as it were, with DU only acting 
> under their control. BTW, I've no objection to folks using package 
> managers that want to use them; equally, I don't believe folks should 
> _have_ to use them (e.g. because runaway complexity or rampant 
> lock-in makes any other approach impossible). DU should be able to 
> scale up as far as the user wants; but it shouldn't start "high up". 
> Nothing's worse than systems that bury themselves (and their users!) 
> under the weight of their own complexity.

Distutils "wraps" native package managers in the sense that it can use
native package management tools to produce native binary packages for a
specific target. This is, however, only one possible method. Others
would include libraries (e.g., rpmlib) or some other means of producing
a file that the package manager understands. For example, a "binary
package" is usually a standard archive file (such as a cpio or ar archive)
that contains specific files the package manager uses for control
(e.g., the control member of a .deb).

The reasoning is consistent software configuration management. DU does
not require that you produce binary packages. You can just "python
setup.py install" wherever you want, or you could "python setup.py
bdist_dumb" to produce a binary-compatible tarball. By producing
platform-native binary packages, Distutils automatically "scales" as far
as the platform scales.

As far as "lock-in" goes, the platforms include their package management,
so you're no more locked into that tool than you are to the platform.
Distutils reduces that lock-in by supporting instant re-packaging
into binary packages for a different platform's manager.

> - Regarding target layout: I realise it's not DU's position to 
> _dictate_ layout to systems. What I'm thinking is the packaging 
> scheme should be 1. simple, 2. standardised, 3. self-contained, 4. a 
> good 'default' layout. Distributing every Python module/extension as 
> a Python package (aside from eliminating the confusion over what 
> "package" actually means, since the word is currently used to mean 
> two different things) would allow DU to invert the current 
> distribution format and thereby eliminate a level of folder nesting 
> (simpler) and provide a more convenient 'all-in-one' format to kick 
> around on a typical system without losing anything (self-contained). 
> Folk who are happy with that layout (e.g. would the average 
> Mac/Windows user want/need to break the package up?) need do nothing 
> more (good 'default' layout). Folk on *nix systems who want to move 
> the 'doc' folder to /usr/share/doc/ are free to do so (though copying 
> or symlinking it might be better), and, given a standardised layout, 
> a generic installer script could easily perform that operation. 
> (Ironically, I don't think it's the system's position to _dictate_ 
> layout to users either... but arguing with *nix's OCD tendencies is, 
> I suppose, ultimately fruitless; it is what it is. [Roll on 
> relational file systems...:p])

This, too, is addressed by effective metadata management. Specifying
which files are package/module files, which are doc files, which are
data files and which are config files supports bdist_* implementations
that do the right thing for the installation target. None of this
eliminates the "generic installer" concept. Both "python setup.py
install" and bdist_dumb can be considered "generic installers". I think
the only thing that would be required is, perhaps, a distutils utility
function available to modules that would find a module configuration
file so it can adjust itself at runtime. For example:

from distutils import configure

could setup appropriate path variables for config, data and doc files.
Package/module files need to go to the "right" place (site-packages or
somewhere that the user can put in PYTHONPATH). Distutils can't futz
with package/module locations without breaking the import mechanism.
(I think this is more relevant to applications that contain application
specific packages than general library packages.)

> - Regarding "integration": there are good and bad ways to integrate 
> systems. Examples of good: small, distributed, single-task components 
> linkable via unix pipes, Apple events, etc. Examples of bad: vast, 
> centralised (Soviet bloc-style), "do-it-all" frameworks full of 
> wrapper classes round every other system imaginable and more kitchen 
> sinks than you can count; inevitably end up riddled with ultra-tight 
> coupling, dependencies up the wazoo, and for all the supposed "power" 
> never quite manage to do what you want (inflexible). (e.g. See Python 
> web framework arena for examples of latter.)

This is, I think, the distinction between "applications" and library
packages. Eroaster comes to mind as an "application". On Debian,
eroaster has /usr/lib/eroaster for modules and /usr/share/eroaster for
icons. It's not likely that a python developer is going to do an import
of an eroaster module because the components are too application
specific to be of general use. It doesn't make sense to me to plop the
eroaster modules in site-packages. Applications are a tightly integrated
set of modules that must all be in place for the application to
function. General purpose library packages would be your "single-task"
components. These need to be treated differently.

> - On splitting various roles of setup.py into individual scripts 
> (/src/build.py, /doc/build.py, etc.): aim isn't directly to simplify 
> things for developer/user. It's to decouple each functional unit of 
> DU, and establish small, simple, open, "generic" interfaces between 
> them. This will make each DU component easier for DU developers to 
> work on, and easier for DU users to "plug-n-play" with. e.g. DU has a 
> good extension building component that could and should stand on its 
> own (it may even find uses outside of DU itself), and be easily 
> replaceable with alternative build components (based on makefiles, 
> scons, etc.); this "ext_build" component would simply become "one of 
> the boys" - a peer to all the others, rather than their master (with 
> all the additional responsibility and complexity that involves). 
> Create "modest", not "Macbeth", code.

Separating into individual scripts isn't any easier or more intuitive
than providing subclass-able objects. Distutils commands and processes
are subclass-able and customizable now, although (last time I looked)
it's not documented well enough for a non-Distutils hacker to leverage.
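As a sketch of that subclassing, an existing command class can be
extended and swapped in through the `cmdclass` argument to `setup()`;
the package name and the pre-build hook here are illustrative:

```python
from distutils.core import setup
from distutils.command.build_py import build_py

# Subclass an existing command to customize one step of the process.
# Any distutils command can be extended the same way.
class my_build_py(build_py):
    def run(self):
        self.announce("running custom pre-build step")
        build_py.run(self)

setup(
    name="example",
    version="0.1",
    py_modules=["example"],
    # Tell distutils to use the subclass in place of the stock command.
    cmdclass={"build_py": my_build_py},
)
```

Everything the stock command does (option handling, dependency on other
commands) is inherited, which is the "plug-n-play" being asked for.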

> - On setup.py providing a very useful indication that "this is a 
> DU-installable package": good point; noted. Any system where setup.py 
> wasn't ubiquitous would want to provide an equivalent "user-readable 
> flag", as it were. (Standardised file+folder structure, presence of 
> standard metadata and readme files, etc.)

Personally, I find nothing more annoying than a source package that
includes ./configure that is not the autoconf-style ./configure that it
appears to be at a glance. I think it's important that setup.py remain
not only as an indicator that the package is Distutils "enabled" but
that it is Distutils "consistent."

> - Regarding manifests: stupid, brain-dead, error-prone, 
> anal-retentive, make-work garbage. These _should_ be eliminated. This 
> will do two things: 1. allow a very simple, sensible "default" 
> package building behaviour to be instituted (i.e. zip the entire 
> folder by default); 2. allow for more intelligent customisation of 
> package building behaviour by DU users, who should be able to give 
> "smart" instructions like "build package including every item whose 
> extension is not in ["pyc", "so"]", instead of having to provide and 
> tediously maintain a "dumb" list of filenames.

I've done several packages and never once manually created a manifest.
Doing a setup.py for someone else's package, though, it did
"intelligently" exclude required .xml files. I agree that, absent a
manifest file, everything in the source tree should be transferred to
the build/install tree.
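For what it's worth, sdist's template mechanism can already express
rules of the kind quoted above in a MANIFEST.in file (the `src`
directory below is an illustrative assumption):

```
# Include every .xml file under the source directory
recursive-include src *.xml
# Exclude compiled artifacts regardless of location
global-exclude *.pyc *.so
```

These are pattern directives, not a dumb list of filenames, so they
don't need tedious maintenance as files come and go.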

> - On metadata formats: don't really care what format is used as long 
> as it is simple, human-readable and -writeable, easily 
> machine-parseable, and sufficiently powerful to represent package 
> metadata. (e.g. Not sure if standard Mime-like format used for 
> PKG-INFO is up to the task: can it do nested structures and lists? 
> Format I suggested can do this, and is simple enough that its 
> 'non-standardness' should not present any problems for adoption.) Oh, 
> and a standard meta.txt file in every package means PKG-INFO can also 
> be gotten rid of (it's nowt but a weak, clumsy bit of duplicated data 
> anyway).

What would be the purpose of nesting? If it's supporting
related-but-independent sub-packages, I'm for it. I agree that the
metadata format is irrelevant as long as it meets the criteria you note.

> - On PEP262 elimination: again, if users want to create and maintain 
> their own database of installed packages using existing package 
> manager tools then that's their choice. What's important is that this 
> should not be the "default" arrangement, for reasons I've already 
> given (synchronisation issues, etc.). By putting metadata into every 
> Python package in the filesystem and building an API for retrieving 
> that data directly from those packages, you have a solid 
> "lowest-common-denominator" foundation that can also meet the 
> majority of users' needs for no additional effort. For folk who like 
> to maintain a separate DB for efficiency (e.g. on NFS), it also 
> provides them the means to easily build that DB given an existing 
> installation, not to mention an easy way to rebuild it after it goes 
> blooey!

Sounds like including setup.cfg in the installation tree would
accomplish what you describe.
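A sketch of how a separate database could be rebuilt from such in-tree
metadata. The file name `setup.cfg` and the walk below are assumptions
for illustration, not an existing distutils facility:

```python
import os

# Walk an installation tree and collect per-package metadata files
# (assumed here to be an installed setup.cfg, per the suggestion above).
# Returns a mapping of package directory name -> metadata file path,
# which is all a rebuilt package database needs to start from.
def collect_metadata(root):
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if "setup.cfg" in filenames:
            pkg = os.path.basename(dirpath)
            found[pkg] = os.path.join(dirpath, "setup.cfg")
    return found
```

Pointing this at site-packages would give the "rebuild it after it goes
blooey" path for free.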

> p.s. If anyone can point me to good examples of DU-based binary 
> distributions, I'd like to take a look to help my understanding, 
> thanks.

Unfortunately, I don't have authorization to release any of mine;
however, in the current architecture _every_ distutils-based distribution
_is_ binary capable. For everything I've needed, I've been able to
produce a binary distribution for Solaris* with:

    python setup.py bdist_pkgtool \ 
        [--pkg-abrev="pkgtool's stupid 8 char max name"]

For HP-UX:
    python setup.py bdist_sdux

For rpm:
    python setup.py bdist_rpm

For Debian (cheating ;):
    python setup.py bdist_rpm
    alien --to-deb [resulting.rpm]

Whether the package author _intended_ for the packages to be binary
installable or not has made _no_difference_at_all_. THIS IS THE POWER OF
DISTUTILS: The package author doesn't need to know or care what the
target platform will be, what the binary package format is, or how to
produce them. They don't even have to know what native package managers
exist.  If there's a setup.py, Distutils "just works" to produce a
native binary package. So for an example of a "DU-based binary
distribution", see _any_ DU-based package.

*_still_ waiting management approval to release bdist_pkgtool and
Mark W. Alexander
slash at dotnetslash.net

The contents of this message authored by Mark W. Alexander are
released under the Creative Commons Attribution-NonCommercial license.
Copyright of quoted materials are retained by the original author(s).
