[Distutils] thoughts on distutils 1 & 2

Bob Ippolito bob at redivi.com
Fri May 14 11:15:02 EDT 2004


On May 14, 2004, at 10:16 AM, has wrote:

> -- e.g. c.f. Typical OS X application installation procedure (mount 
> disk image and copy single application package to Applications folder; 
> no special tools/actions required) versus typical Windows installation 
> procedure (run InstallShield to put lots of bits into various 
> locations, update Registry, etc.) or typical Unix installation 
> procedure (build everything from source, then move into location). 
> Avoiding overreliance on rigid semi-complex procedures will allow DU2 
> to scale down very well and provide more flexibility in how it scales 
> up.

The problem with this is that Python packages/modules need to go in a 
_specific_ location, whereas an app can go anywhere.  An installer is 
more appropriate.  Whether that installer is the standard 
Installer.app or some droplet doesn't particularly matter.  A droplet 
may be more appropriate because you could have one for each Python 
installation, whereas Installer.app is difficult to wrangle into doing 
anything like that.
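
To make that concrete, here's a rough sketch of what such a per-Python 
droplet boils down to (my own illustration, not an existing tool; the 
helper name is invented).  The point is that the target directory is 
specific to one Python installation:

    from distutils.sysconfig import get_python_lib
    import os, shutil

    def install_package(package_dir):
        # site-packages for *this* Python, e.g. .../lib/python2.3/site-packages
        target = get_python_lib()
        dest = os.path.join(target, os.path.basename(package_dir))
        shutil.copytree(package_dir, dest)
        return dest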

> - Every Python module should be distributed, managed and used as a 
> single folder containing ALL resources relating to that module: 
> sub-modules, extensions, documentation (bundled, generated, etc.), 
> tests, examples, etc. (Note: this can be done without affecting 
> backwards-compatibility, which is important.) Similar idea to OS X's 
> package scheme, where all resources for [e.g.] an application are 
> bundled in a single folder, but less formal (no need to hide package 
> contents from user).

People can and usually do this to some extent.  Documentation, 
examples, and scripts often go elsewhere because there is no really 
good reason to put them inside the code.  Data files should in some 
cases be decoupled, or at least optionally decoupled, because data 
files typically can't be used in-zip (when using the zip import hook).
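
For example (my own illustration, with a made-up data file name), the 
common __file__-relative idiom breaks as soon as the package is 
imported from a zip archive:

    import os

    def load_template():
        here = os.path.dirname(__file__)           # inside library.zip when zip-imported
        path = os.path.join(here, 'template.txt')  # hypothetical data file
        return open(path).read()                   # fails: not a real filesystem path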

> - Question: is there any reason why modules should not be installable 
> via simple drag-n-drop (GUI) or mv (CLI)? A standard policy of "the 
> package IS the module" (see above) would allow a good chunk of both 
> existing and proposed DU "features" to be gotten rid of completely 
> without any loss of "functionality", greatly simplifying both build 
> and install procedures.

They are installable exactly like this (if the user wants to and knows 
where it's supposed to go) except when software needs to be built.

> - Installation and compilation should be separate procedures. Python 
> already compiles .py files to .pyc on demand; is there any reason why 
> .c/.so files couldn't be treated the same? Have a standard 'src' 
> folder containing source files, and have Python's module mechanism 
> look in/for that as part of its search operation when looking for a 
> missing module; c.f. Python's automatic rebuilding of .pyc files from 
> .py files when former isn't found. (Q. How would this folder's 
> contents need to be represented to Python?)

This is a bad idea.  .pyc files should be precompiled, because the 
user of the .py files often doesn't have write access to create .pyc 
files in the same directory.  Building .so files automatically is also 
an intractable problem; you obviously don't do much C programming ;)
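
(On the .pyc point: precompiling at install time is trivial, which is 
one more reason not to defer it to first import.  Just a sketch, with 
a made-up example path; a normal "setup.py install" already does this 
for you:)

    import compileall

    # byte-compile everything under the installed package so users without
    # write access to site-packages still get .pyc files
    compileall.compile_dir('/usr/lib/python2.3/site-packages/mypkg')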

> -- Most packages should not require a setup.py script to install. 
> Users can, of course, employ their own generic shell script/executable 
> to [e.g.] unzip downloaded packages and mv them to their site-packages 
> folder.

I find it easier if all modules and packages use a setup.py.  No 
special cases.
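
Even for a pure Python package the setup.py stays tiny, something like 
this (names are placeholders):

    from distutils.core import setup

    setup(
        name='mypackage',
        version='0.1',
        packages=['mypackage'],
    )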

> -- Extensions distributed as source will presumably require some kind 
> of setup script in 'src' folder. Would this need to be a dedicated 
> Python script or would something like a standard makefile be 
> sufficient?

Makefiles are no good.
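
A setup.py for an extension is hardly any longer, and distutils 
already knows the right compiler, flags and install layout for the 
particular Python it runs under, which a makefile would have to 
duplicate.  Rough sketch (module and file names made up):

    from distutils.core import setup, Extension

    setup(
        name='myext',
        version='0.1',
        ext_modules=[Extension('myext', sources=['myext.c'])],
    )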

> -- Build operations should be handled by separate dedicated scripts 
> when necessary. Most packages should only require a generic shell 
> script/executable to zip up package folder and its entire contents 
> (minus .pyc and, optionally, .so files).

What build operations are you talking about?  Source distribution?  
python setup.py sdist requires at most a very simple MANIFEST.in 
template that describes (by shell globbing) which files should be 
included or excluded.
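
A typical template is only a few lines, for example (file names are 
hypothetical):

    include README.txt
    recursive-include docs *.txt
    recursive-include examples *.py
    global-exclude *.pyc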

> - Remove metadata from setup.py and modules. All metadata should 
> appear in a single location: meta.txt file included in every package 
> folder. Use a single metadata scheme in simple structured nested 
> machine-readable plaintext format (modified Trove); example:

MIME-ish is the Python standard and is what PyPI uses:

    Name: Foo
    Version: Bar
    ...
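
That format is also trivial to consume with the standard library.  A 
sketch, assuming a PKG-INFO-style file like the one "setup.py sdist" 
writes into a source distribution:

    import rfc822

    f = open('PKG-INFO')
    msg = rfc822.Message(f)
    name, version = msg['Name'], msg['Version']
    f.close()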

> - Improve version control. Junk current "operators" scheme (=, <, >, 
> >=, <=) as both unnecessarily complex and inadequate (i.e. stating 
> module X requires module Y (>= 1.0) is useless in practice as it's 
> impossible to predict _future_ compatibility). Metadata should support 
> 'Backwards Compatibility' (optional) value indicating earliest version 
> of the module that current version is backwards-compatible with. 
> Dependencies list should declare name and version of each required 
> package (specifically, the version used as package was developed and 
> released). Version control system can then use both values to 
> determine compatibility. Example: if module X is at v1.0 and is 
> backwards-compatible to v0.5, then if module Y lists module X v0.8 as 
> a dependency then X 1.0 will be deemed acceptable, whereas if module Z 
> lists X 0.4.5 as a dependency then X 1.0 will be deemed unacceptable 
> and system should start looking for an older version of X.

If you change the API to the point where it's not compatible anymore, 
you should change the name of the module.  The new distutils 
dependencies stuff only does >=; I don't know where you got this 
"operators" idea.  Maybe from Packman?  Packman has absolutely NOTHING 
to do with distutils.  Packman verifies versions with arbitrary Python 
code because it's not feasible to have everyone immediately adopt some 
standard versioning scheme just to support OS X users.
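
For what it's worth, the check the proposal implies is easy enough to 
sketch with distutils' own version classes (this is my reading of it, 
not anything distutils does today; the function is invented):

    from distutils.version import StrictVersion

    def satisfies(installed, compatible_back_to, required):
        # acceptable if the required version falls inside the range the
        # installed module claims to be compatible with
        return (StrictVersion(compatible_back_to) <= StrictVersion(required)
                <= StrictVersion(installed))

    satisfies('1.0', '0.5', '0.8')    # True:  X 1.0 is fine for Y
    satisfies('1.0', '0.5', '0.4.5')  # False: Z needs an older X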

> - Make it easier to have multiple installed versions of a module. 
> Ideally this would require including both name and version in each 
> module name so that multiple modules may coexist in same site-packages 
> folder. Note that this naming scheme would require alterations to 
> Python's module import mechanism and would not be directly compatible 
> with older Python versions (users could still use modules with older 
> Pythons, but would need to strip version from module name when 
> installing).

This is much much much easier said than done.  Easy solution: when the 
API changes, rename your package.  That's never done in practice, 
though.

> - Reject PEP 262 (installed packages database). Complex, fragile, 
> duplication of information, single point of failure reminiscent of 
> Windows Registry. Exploit the filesystem instead - any info a separate 
> db system would provide should already be available from each module's 
> metadata.

At least one of these things needs to happen:
(a) all Python packages and modules are refactored such that their 
metadata can be acquired without side effects, or
(b) a central database of this information is established.

If (a) happens, then Python's module search path ends up being a 
really, really slow and clunky central installed-packages database.  
That is usually good enough, unless you're using NFS or something, in 
which case installing a new package or auditing installed packages can 
take minutes or hours.  This means that (b) should happen no matter 
what, even if it is just a cache of the information acquired from a 
full run of (a).
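
To give an idea of what (a) alone amounts to (a very rough sketch, 
names invented): you end up walking sys.path every time you want to 
know what's installed, which is exactly the slow part that (b) would 
cache:

    import sys, os

    def scan_installed():
        found = {}
        for entry in sys.path:
            if not os.path.isdir(entry):
                continue                  # skip zip files etc. for brevity
            for name in os.listdir(entry):
                full = os.path.join(entry, name)
                if name.endswith('.py'):
                    found[name[:-3]] = full
                elif os.path.isfile(os.path.join(full, '__init__.py')):
                    found[name] = full
        return found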

-bob
