[Distutils] EasyInstall --> distutils command + setuptools enhancements?

Phillip J. Eby pje at telecommunity.com
Wed Jun 8 17:56:35 CEST 2005

As I move towards implementing features for EasyInstall like 
distutils-style configuration files, and refactoring the 'main()' function 
for reusability and extension by other packages, I've begun to notice 
something: I'm basically reinventing a distutils Command object.

Sure, if I subclass Command, I'm stuck with a crummy 'log', and an ancient 
ancestor of 'optparse' (i.e. 'fancy_getopt'), but I get config file parsing 
for free, I get a known structure for arranging things, and I can trivially 
get access to the work done by other distutils commands, like finding out 
where the 'install' command would've installed libraries and scripts.

Not only that, but if I wanted to implement the commands other people have 
been asking for, like to list installed packages, uninstall packages, 
search PyPI, etc., I could just add them to the same harness.  What's more, 
people could add third-party commands using the --command-packages option 
under 2.4 (and I could perhaps backport support into setuptools so that it 
would work with 2.3 as well, at least for EasyInstall).

My main concern with this approach is lengthening the command line, because 
now you'd have to specify a command, resulting in things like this:

     python -m easy_install get SQLObject

The current name seems to clash with the idea of having different commands, 
too.  It almost seems like it should be more like:

     python -m package get SQLObject
     python -m package list
     python -m package delete mechanize
     python -m package check FooBar[someopt,otheropt]
     python -m package download somepkg
     python -m package extract otherpkg

In addition to providing a bit more flexibility, this would let us break up 
the somewhat monolithic structure of the current installer code, so that 
you could use individual pieces as shown, like extracting a package to a 
directory without building or installing it.  Not only would that make 
these commands available to the user, but it would also make them available 
to packages that use setuptools in their setup.py, such that a package 
could e.g. download and extract the source of some C library that it 
depends on, if it's not installed on the user's system.  The 'download' 
command could delegate to 'download_url', 'download_file', and 
'download_req' subcommands, corresponding to the current 
features.  'extract' would delegate to 'download', and so on.  There'd 
probably be a 'setup' command, too, to run a setup script under sandboxing, 

So what are the downsides to this, apart from needing to turn lots of 
little methods into entire Command subclasses?  Well, I imagine it makes it 
harder for somebody to build-in the commands to non-distutils programs, 
unless there's an easy way to create and run the commands, like if I added 
a class method like 'invoke(**args)' to a base class, so that you could say 
stuff like 'package.download_url.invoke(url="whatever")' in order to call 
'em from inside the program.  I've already added something similar to the 
'bdist_egg' command called 'call_command(name,**args)', so that you can 
more easily invoke other commands as subroutines.
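
A minimal sketch of what such an 'invoke()' class method might look like. 
It assumes a simplified stand-in base class, not the real distutils 
Command (which needs a Distribution object to be constructed), and the 
'download_url' options shown ('url', 'tmpdir') are hypothetical:

```python
class Command:
    """Stand-in for a distutils-style command (hypothetical base class)."""

    def __init__(self):
        self.initialize_options()

    def initialize_options(self):
        """Set every option to its default value."""

    def finalize_options(self):
        """Validate options and fill in derived values."""

    def run(self):
        raise NotImplementedError

    @classmethod
    def invoke(cls, **args):
        """Create the command, set options from keywords, and run it."""
        cmd = cls()
        for name, value in args.items():
            # Only accept names that initialize_options() declared.
            if not hasattr(cmd, name):
                raise TypeError("%s has no option %r" % (cls.__name__, name))
            setattr(cmd, name, value)
        cmd.finalize_options()
        return cmd.run()


class download_url(Command):
    """Hypothetical subcommand; option names are assumptions."""

    def initialize_options(self):
        self.url = None
        self.tmpdir = None

    def finalize_options(self):
        if self.url is None:
            raise ValueError("url is required")
        if self.tmpdir is None:
            self.tmpdir = "."

    def run(self):
        # A real command would fetch the URL; this sketch only computes
        # the path the download would be saved to.
        filename = self.url.rstrip("/").split("/")[-1]
        return "%s/%s" % (self.tmpdir, filename)
```

With that in place, a non-distutils program could call something like 
'download_url.invoke(url="http://example.com/pkg-1.0.tar.gz")' without 
ever touching a command line or a setup script.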

Features like those could well be useful for the distutils as well, so 
perhaps 'setuptools.Command' should include them.  Right now, setuptools 
exports the same Command class as the distutils, but I could easily change 
that, so that Commands can be easily created and invoked from non-distutils 
programs.

The other downside that needs mitigation is the logging crud.  Distutils is 
spattered with write() calls to sys.stderr and sys.stdout, and it 
inconsistently uses its own 'log.' calls (e.g., it has messages beginning 
with the word "Warning" that nonetheless use 'log.info()' as the 
level!).  It has stuff that depends on a distutils-specific verbosity 
level, and so on.

All of this can and should be fixed in Python 2.5, but EasyInstall needs it 
now, so setuptools will probably do some monkeypatching here and there to 
get 'distutils.log' rerouted to 'logging.getLogger("distutils")', and 
perhaps hack things in such a way that the distutils verbose/quiet stuff is 
only used to change the log level, and the level is restored when commands 
are finished.  I'm not entirely sure how feasible this is.  Some of the 
sys.stderr writes are also not monkeypatchable without replacing giant 
routines, and there are close to 70 print statements spread across 16 
modules.  Even just going through to verify which ones are debug prints is 
a PITA.  However, if someone were trying to integrate these commands into 
some type of GUI application, it might cause occasional problems unless 
stdout and stderr were replaced or redirected somehow.
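
A rough sketch of the rerouting idea, using only the standard 'logging' 
module. The function names here ('log', 'level_for_verbosity', 
'command_verbosity') are hypothetical, and the actual monkeypatching of 
'distutils.log' is not shown; the numeric thresholds and the verbosity 
mapping follow the ones distutils.log uses (DEBUG=1 through FATAL=5):

```python
import logging
from contextlib import contextmanager

# distutils.log thresholds mapped onto standard logging levels.
DISTUTILS_TO_LOGGING = {
    1: logging.DEBUG,     # distutils.log.DEBUG
    2: logging.INFO,      # distutils.log.INFO
    3: logging.WARNING,   # distutils.log.WARN
    4: logging.ERROR,     # distutils.log.ERROR
    5: logging.CRITICAL,  # distutils.log.FATAL
}

logger = logging.getLogger("distutils")


def log(level, msg, *args):
    """Forward a distutils-style log call to the 'distutils' logger."""
    logger.log(DISTUTILS_TO_LOGGING[level], msg, *args)


def level_for_verbosity(verbose):
    """Translate a distutils verbosity count into a logging level."""
    if verbose <= 0:
        return logging.WARNING
    if verbose == 1:
        return logging.INFO
    return logging.DEBUG


@contextmanager
def command_verbosity(verbose):
    """Set the log level while a command runs, then restore the old one."""
    old_level = logger.level
    logger.setLevel(level_for_verbosity(verbose))
    try:
        yield logger
    finally:
        logger.setLevel(old_level)
```

A GUI application could then attach its own handler to the "distutils" 
logger and wrap each command run in 'with command_verbosity(1): ...'.  This 
does nothing for the direct sys.stdout/sys.stderr writes and print 
statements, which would still need to be redirected separately.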

Still, the bulk of the logging issues can probably be safely handled in a 
couple of monkeypatches to Command, CCompiler, and text_file.TextFile; most 
of the rest occur under pretty obscure circumstances.  So, I think an 80% 
solution is quite doable.

I don't really want to tangle with the fancy_getopt vs. optparse 
distinction, though.  Or any of the other places where a utility module in 
the distutils has been left behind by a newfangled version in the standard 
library.  :(

I mean, on one level it's always tempting to try to write "Distutils 2", 
but at this point there's so much accumulated knowledge in distutils that I 
don't think it's reasonable to expect a replacement to ever be doable in a 
single rewrite.  Refactoring the existing stuff is the only way to go.

Anyway, now I'm digressing.  Thoughts, anyone?
