
Hi all --
in the last couple of weeks, two people (Amos Latteier & Paul Dubois, to be specific) have asked about better programmatic access to the 'setup()' function. The two cases are sort of opposite but closely related:
* Amos wants a "dist-bot" that pulls Python module distributions off the net, unpacks them, and extracts their meta-data. Running "python setup.py --name --version ..." through a pipe is an option, but it would be much nicer to import (or execfile) setup.py and query it programmatically.
* Paul wants to run a setup script with specific arguments from Python code. Again, he could just spawn "python setup.py ...", or hack sys.argv and call 'setup()' directly. But those are both kind of gross.
In both cases, what's needed is a bit of "backdoor" access that skirts around the usual convention: setup script encodes useful information in keyword arguments, which are then passed into the guts of the Distutils by calling 'setup()'.
I think Paul's backdoor is easiest: add a special keyword argument, 'argv', to 'setup()', which overrides sys.argv. Or a possible variation on this: call it 'args', and have it override just sys.argv[1:]. Subtle distinction, but important. I'm not sure offhand which is better, but I think it would be excessive to support both.
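To make the 'argv' versus 'args' distinction concrete, here is a minimal sketch in modern Python. The two helper functions are hypothetical and are not part of the Distutils; they only show which part of sys.argv each variant would replace.

```python
import sys


def parse_with_argv(argv=None):
    """Variant 1 (hypothetical): 'argv' replaces all of sys.argv,
    so the caller must also supply the script name in argv[0]."""
    argv = sys.argv if argv is None else argv
    return {"script_name": argv[0], "script_args": list(argv[1:])}


def parse_with_args(args=None):
    """Variant 2 (hypothetical): 'args' replaces only sys.argv[1:],
    so the script name still comes from the real sys.argv[0]."""
    args = sys.argv[1:] if args is None else args
    return {"script_name": sys.argv[0], "script_args": list(args)}


# With 'argv' the caller controls the script name as well as the arguments.
print(parse_with_argv(["setup.py", "build"]))

# With 'args' the caller controls only the arguments.
print(parse_with_args(["build"]))
```

The practical difference is who owns the script name: with 'argv' the caller has to invent one, with 'args' it is inherited from whatever program is running.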
For Amos, I can see a couple of ways to do it. I think the cleanest is a global variable that tells 'setup()' when to stop. (It can't be another keyword arg, because you're just execfile'ing any old setup script: you can't get in there and add arguments to the 'setup()' call.) The default, of course, would be to do everything: create the Distribution instance, parse the config files and command line, and run all commands (modulo help options, of course). But you could stop after initialization (Distribution object with just the info from the setup script), after config-file parsing, or after command-line parsing.
Of course, there would be a function to set the global variable, so you might do this:
    distutils.core.stop_after('init')
    execfile("foo-1.0/setup.py")
But that's only half the story: you really want to get your hands on the Distribution instance that 'setup()' created. The obvious answer is another global variable, and another function to read it. Yuck. Or rather, getting yucky; not bad enough to out-and-out refuse it, but I'd like to know if anyone has a better idea before I go and implement this.
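The "global variable plus accessor functions" idea can be sketched as a toy model in modern Python. Everything here is an assumption made for illustration: ToyDistribution, the stage names, stop_after() and get_distribution() are hypothetical stand-ins, not the Distutils API, and the toy setup() only records which stages it reached.

```python
# Toy model of a setup() whose behaviour is steered by module-level state.
# All names below are hypothetical; this is not the real distutils.core.

STAGES = ("init", "config", "commandline", "run")

_stop_after = "run"        # global mode flag: the last stage setup() performs
_last_distribution = None  # global handle on the most recent distribution


class ToyDistribution:
    """Stand-in for a Distribution: keeps the keyword args and a stage log."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self.stages_done = []


def stop_after(stage):
    """Set the global flag that tells setup() when to stop."""
    global _stop_after
    if stage not in STAGES:
        raise ValueError("unknown stage: %r" % (stage,))
    _stop_after = stage


def get_distribution():
    """Return the distribution created by the most recent setup() call."""
    return _last_distribution


def setup(**attrs):
    """Walk through the stages in order, stopping after the configured one."""
    global _last_distribution
    dist = ToyDistribution(attrs)
    _last_distribution = dist
    for stage in STAGES:
        dist.stages_done.append(stage)
        if stage == _stop_after:
            break
    return dist


# The caller sets the mode, lets the "setup script" call setup(),
# and then fetches the object through the second accessor.
stop_after("init")
setup(name="foo", version="1.0")
dist = get_distribution()
print(dist.attrs["name"], dist.stages_done)  # prints: foo ['init']
```

The sketch also shows why this feels yucky: two pieces of hidden module state have to be kept in step, and any caller that forgets to reset the flag changes the behaviour of every later setup() call.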
So: any better ideas out there? Does anyone think these backdoors are useful enough to slip in, or should we make Paul and Amos live with whatever gross hacks they're using currently? (I hate gross hacks.)
Greg

My not wanting to run "python setup.py ..." in a pipe has more than just persnickety wishes for elegance behind it.
(a) I would have to write setup.py somewhere, with all the usual annoyances connected with creating a temporary file; not serious but annoying.
(b) I don't know WHICH python will get run by the above statement. In particular I might not get the one interpreting me now; again, solvable but annoyingly so. In my case this would be a VERY bad error, as I would install something into the wrong Python.
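For concern (b), one common workaround when spawning a child process is to launch it with sys.executable, the path of the interpreter that is currently running, instead of whatever "python" happens to be first on the PATH. The sketch below is written for modern Python (the subprocess module postdates this thread), and the helper name and the stand-in script are assumptions made for illustration, not a real setup script.

```python
import os
import subprocess
import sys
import tempfile


def run_script_with_same_python(script_path, args):
    """Run a script under the interpreter that is executing this code.

    Hypothetical helper: sys.executable pins the child process to the
    current Python instead of relying on a PATH lookup for "python".
    """
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script fails
    )
    return result.stdout


# Demonstration with a stand-in script that reports which interpreter
# ran it and which arguments it received.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "setup.py")
    with open(script, "w") as f:
        f.write("import sys\nprint(sys.executable)\nprint(sys.argv[1:])\n")
    output = run_script_with_same_python(script, ["--name", "--version"])

print(output)
```

This addresses only the "which Python" problem. The temporary-file annoyance in (a) remains, which is part of why an in-process interface is more attractive.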

Greg Ward wrote:
> Hi all --
> in the last couple of weeks, two people (Amos Latteier & Paul Dubois, to be specific) have asked about better programmatic access to the 'setup()' function. The two cases are sort of opposite but closely related:
> ...
> In both cases, what's needed is a bit of "backdoor" access that skirts around the usual convention: setup script encodes useful information in keyword arguments, which are then passed into the guts of the Distutils by calling 'setup()'.
I've thought about this a little. It seems to me that the basic desire is to be able to run setup from python, not the command line. I think we could accommodate this with something like this
distutils.core.run_setup(file, commands)
This would return the results of running the named setup file with the given commands. I don't really care about how you spell the file and the commands. For example, perhaps file should be an open file object rather than a path, and maybe commands is a list of strings instead of a string. Whatever ;-)
This facility seems relatively easy to understand and implement. And I believe that it accomplishes most of what Paul and I want.
The remaining problem is that most commands don't return anything, they just print information. This means that when you run a command on a setup file from python, it won't be easy to figure out whether it worked or not.
I haven't looked at the commands very much but perhaps they could be retrofitted to return something in addition to printing information. Otherwise there could be new commands that were meant to be called from python. Or failing that you could capture the stdout and try to deduce from there how things went.
In my case I'll probably write a new command called something like 'metadata' that doesn't do anything but return meta-data about the distribution.
This would allow me to do something like this:
meta_data=distutils.core.run_setup(open('foo-1.0.tgz'), 'metadata')
which is what I want.
-Amos
-- Amos Latteier mailto:amos@digicool.com Digital Creations http://www.digicool.com

On 28 August 2000, Amos Latteier said:
> I've thought about this a little. It seems to me that the basic desire is to be able to run setup from python, not the command line. I think we could accommodate this with something like this
> distutils.core.run_setup(file, commands)
Hmmm, that does look cleaner for your needs than setting a global mode variable. (Of course, it might end up setting a global mode variable behind the scenes, but never mind that. If I break 'setup()' up into its constituent parts, or farm more of it off to the Distribution class, that shouldn't be necessary.)
However, since I just went ahead and implemented 'script_name' and 'script_args' arguments to 'setup()', this is not needed for Paul's case.
There's still a fundamental problem: the interesting info is encoded into keyword arguments, and the only (clean) way to get ahold of keyword arguments is to be the function that is called. That means that somewhere along the line, the function 'distutils.core.setup()' is called with the keyword args in the setup script. To get our hands on what's passed to 'setup()', we either have to:
* write a custom 'setup()' and replace the "real" 'distutils.core.setup()' with it (yuck)
* modify the existing 'distutils.core.setup()' so that some external stimulus -- ie. something not passed in as keyword args from the setup script -- can influence its behaviour, eg. tell it to stop processing once it has "parsed" the keyword args and created a Distribution object. When I see "external stimulus" I think either "global variable" or "instance attribute"; since 'setup()' is a function, it has to be the former. Merde!
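The first option above, swapping in a replacement 'setup()' that simply records its keyword arguments, can be sketched as follows in modern Python. The module name toy_core, the helper capture_setup_kwargs() and the stand-in setup script are all hypothetical; a real setup script imports from distutils.core, and this toy deliberately avoids touching that module.

```python
import os
import runpy
import sys
import tempfile
import types


def capture_setup_kwargs(script_path, module_name="toy_core"):
    """Run a setup-style script with a fake 'setup()' that records its kwargs.

    Hypothetical sketch: a throwaway module is registered under
    'module_name' so that the script's own import finds the fake setup().
    """
    captured = {}

    def fake_setup(**attrs):
        # "Be the function that is called": just remember the keyword args.
        captured.update(attrs)

    fake_module = types.ModuleType(module_name)
    fake_module.setup = fake_setup

    saved = sys.modules.get(module_name)
    sys.modules[module_name] = fake_module
    try:
        runpy.run_path(script_path, run_name="__main__")
    finally:
        # Restore the previous state so the replacement does not leak.
        if saved is None:
            del sys.modules[module_name]
        else:
            sys.modules[module_name] = saved
    return captured


# Demonstration with a stand-in setup script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "setup.py")
    with open(script, "w") as f:
        f.write("from toy_core import setup\n"
                "setup(name='foo', version='1.0')\n")
    captured_attrs = capture_setup_kwargs(script)

print(captured_attrs)  # prints: {'name': 'foo', 'version': '1.0'}
```

Even in toy form the cost is visible: the caller has to patch global interpreter state (sys.modules) and carefully undo it afterwards, which is the kind of thing the "yuck" refers to.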
> This would return the results of running the named setup file with the given commands. I don't really care about how you spell the file and the commands. For example, perhaps file should be an open file object rather than a path, and maybe commands is a list of strings instead of a string. Whatever ;-)
What it returns depends on how far you want it to run. For you, it's enough to construct the Distribution object and return it (along with its slave DistributionMetadata object, of course). In some cases, you might want to do everything up to running the commands; other times, you actually want to run the commands. Again, this can't be specified in the keyword args in the setup script, which implies some external stimulus is needed.
> The remaining problem is that most commands don't return anything, they just print information. This means that when you run a command on a setup file from python, it won't be easy to figure out whether it worked or not.
Well, the Distutils commands are all about side-effect; what could be more side-effect-ful than dumping a bunch of stuff in /usr/lib? ;-)
However, they're not completely unresponsive: several commands (notably the build_* and install_* families) have 'get_inputs()' and/or 'get_outputs()' methods that return lists of filenames. This is a very informal, not-widely-implemented interface -- ie. it gets implemented when it's needed. They're essential to support the 'bdist_*' commands. Generally, you need to actually run a command to ask it what its outputs were, so these methods aren't quite as useful as they could be.
> In my case I'll probably write a new command called something like 'metadata' that doesn't do anything but return meta-data about the distribution.
We've been down this road before! The outcome was the DistributionMetadata class, defined in dist.py. Once you have a Distribution object, you can query its metadata using 'get_description()', 'get_fullname()', etc.
> This would allow me to do something like this:
> meta_data=distutils.core.run_setup(open('foo-1.0.tgz'), 'metadata')
> which is what I want.
I think I would spell this:
dist = distutils.core.run_setup('foo-1.0/setup.py', stop_after="init")
and then you can do this:
print "%s (version %s), provided by %s" % (dist.get_name(), dist.get_version(), dist.get_contact())
Seem reasonable?
Greg
Participants (3): Amos Latteier, Greg Ward, Paul F. Dubois