On Sun, Aug 16, 2009 at 8:17 PM, P.J. Eby <pje@telecommunity.com> wrote:
At 02:16 PM 8/16/2009 -0600, Zooko Wilcox-O'Hearn wrote:
So it appears to me that none of these techniques are both modular/testable and compatible with distutils/setuptools/Distribute.  What are we to do?

We could be modular if there was a way to specify pre-setup.py dependencies.

For a lot (although admittedly, not all) of the code in question, the only dependency is on some code that lives in the package itself and explicitly avoids depending on anything besides distutils.  That's not to say pre-setup.py dependencies wouldn't be useful, but if we could formalize making that case work (as it would if we could depend on the simplistic environment that a distutils-only 'setup.py install' has), it would go a long way towards fixing the larger problem.

Unfortunately, there isn't such a thing at the moment, short of calling setup() twice, and forcing the first call to have no script arguments, just a setup_requires argument.

So, "modular" is a slippery word.  Let me try to be a little more specific about what I personally want; Zooko can elaborate, and I'm flexible on some of it, but best to start with an ideal.

I have a development environment where sys.path is set up to point at the source code for a set of working branches.  For the purposes of this discussion let's say I've got "Nevow", which contains "nevow/__init__.py", "Twisted", which contains "twisted/__init__.py", and "Tahoe", which... well, actually it contains "src/allmydata/__init__.py" but happily my setup can deal with that.  My sys.path has ["Twisted", "Nevow", "Tahoe/src"] on the end of it.  My $PATH (or %PATH%, as the case may be) has Twisted/bin, Nevow/bin, Tahoe/bin.  I hope this convention is clear.

Now, here's the important point.  I want to run 'trial twisted', which is to say, "~/.../Twisted/bin/trial twisted", and have it load the code from my pre-existing "Twisted/" sys.path entry.  I want to load and examine the distribution metadata, which in the current context means running most of what usually goes in setup.py.  I also want to be able to run parts of the distribution process, to unit-test them, without actually invoking the entire thing.  There are lots of reasons to want this:
  1. It's much faster to skip installation, especially if you're rapidly iterating over changes to a small piece of the distribution setup process.
  2. It encourages splitting the distribution process up into smaller pieces ("modularizing" it) so that it can be re-used by other parts of the same project.
  3. It allows for independent testing of those same pieces, so that when they are re-used there is already some assurance that they will behave correctly, one that isn't specific to the installation of a particular package.
  4. By including it in the package, you allow dependencies of that package to use the packaging functionality as well, so that custom distribution stuff is done consistently across all parts of an ecosystem.

As some of Zooko's links suggest, the way I would prefer to do that is for the distribution metadata to live in a module in 'twisted/', which can be imported by setup.py as a normal Python module, and to have setup.py itself look like:

from distutils.core import setup
from twisted.python.distribution import metadata
setup(**metadata)

or even better:

from twisted.python.distribution import autosetup
autosetup()
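
To make that concrete, here is a minimal sketch of what such a module might contain.  It is not Twisted's actual code: the metadata values, build_metadata() and the setup_func parameter are all hypothetical, made up for illustration.  The point is only that the metadata is plain importable data and that autosetup() can be exercised from a unit test without running any distutils command.

```python
# Hypothetical stand-in for a module such as twisted/python/distribution.py.
# All names and values here are illustrative, not Twisted's real metadata.

metadata = {
    "name": "Example",
    "version": "0.1",
    "packages": ["example"],
}


def build_metadata(**overrides):
    """Return a copy of the metadata with any overrides applied."""
    result = dict(metadata)
    result.update(overrides)
    return result


def autosetup(setup_func=None, **overrides):
    """Call setup() with the project's metadata.

    setup_func is injectable so that tests can pass a fake and inspect
    the arguments without running any distutils command.
    """
    if setup_func is None:
        # Imported lazily so that merely importing this module never
        # requires distutils or setuptools to be present.
        try:
            from distutils.core import setup as setup_func
        except ImportError:
            from setuptools import setup as setup_func
    return setup_func(**build_metadata(**overrides))
```

A test can then call autosetup(setup_func=fake) with a fake that records its keyword arguments, which is the kind of fast, installation-free check described in the list above.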

The buildbot, as it happens, has a similar setup.  There are specific buildslaves that do a full system installation rather than just an 'svn up' before running the tests, to do whole-system integration testing for the installation procedure.  That process is much slower and more disk-intensive, it increases wear and tear on the testing machines, and it takes longer to provide feedback to developers who are sitting idle, so we don't want to have it set up that way everywhere.

Of course, that'd only work if setuptools were present, and it would also force an immediate download of the build dependencies in question.  Something like:

 try:
     from setuptools import Distribution
 except ImportError:
     pass
 else:
     Distribution(dict(setup_requires=[...]))

What goes in the "..." is pretty important.  For one thing, I don't quite understand the implications of this approach.  For another, I really don't want to depend on setuptools, because we certainly need to keep supporting non-setuptools environments.

If you want to get fancy, you could replace the "pass" with printing some user-readable instructions after attempting to see if your build-time dependencies are already present.
 
This strikes me as very non-modular.  If such a message is interesting or important, presumably it needs to be localized, displayed by installers, etc, and therefore belongs in a module somewhere.  Even if that module needs to be bundled along with your application in order to make it work :).
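
As a rough sketch of the "belongs in a module" idea, the check and the message could live in importable functions rather than inline in setup.py.  Everything here is an assumption for illustration: the function names are hypothetical, the message text is a placeholder, and importlib.util.find_spec needs a much newer Python (3.4+) than anything this thread is about.  It also only handles top-level module names.

```python
import importlib.util


def missing_build_dependencies(module_names):
    """Return the top-level module names that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def build_dependency_message(module_names):
    """Return instructions for the user, or None if nothing is missing."""
    missing = missing_build_dependencies(module_names)
    if not missing:
        return None
    # Placeholder wording; a real project would localize this.
    return ("Missing build-time dependencies: %s. "
            "Please install them before running setup.py."
            % ", ".join(missing))
```

Because the message is returned rather than printed, an installer or a test can decide how (or whether) to display it.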

Thanks for reading :).