[Distutils-sig] distutils charter and interscript
On Tue, Dec 01, 1998 at 09:22:15AM +1000, John Skaller wrote:
Interscript is designed to do all this, except that it has a much wider scope: it isn't limited to building Python, and it includes testing and documentation. (And a few other things -- some of which are implemented already, and some of which are not [such as version control].)
There's a generic description of requirements in the documentation at
Cool! I'm looking through the Interscript docs now. I have long thought that Python and Perl would both be eminently suitable for literate programming because of their nifty embedded documentation features. I've never really figured out how you would resolve the conflict between the various target audiences implicit in the conventional ways those embedded documentation standards are used.

For instance, in the Perl world, POD documentation is generally targeted at the user of the module, and the better PODs provide examples and plenty of explanatory text in addition to the nuts and bolts of "here are the subroutines/methods provided, and here are the parameters they take". My impression of the use of docstrings in the Python world is that because they wind up in the runtime VM code, people tend to make them a lot terser, and only give nuts 'n bolts descriptions of modules, classes, and subroutines. Thus building a useful document for most modules simply by gluing docstrings together would be a dubious prospect. But still, Python docstrings are targeted at the module's users.

The third target audience, and a much smaller one, is the people who really want to understand the implementation. It has always been my impression that this was the goal of literate programming: to provide explanations of the data structures and algorithms embodied in the code as a high-tech replacement for poorly-maintained or non-existent comments. The "complete documentation" approach of POD, or the "bare-bones nuts 'n bolts documentation" of Python docstrings, both seem at odds with this.

Anyways, this is way off topic. I've always been intrigued by the idea of literate programming, but never really got much past poking around the TeX source and looking (briefly) at CWEB once. I had heard of Interscript from several of the real Python gurus (who I happen to work with), but nobody mentioned that it includes an extension building system!
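(As a rough illustration of the "gluing docstrings together" idea: the sketch below walks a module's public names and concatenates whatever __doc__ strings it finds. The helper name and the choice of example module are hypothetical, and the terse result it produces is exactly the problem described above.)

    import textwrap  # used only as an example module to "document"

    def glue_docstrings(module):
        """Concatenate the __doc__ strings found in a module's public names."""
        pieces = []
        if module.__doc__:
            pieces.append(module.__doc__)
        for name in dir(module):
            if name.startswith('_'):
                continue
            doc = getattr(getattr(module, name), '__doc__', None)
            if doc:
                pieces.append(name + "\n" + doc)
        return "\n\n".join(pieces)

    print(glue_docstrings(textwrap))  # prints a terse nuts-and-bolts reference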
Almost done: interscript creates a source tree in the doco. It doesn't yet provide the ability to 'tar' up those files, but that is fairly trivial.
I assume this is part of tangling: extract source code to make source files, and then you know what goes into a source distribution. Of course, documentation and test suites also belong in the source distributions, so I guess weaving comes into it as well. Hmmm...
install - install a built library on the local machine
This is MUCH harder. If you read my documentation and examine the sections on installation control (site-frame, platform-frame, user-frame), you will see that I have made fairly advanced provision for installation control. I'm not using any of this yet.
My system discriminates the status of the package, and may install it in different places depending on the status. For example, when I download a new version of a package, I might put it into a test directory and test it before I install it into a more widely accessible place. Furthermore, the install point is conditioned by the author's evaluation: is it alpha, beta, or production software?
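(A rough sketch of what such a status-dependent install point might look like; the directory layout and status names are made up for illustration, not taken from interscript.)

    import os

    # Hypothetical mapping from the author's stated status to an install base.
    INSTALL_BASES = {
        'test':       '/usr/local/lib/python-test',
        'alpha':      '/usr/local/lib/python-alpha',
        'beta':       '/usr/local/lib/python-beta',
        'production': '/usr/local/lib/python1.5/site-packages',
    }

    def install_point(package_name, status):
        """Pick where a package lands, based on how far along it claims to be."""
        base = INSTALL_BASES.get(status, INSTALL_BASES['test'])
        return os.path.join(base, package_name)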
I don't think installation has to be that hard. If you go back and look at the summary of the "Extension Building" Developer's Day session, you'll find a bit about the "blib" directory approach, which I want to steal from Perl's MakeMaker.

Basically, ./blib ('build library') is a mock installation tree that looks quite a lot like a subset of /usr/local/lib/perl5/site_perl (or, in the distutils case, will look quite a lot like a subset of /usr/local/lib/python1.x). (Note that "/usr/local/lib/perl5" can vary -- Perl lets you install its library anywhere, and this information is available via the standard Config module; likewise "/usr/local/lib/python1.x" can vary -- Python lets you install its library anywhere, and this information *should* be available through some standard module, which I refer to as 'sys.config'.) When you build a Perl module distribution, C extensions are compiled and the .so files wind up in ./blib/arch; pure-Perl modules (.pm files) are simply copied into ./blib/lib; and documentation (POD) is converted to *roff format in ./blib/man.

The advantage of this is (at least) two-fold: first, running the test suites in a "realistic" environment is trivial: just prepend ./blib/lib and ./blib/arch to Perl's library search path, and run the test programs. Second, installation is trivial: just do recursive copies of ./blib/{lib,arch,man} to the appropriate places under /usr/local/lib/perl5/site_perl. (Actually, it's a bit smarter than that: it only copies files that are actually different from corresponding files in the "official" library directory.) MakeMaker also allows users to specify a different installation base than /usr/local/lib/perl5/site_perl, so non-superusers (or superusers who are just messing around) can install things to their home directory, to a temp directory, etc.

My idea for the distutils is to blatantly rip off as many of these good ideas as possible, while making them more cross-platform (i.e. no requirement for Makefiles) and a bit more object-oriented. However, most of the ideas carry over quite cleanly from Perl to Python.
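(A minimal sketch of how the ./blib idea might translate to Python: the tree layout and the copy-only-if-changed rule follow the MakeMaker behaviour described above, but the function names and paths are hypothetical, not an existing distutils API.)

    import filecmp
    import os
    import shutil
    import sys

    BLIB_LIB = os.path.join('blib', 'lib')     # pure-Python modules land here
    BLIB_ARCH = os.path.join('blib', 'arch')   # compiled extensions land here

    def run_tests_against_blib():
        # Prepend the mock installation tree so tests import the built code.
        sys.path[:0] = [BLIB_ARCH, BLIB_LIB]
        # ... run the test programs here ...

    def install_tree(src, dst):
        """Recursively copy blib into the real library, skipping unchanged files."""
        for dirpath, dirnames, filenames in os.walk(src):
            rel = os.path.relpath(dirpath, src)
            target_dir = os.path.join(dst, rel)
            if not os.path.isdir(target_dir):
                os.makedirs(target_dir)
            for name in filenames:
                src_file = os.path.join(dirpath, name)
                dst_file = os.path.join(target_dir, name)
                # Only copy files that differ from the installed copy.
                if not (os.path.exists(dst_file) and filecmp.cmp(src_file, dst_file)):
                    shutil.copy2(src_file, dst_file)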
gen_make - generate a Makefile to do some of the above tasks (mainly 'build', for developer convenience and efficiency)
Please don't! Please generate a PYTHON script if one is necessary. Python runs on all platforms, and it has -- or should have -- access to all the configuration information.
Yech! I can't imagine why you'd want to generate a Python script -- Python is such a dynamic, module-oriented, introspective language that it seems crazy to have to generate Python code. Just write enough modules, and make them flexible and well-documented, so that every needed build/package/install task is trivial. Should be doable.

Please note: ***NO ONE WILL HAVE TO USE MAKE FOR ANY OF THIS*** The reason I propose a 'gen_make' option is largely for the convenience of developers writing collections of C extensions under Unix. It won't be needed for people writing single-.py-file distributions, it won't be needed (but it might be nice) for people writing single-.c-file distributions, and it most certainly will not be needed for people just installing the stuff. However, the people who write collections of C extensions under Unix are a small but crucial segment of the Python world. If we can convince these people to use distutils, we win. It would be a great convenience for these folks if the distutils can generate a Makefile that does everything (or almost everything) they need. MakeMaker can do it -- so why can't we?

Anyways, I completely agree with your statements about Make being unreliable, unportable, flaky, and a bear to debug. I also agree that we have something better available; that's why the whole proposal is built around something called 'setup.py'. The generation of Makefiles is just an optional feature for a small but important segment of the population.

There appears to be growing support for writing a next-generation 'make' in Python. Great idea, but I don't think this is the place for that; if such a beast does come into existence, then we should add a 'gen_ngmake' command to distutils. Examining time-based dependencies amongst files is not really germane to most of this discussion. It's certainly something that people writing collections of C extensions have to worry about, and those of them using Unix have a solution -- just not a very satisfactory (or portable) one.
For example, the compilers module I provide happens to use gnu gcc/g++. So it won't work on NT. Or a Solaris system where the client wants to use the native cc. Etc etc.
OK, that's a problem. But as you said in your first message, we should look more at your interface than your implementation. Do you think your implementation could be easily extended to work with more compilers? (Most importantly, it should work with the compiler and compiler flags used to build Python. If you don't have access to that information and don't use it, then you can't properly build extensions to be dynamically loaded by Python. That, incidentally, is why I think a patch might be needed for 1.5.2 -- it would probably be a patch to the configure/build stuff, and the addition of whatever code is needed to make a builtin 'sys.config' module which provides access to everything Python's configure/build process knows. The intention is that this stuff would be standard in 1.6.)
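(A rough sketch of how an extension build might consume that kind of configuration once it is exposed. The variable names 'CC', 'CFLAGS', and 'INCLUDEPY' and the plain-dictionary interface are assumptions about what a 'sys.config'-style module could provide, not an existing API.)

    import os

    def compile_command(source, config):
        """Compose a compile line using the settings Python itself was built with."""
        cc = config.get('CC', 'cc')
        cflags = config.get('CFLAGS', '')
        include = config.get('INCLUDEPY', '/usr/local/include/python1.5')
        obj = os.path.splitext(source)[0] + '.o'
        return "%s %s -I%s -c %s -o %s" % (cc, cflags, include, source, obj)

    # e.g. compile_command('spammodule.c', {'CC': 'gcc', 'CFLAGS': '-O2 -fPIC'})
    # works the same whether the mapping comes from a parsed Makefile today or
    # a standard 'sys.config'-style module tomorrow.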
Yes. Although interscript allows a slightly different approach: test code is embedded directly in the source and is executed during the build process. The test code isn't installed anywhere; it's written to a temporary file for testing and then deleted afterwards.
Cool! That's a neat idea -- it had never occurred to me. Not sure if it belongs in the "standard Python way of doing things", though. Do you find that having test code intermingled with production code greatly increases the size of things? I've released a couple of fair-sized collections of Perl modules complete with test suites, and I wrote roughly as much test code as production code. I'm not sure if I'd want to wade through the test code at the same time as the production code, but I can certainly see the ease-of-access benefits: add a feature to the production code, and immediately add a test for it (in addition to documenting it).
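(A rough sketch of the run-and-discard step described above: test text extracted from the source is written to a temporary file, run with the current interpreter, and deleted afterwards. How interscript actually extracts and runs its embedded tests is its own business; the names here are illustrative.)

    import os
    import subprocess
    import sys
    import tempfile

    def run_embedded_test(test_source):
        """Write embedded test code to a temp file, run it, then clean up."""
        fd, path = tempfile.mkstemp(suffix='.py')
        try:
            with os.fdopen(fd, 'w') as f:
                f.write(test_source)
            result = subprocess.run([sys.executable, path])
            return result.returncode == 0
        finally:
            os.remove(path)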
* a standard for representing and comparing version numbers
I think this is very hard. It isn't enough to just use numbers. RCS/CVS provides a tree. With labelled cut points. AFAIK, it cannot handle the important thing: dependencies.
RCS/CVS (and SCCS, for that matter) address the developer; version numbers should address the user. And they should be in an obvious linear sequence, or your users will get terribly confused.

There are a whole bunch of possible ways to do version numbering; I'm inclined towards the GNU style (1.2.3) with optional alpha/beta tags (e.g. "1.5.2a2"). Coincidentally, this seems to be Guido's version numbering system for Python... The only two other widely-known models I can think of offhand are Linux (basically GNU, but with the stable/development version wrinkle) and Perl (where version numbers can be compared as floating-point numbers -- which leads to the madness of the current Perl version being 5.00502!). I don't think either of these is appropriate.

OK, enough for now. I have to spend some more time digesting the Interscript docs, and then I'll try to read the rest of your email in detail. Thanks for picking apart my proposal -- I was getting worried that everyone would agree with everything I proposed, and I'd be stuck with writing all the code. ;-)

Greg

--
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                          voice: +1-703-620-8990 x287
Reston, Virginia, USA  20191-5434                 fax: +1-703-620-0913
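(As a rough illustration of the GNU-style scheme with alpha/beta tags favoured above, version strings like "1.5.2a2" could be parsed into tuples that compare the obvious way. The parsing rules below -- a tagged release sorts before the corresponding final release -- are assumptions made for this sketch, not a settled standard.)

    import re

    _VERSION_RE = re.compile(r'^(\d+)\.(\d+)(?:\.(\d+))?([ab])?(\d+)?$')

    def parse_version(s):
        """Turn '1.5.2a2'-style strings into tuples that compare sensibly."""
        m = _VERSION_RE.match(s)
        if not m:
            raise ValueError("unrecognised version string: %r" % s)
        major, minor, patch, tag, tagnum = m.groups()
        release = (int(major), int(minor), int(patch or 0))
        # Alpha/beta releases sort before the final release of the same number;
        # 'z' is just a sentinel that sorts after 'a' and 'b'.
        pre = (tag, int(tagnum or 0)) if tag else ('z', 0)
        return release + pre

    # parse_version("1.5.2a2") < parse_version("1.5.2b1") < parse_version("1.5.2")
    # parse_version("1.5.2")   < parse_version("1.6")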
On Tue, 1 Dec 1998, Greg Ward wrote:
My impression of the use of docstrings in the Python world is that because they wind up in the runtime VM code, people tend to make them a lot terser,
Side note -- only docstrings which are the first statements in a class or function or module are kept -- everything else is thrown out currently. --david
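(A two-line illustration of that: only the first string literal in a body becomes __doc__; a string appearing later is an ordinary expression statement and is discarded.)

    def f():
        "This first string survives as f.__doc__."
        "This later string is just an expression statement and is thrown away."
        return 1

    print(f.__doc__)  # -> This first string survives as f.__doc__.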
participants (3)
- David Ascher
- Greg Ward
- John Skaller