My current task here at Bioreason is to set up a build/distrib system for our projects. We're developing pure Python modules, Python extensions in C/C++, and stand-alone C++ executables. I'm trying to get a basic system that lets us work with these in a standard framework and supports automatic regression tests, with the results emailed or sent to a status web page. Also, in order to keep the web pages up to date with the source code, I need to be able to construct some of the pages automatically from the source and README.

This is not quite what the distutils-sig does, but I figured there's a reasonably close fit, so I can both contribute and get some ideas.

The overview of our system (still in the design phase) is something like this:

  setup -- sets up the original directory structure for a project

This is an interactive/console Python script which asks for the following information (with defaults as needed):

  project name -- used for the directory and module name
  product name -- can be different, since this is the marketing name
  alternate product name -- I've found that having a variant of the
      product name is useful, eg, the abbrev. or acronym
  version number -- in the form \d+(\.\d+)*
  status -- a string of letters, like "alpha", "p", "RC"
  cvs root -- we use CVS for our version control, but "none" specifies
      not to put this into CVS
  contact name -- who's in charge of this thing?
  contact email -- how to contact that person (and where regression
      test results will be sent)
  contact url -- where to find more information about this project
  language -- python, python extension, c++ (eventually more?)

(This seems to be the core set of information I need, but the framework will be set up so more can be added as needed.)
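A minimal sketch of what such a setup script might do, assuming the field names above; the `ask` helper and `write_info_py` function are invented for illustration, not part of any real tool (in the real script `ask` would prompt the user):

```python
# Hypothetical sketch of the interactive "setup" script.  The field
# names come from the list above; everything else is an assumption.

def ask(prompt, default=""):
    # A real implementation would read from the console; here we just
    # fall back to the default so the sketch is self-contained.
    answer = ""  # placeholder for interactive input
    return answer or default

def gather_info():
    # Collect the core project configuration, with defaults as needed.
    return {
        "PROJECT_NAME": ask("Project name"),
        "PRODUCT_NAME": ask("Product name"),
        "ALT_PRODUCT_NAME": ask("Alternate product name (abbrev./acronym)"),
        "VERSION": ask("Version number", "0.1"),
        "STATUS": ask("Status (alpha, p, RC, ...)", ""),
        "CVS_ROOT": ask("CVS root ('none' to skip CVS)", "none"),
        "CONTACT_NAME": ask("Contact name"),
        "CONTACT_EMAIL": ask("Contact email"),
        "CONTACT_URL": ask("Contact URL"),
        "LANGUAGE": ask("Language (python, python extension, c++)", "python"),
    }

def write_info_py(info):
    # info.py is just a module of NAME = "value" assignments.
    lines = ["%s = %r" % (k, v) for k, v in sorted(info.items())]
    return "\n".join(lines) + "\n"
```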
After this information is gathered, it will create a subdirectory with the given project name and add the following files:

  info.py -- contains the configuration information
  configure -- used to generate the Makefile and maybe other files;
      like a "make.dat" file which is include'd by all Makefiles
  buildno -- a number starting at 1 which is incremented for every new
      build.  In this case, a build is not the C-level build of
      Python's C code, but the number passed to QA/testing.
  test/ -- a subdirectory for the regression tests
  README -- a pod (?) file containing the template for the standard README

Let me explain that last one some more. It seems that Perl's "pod" format is the easiest to learn and use. I want to put one level on top of that. The hardest thing in my ASCII documentation is keeping version numbers, URLs, etc. in sync with the main code base, since I have to change those by hand. Instead, I propose that during the process of making a source distribution or doing an install, the pod files are read into a Python string and %'ed with the info.py data. Thus, I want a file which looks like:

  ######
  =head1 NAME

  %(PRODUCT_NAME)s

  =head1 SYNOPSIS

  This software does something.

  =head1 INFORMATION

  For more information about %(ALT_PRODUCT_NAME)s see:

    %(CONTACT_URL)s
  ######

and which will be filled out with the appropriate information as needed for an install.

Is there a Python way to read/parse pod files? If not, using perl for this is fine with us.

BTW, to better support names, I've also derived a few other variables from those given in info.py, like:

  FULL_NAME = PROJECT_NAME + "-" + VERSION
  if STATUS:
      FULL_NAME = FULL_NAME + "." + STATUS + BUILDNO

so we can have names like:

  daylight-1.0.alpha9
  daylight-1.0.beta3
  daylight-1.0.rc1    (for us, "rc" == "release candidate")
  daylight-1.0        -- build number not included in final release

Okay, so once those files are made, if CVS is being used, the whole subdirectory is added to CVS, the directory renamed, and the project pulled out from CVS. Then cd into the directory and run "configure". This produces a Makefile. The Makefile is not put into CVS because supposedly it can always be made by running configure; so edit "configure" instead.

The system is now ready for development. I'll assume it is using straight Python scripts with no submodules. In that case, the Makefile supports the following targets:

  buildversion: increment the "buildno" file by one
  buildtag:     tag everything in CVS with the FULL_NAME
  build:        buildversion buildtag
                $(MAKE) src.dist
  tests:        cd test; ./testall
  clean:        probably remove the .pyc and .pyo files
  veryclean:    clean
                probably remove emacs *~ files
  install:      do the "standard" install

and probably a few more. Could someone tell me some other standard target names, eg, those expected from a Python or GNU project?

I'm playing around with Makefile suffix rules. What I want to do is make the ".dist" targets get forwarded to Python. Under GNU make I can do something like:

  %.dist:
  	$(PYTHON) -c "import build; build.distrib('$*')"

so "src.dist" becomes:

  /usr/local/bin/python -c "import build; build.distrib('src')"

but I don't know how to do this sort of trick under non-GNU make. Any advice?

I mentioned the phrase "standard" install. That's a tricky one. What I envision is that the build module reads the "info.py" file to determine the project language settings, then imports the module which handles installs for it. This will probably find all the files with the extension ".py" and pass those to the routine which does the actual installs (eg, copy the files and generate .pyc and .pyo files).
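To make the README interpolation and FULL_NAME ideas concrete, here is a rough Python sketch. The variable names come from info.py above; the function names and sample values are invented, and (following the daylight-1.0.alpha9 examples) the status string is placed before the build number:

```python
# Sketch of the pod-template interpolation and FULL_NAME derivation
# described above.  Helper names are made up for illustration.

POD_TEMPLATE = """\
=head1 NAME

%(PRODUCT_NAME)s

=head1 INFORMATION

For more information about %(ALT_PRODUCT_NAME)s see:
  %(CONTACT_URL)s
"""

def derived_names(info):
    # FULL_NAME = PROJECT_NAME-VERSION, with ".STATUSBUILDNO" appended
    # for pre-release builds (eg, daylight-1.0.alpha9).
    full_name = info["PROJECT_NAME"] + "-" + info["VERSION"]
    if info["STATUS"]:
        full_name = full_name + "." + info["STATUS"] + str(info["BUILDNO"])
    return {"FULL_NAME": full_name}

def expand_pod(template, info):
    # One %-interpolation pass fills in every %(NAME)s placeholder.
    return template % info

info = {
    "PROJECT_NAME": "daylight", "PRODUCT_NAME": "Daylight",
    "ALT_PRODUCT_NAME": "DL", "CONTACT_URL": "http://example.com/",
    "VERSION": "1.0", "STATUS": "alpha", "BUILDNO": 9,
}
info.update(derived_names(info))
```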
However, this must also be able to import project-level extensions, for example, to also install some needed data files into the install directory. I'm not sure about the right way to go about doing this. Probably "build.install.python" will try to import some code from the package, and this will return a list of file names mapped to install methods, like:

  ('__init__.py', build.install.python_code),  # normal python installer
  ('file.dat', build.install.data_file),       # just copies the file
  ('special.zz', localbuild.install_zz),       # something package specific

then iterate over the list and apply the function to the file. (And yes, these should likely be class instances and not a 2-tuple.) Again, I haven't figured this out.

Then there's the question of how to deal with submodules. Most likely the configure script will check each subdirectory for an __init__.py file and make the Makefile accordingly. Of course, it will have to ignore certain "well known" ones, like whichever directory contains the project-specific build information. The new Makefile will change a few things, like making some targets recurse, as in "clean:":

  clean:
  	probably remove the .pyc and .pyo files
  	cd submodule1; $(MAKE) clean
  	cd submodule2; $(MAKE) clean

and add the appropriate Makefiles to those subdirectories. My, this is getting more complicated than I thought it would.

Okay, so the final step is the "src.dist" (and "bin.dist" and "rpm-src.dist" and ...) targets. I figure that the raw source can always be made available from a "cvs export" followed by a manual tar/gzip, so that doesn't need to be automated. What does need to be done is the ability to make a source distribution "for others." For example, at one place I worked we stripped out all the RCS log comments when we did a source build. Or perhaps some of the modules cannot be distributed (eg, they are proprietary).
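The filename-to-installer mapping could be iterated over like this; the `build.install.*` names in the text are stood in for by local functions here, and the whole thing is an illustrative guess, not an existing API:

```python
# Sketch of iterating over a (filename, installer) manifest.  The
# installers here only record what they would do; real ones would copy
# files and byte-compile .py sources.

def install_python_code(name, copied):
    copied.append((name, "py"))     # would copy and generate .pyc/.pyo

def install_data_file(name, copied):
    copied.append((name, "data"))   # would just copy the file

def run_install(manifest):
    # manifest: pairs supplied partly by the framework, partly by
    # package-specific code.
    copied = []
    for name, installer in manifest:
        installer(name, copied)
    return copied

manifest = [
    ("__init__.py", install_python_code),
    ("file.dat", install_data_file),
]
```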
So a basic source distribution must be able to take the existing files, apply any transformations as needed (eg, convert the README from pod form to straight text), and tar/gzip the result. We'll be saving the resulting tarball for archival purposes, and our installs will likely be done from this distribution, which means it needs its own Makefile. In all likelihood, there will be no difference between this Makefile and the normal one. If there is, I guess it would be generated from the configure script, though with some special command-line option.

Then there's the question of how to handle documentation (eg, some of my documentation is in LaTeX). For now, I'll just put stuff under "doc/" and let it lie. Though the configure script should be able to build a top-level Makefile which includes targets for building the documentation and converting it into an appropriate form, such as HTML pages for automated updates to our internal web servers. Eh, probably something which forwards certain targets to "doc/", like:

  docs:
  	cd doc; $(MAKE) doc

or even do the make suffix-forwarding trick, so I can have targets like:

  user.doc  prog.doc  install.doc

Ugh, to get even more customizable I suppose you would need to tell LaTeX that certain sections should/should not be used when making a distribution, in order to reflect the code. I suppose the configure script could be made to generate a tex file for inclusion, but again, I'm not going to worry about that for now. Most likely the configure script will have the ability for someone to add:

  make = make + build.makefile.latex_doc( <some set of options?> )

(Thinking about it some, it would be best if "make" were a list of terms, like:

  make = ( MakeData("tests", "", ("cd test; ./testall",)),
           MakeData(target, dependencies, (list, of, actions)),
         )

then the conversion-to-Makefile routine could double-check that there are no duplicate targets, and the package author can fiddle around with the list before emitting the file.)
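The MakeData-list idea might be sketched like this; the class shape and the `emit_makefile` function are guesses at what the conversion routine could look like, not existing code:

```python
# Sketch of "targets as data": a list of MakeData records is checked
# for duplicate targets and then turned into Makefile text.

class MakeData:
    def __init__(self, target, dependencies, actions):
        self.target = target
        self.dependencies = dependencies
        self.actions = actions

def emit_makefile(rules):
    seen = {}
    lines = []
    for rule in rules:
        if rule.target in seen:
            # The double-check the text describes: no duplicate targets.
            raise ValueError("duplicate target: " + rule.target)
        seen[rule.target] = rule
        lines.append("%s: %s" % (rule.target, rule.dependencies))
        for action in rule.actions:
            lines.append("\t" + action)   # make recipes start with a tab
        lines.append("")
    return "\n".join(lines)

rules = [
    MakeData("tests", "", ("cd test; ./testall",)),
    MakeData("clean", "", ("rm -f *.pyc *.pyo",)),
]
```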
Of course, all of this is Unix-centric. I know nothing about how to build these types of systems for MS Windows platforms. But then, we don't develop on those platforms, though we will likely distribute Python applications for them. That's where the ability to have special ".dist" targets comes in handy.

I've thought less about what's needed for a shared-library Python extension or for pure C++ code. I envision similar mechanisms, but there will need to be support for things like generating dependencies for make, which isn't needed for straight Python scripts. Of course, this also isn't needed for the distutils-sig, so I'll not go into them here. I also don't know much about GNU's configure system, which would be more useful for this sort of environment. Thus, any solution I give for that problem will likely only be useful for our environment.

As I said, this is still in the planning stages for us, so I would like input on these ideas. Of course, I plan to make this framework available, and I think part of it -- at least some of the ideas -- will be useful for distutils.

Andrew
dalke@bioreason.com
Andrew Dalke writes:
> configure -- used to generate the Makefile and maybe other files; like a "make.dat" file which is include'd by all Makefiles.
I'd avoid using this name if it isn't an autoconf-generated configure file. If it is, what you want to save is the configure.in file.
> Is there a Python way to read/parse pod files? If not, using perl for this is fine with us.
I suspect it would be trivial, but haven't written any POD documentation myself, so I'm probably not the one to do it. ;-)
>   %.dist:
>   	$(PYTHON) -c "import build; build.distrib('$*')"
> ...
> but I don't know how to do this sort of trick under non-GNU make. Any advice?
I don't either. Most makes can't do nearly as much as GNU make; the most portable solution is "don't do that". If there are only a few targets that need to be phrased like this (meaning: if the set doesn't change often), just write them out.
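Spelling out that small, stable set of targets might look like this (a sketch only; it assumes PYTHON is defined earlier in the Makefile):

```make
# Portable alternative to the GNU-only "%.dist" pattern rule: write
# each .dist target out explicitly.
src.dist:
	$(PYTHON) -c "import build; build.distrib('src')"

bin.dist:
	$(PYTHON) -c "import build; build.distrib('bin')"

rpm-src.dist:
	$(PYTHON) -c "import build; build.distrib('rpm-src')"
```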
> As I said, this is still in the planning stages for us, so I would like input on these ideas. Of course, I plan to make this framework available, and think part of it -- at least some of the ideas -- will be useful for distutils.
This sounds like an impressive system. Discussions and collaboration are definitely in order.

-Fred

--
Fred L. Drake, Jr.  <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA 20191
> > configure -- used to generate the Makefile and maybe other files; like a "make.dat" file which is include'd by all Makefiles.
>
> I'd avoid using this name if it isn't an autoconf-generated configure file.
I disagree. Unix people are used to:

  ./configure
  make
  make tests
  make install

(sometimes leaving out the "make tests" :) That file is just an entry point into the configuration system, regardless of whether it's generated by autoconf or even hand-written. I don't expect that anything I can generate will be useful to everyone for everything, and someday an autoconf-type configure script will be needed (esp. for C/C++ code). When that happens, I expect that the new configure should be a drop-in replacement for the existing one, and end users should not notice the change.
> I suspect it would be trivial, but haven't written any POD documentation myself, so I'm probably not the one to do it. ;-)
I've not used it either, but it seems to be the best solution available. I'll probably just use pod2text since we know that tool exists on our systems.
> Most makes can't do nearly as much as GNU make; the most portable solution is "don't do that".
True enough. On one product I worked on we just shipped the gmake binary (and source) for the different platforms, since we couldn't get the Makefiles working everywhere. That's what taught me to start using $(MAKE) for recursive Makefiles.

Still, according to the make documentation, I should be able to have a single suffix rule that works the way I want it to, as in:

  .SUFFIXES: .dist
  .dist:
  	$(ECHO) Do something

but that doesn't work. Luckily, I again get the luxury of designing this for our in-house use, where I can mandate "we will use GNU make for our Makefiles".

Andrew
dalke@bioreason.com
Quoth Andrew Dalke, on 06 March 1999:
> My current task here at Bioreason is to set up a build/distrib system for our projects. We're developing pure Python modules, Python extensions in C/C++, and stand-alone C++ executables. I'm trying to get a basic system that lets us work with these in a standard framework and supports automatic regression tests with the results emailed or sent to a status web page.
Hmmm, from the first paragraph, it sounds like there's the potential for a lot of overlap with Distutils -- or, to look at it more positively, a lot of potential for you to use Distutils! The rest of your post diverges a bit from this; certainly, stock, out-of-the-box Distutils won't be able to handle all the neat stuff you want to do. However, the architecture is quite flexible: module developers will easily be able to add new commands to the system just by defining a "command class" that follows a few easy rules.

I posted a design proposal to this list back in January; you might want to give that a look. I finally got around to HTMLizing it and putting it on the Distutils web page a week or so ago, but hadn't announced it yet because I wanted to tweak it a bit more. So much for that -- consider this the announcement. You'll find the design proposal at

  http://www.python.org/sigs/distutils-sig/design.html

By no means is it a comprehensive design; since I've finally started implementing the Distutils (shh! it's still a secret!), I've found all sorts of holes. However, it gives a good idea of the level of flexibility I'm aiming for.

Note that one big difference between your idea and Distutils is that we're avoiding dependence on Makefiles: for the most part, they're not needed, and they're unportable as hell (as you're finding out). Make should be used to carry out timestamp-based file dependency analysis, which means it's just great for building C programs. When you use it mainly to bundle up little sequences of shell commands, you're misusing it. (For the ultimate example of Makefile abuse, see any Makefile created by Perl's MakeMaker module. The most astonishing thing is that it all *works*...)
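Purely as a speculative illustration of the "command class" idea (the method names and constructor here are guesses, not the interface from the design proposal):

```python
# Made-up sketch of an add-on command class.  Nothing here reflects
# the actual Distutils interface; it only shows the shape of the idea:
# a command is an object the framework can construct and run.

class Command:
    """Hypothetical base class supplied by the framework."""
    def __init__(self, distribution):
        self.distribution = distribution
        self.ran = 0

class BuildDocs(Command):
    """A made-up package-specific command: convert doc sources to HTML."""
    def run(self):
        # A real command would invoke the documentation converter here.
        self.ran = 1
        return "built docs for %s" % self.distribution

cmd = BuildDocs("daylight")
```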
> Let me explain that last one some more. It seems that Perl's "pod" format is the easiest to learn and use.
Yes. XML is waaay cool, but pulls in a hell of a lot of baggage. Until that baggage is standard everywhere (hello, Python 1.6 and Perl 5.006! ;-), low-overhead light-weight solutions like pod are probably preferable. Heck, even when all the world is Unicode and XML for everything, low-overhead and light-weight will still be nice (if only to remember how things were, back in the good ol' days).
> Is there a Python way to read/parse pod files? If not, using perl for this is fine with us.
As Fred said, it *should* be trivial. It's a bit trickier if you want a really flexible framework for processing pod; see Brad Appleton's Pod::Parser module (available on CPAN) for that. The current generation of pod tools that ship with Perl are getting a bit long in the tooth; they all have their own parsers, and unsurprisingly have diverged somewhat over the years. Hopefully Pod::Parser will start to fix this.

My recommendation: spend a few hours learning to use Pod::Parser, and then write your own custom pod tools (yes, in Perl ;-). Slightly less trivial than writing your own custom, limited parser (in either language), or using pod2text, but it should be more robust and scalable.

Anyways, with any luck the Distutils CVS archive will be a lot busier within a week or so -- so there might actually be something that you (and everyone else on this SIG!) can muck around with.

Keeping my fingers crossed (except while writing code) --

Greg

--
Greg Ward - software developer          gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                voice: +1-703-620-8990 x287
Reston, Virginia, USA 20191-5434        fax: +1-703-620-0913
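For the limited README case, a from-scratch Python parser really is small. This sketch only pulls `=head1` sections out of a pod file -- nowhere near a full pod parser, and every name in it is invented:

```python
# Minimal sketch: split a pod document into {heading: body} sections.
# Handles only =head1; real pod also has =head2, =over/=item, =cut, etc.

def pod_sections(text):
    sections = {}
    title = None
    body = []
    for line in text.splitlines():
        if line.startswith("=head1"):
            if title is not None:
                sections[title] = "\n".join(body).strip()
            title = line[len("=head1"):].strip()
            body = []
        else:
            body.append(line)
    if title is not None:
        sections[title] = "\n".join(body).strip()
    return sections

sample = "=head1 NAME\n\nMyTool\n\n=head1 SYNOPSIS\n\nThis software does something.\n"
```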