Wanted: ideas for using distutils with preprocessors/dependencies
I'm currently looking at integrating bgen with distutils.

Bgen is a little-known part of the Python core distribution: it is similar to swig, in that it generates C extension modules. In some respects it is more powerful than swig, the main one being that it reads standard C .h files instead of adorned .i files. (Incidentally, this is also the reason it's part of core Python: it is used to generate the MacOS API modules, so these are (almost) automatically updated when Apple adds new functionality. At least, that was true under MacOS9 and will be again when I get my act together:-). But bgen has a lot of disadvantages when compared to swig, the main one being that it is a rather fearsome tool to try and master. Integration with distutils is one of the things I want to do to lower that barrier.

But now I'm at a loss as to how to proceed. I had a look at how swig is integrated into distutils, and I don't really like it, it smells like a hack. And, according to the comments in the source and the manual, the author agrees with me:-) Swig support is basically done in the build_ext command, by filtering out all ".i" files in the source file list very early in the process, running swig on them, and replacing them by the .c or .cpp equivalents.

I can see various ways of adding bgen support, but I'm not sure which one is the best one, and/or whether there are other options. So I'd be interested in hearing what other people think, and how other packages have added a preprocessor to distutils.

There's a fair amount of Python code needed to drive bgen, at least for interfaces to complex APIs (bridging C types to Python, handling callbacks, how to parse the specific .h files for this API, etc). Currently that code is in two .py files but it will be put in a class, probably modeled somewhat after Extension (but having C/C++ source files as output instead of dynamic extension modules).

What I don't know is how I'd connect this to the Extension object that will create the extension module. Ideally I'd like the bgen process to be optional. In other words, the distribution packager has three options: (a) include the bgen C output in the distribution and don't run bgen unless the end user specifically asks for it; (b) include the bgen C output but only run bgen if the normal timestamp dependencies require it; or (c) always run bgen.

But it doesn't seem the Extension object currently has any support for such make-like chaining, and I'm not sure how to add it. One way would be to allow non-strings in the sources argument, and do something smart there. A similar mod could be used for libraries and extra_objects to allow chaining there too. Another way would be to add a "dependencies" argument, where those dependencies are objects that get run early, and can add their results to sources, libraries and extra_objects. I think this latter solution is probably better, as such a dependency object could modify multiple arguments of Extension in one fell swoop. As a somewhat contrived example, an "OptionalJPEGSupport" dependency could check whether the relevant libraries and include files are available to enable JPEG support in an imaging package, and then add the right source files, libraries, defines, library paths and include paths to the relevant Extension arguments.

But all of this is made quite a bit more difficult (I think) by the fact that Extension doesn't really do anything: it's only a container and all the logic is in build_ext. Maybe I should follow the paradigm set by "build_clib", and add a "build_bgen" command with build_ext picking up the results? And maybe there are better solutions that I haven't thought of yet?

--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
Hi,
I'm currently looking at integrating bgen with distutils.
Bgen is a little-known part of the Python core distribution: it is similar to swig, in that it generates C extension modules. In some respects it is more powerful than swig, the main one being that it reads standard C .h files instead of adorned .i files. (Incidentally, this is also the reason it's part of core Python: it is used to generate the MacOS API modules, so these are (almost) automatically updated when Apple adds new functionality. At least, that was true under MacOS9 and will be again when I get my act together:-). But bgen has a lot of disadvantages when compared to swig, the main one being that it is a rather fearsome tool to try and master. Integration with distutils is one of the things I want to do to lower that barrier.
But now I'm at a loss as to how to proceed. I had a look at how swig is integrated into distutils, and I don't really like it, it smells like a hack. And, according to the comments in the source and the manual, the author agrees with me:-) Swig support is basically done in the build_ext command, by filtering out all ".i" files in the source file list very early in the process, running swig on them, and replacing them by the .c or .cpp equivalents.
I've read the distutils SWIG support code, and I would (politely) say it is an afterthought. The main problem is that distutils is fixed on a two-stage compilation process: compile and link. Additional processing steps are just not supported.
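For reference, the source-list filtering described above can be sketched roughly as follows. This is an illustration of the idea only, not the actual build_ext code: the function name split_swig_sources and the "_wrap" suffix for generated files are assumptions made for the example, and the step that actually runs swig is left out.

```python
import os


def split_swig_sources(sources, target_ext=".c"):
    """Replace every ".i" entry by the wrapper file swig would generate.

    Returns (new_sources, swig_jobs), where swig_jobs maps each interface
    file to the wrapper that still has to be produced by running swig.
    The "_wrap" naming is an assumption of this sketch.
    """
    new_sources = []
    swig_jobs = {}
    for source in sources:
        base, ext = os.path.splitext(source)
        if ext == ".i":
            wrapper = base + "_wrap" + target_ext
            new_sources.append(wrapper)
            swig_jobs[source] = wrapper
        else:
            # Ordinary C/C++ sources pass through untouched.
            new_sources.append(source)
    return new_sources, swig_jobs
```

The point is that the preprocessing stage is invisible to the rest of the build: the compiler only ever sees the rewritten list, which is why it is hard to attach dependency or "optional" semantics to it.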
I can see various ways of adding bgen support, but I'm not sure which one is the best one, and/or whether there are other options. So I'd be interested in hearing what other people think, and how other packages have added a preprocessor to distutils.
There's a fair amount of Python code needed to drive bgen, at least for interfaces to complex APIs (bridging C types to Python, handling callbacks, how to parse the specific .h files for this API, etc). Currently that code is in two .py files but it will be put in a class, probably modeled somewhat after Extension (but having C/C++ source files as output instead of dynamic extension modules).
What I don't know is how I'd connect this to the Extension object that will create the extension module. Ideally I'd like the bgen process to be optional. In other words, the distribution packager has three options: (a) include the bgen C output in the distribution and don't run bgen unless the end user specifically asks for it; (b) include the bgen C output but only run bgen if the normal timestamp dependencies require it; or (c) always run bgen.
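The three packaging options could be expressed as a small policy check along these lines. This is a sketch under assumptions: the policy names, the function should_regenerate and the "requested" flag are all hypothetical, and the timestamp test is a plain mtime comparison rather than anything taken from distutils.

```python
import os

# Hypothetical policy names for options (a), (b) and (c).
NEVER, IF_STALE, ALWAYS = "never", "if-stale", "always"


def should_regenerate(policy, inputs, output, requested=False):
    """Decide whether the generator has to run for one output file."""
    if policy == ALWAYS:
        # (c) always run the generator.
        return True
    if policy == NEVER:
        # (a) ship the generated C file; run only on explicit request.
        return requested
    if policy == IF_STALE:
        # (b) make-like rule: run if the output is missing or older
        # than any of its inputs.
        if not os.path.exists(output):
            return True
        out_mtime = os.path.getmtime(output)
        return any(os.path.getmtime(path) > out_mtime for path in inputs)
    raise ValueError("unknown policy: %r" % (policy,))
```

A generator step would call this once per output file, with the .h files and the driving .py files as inputs.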
But it doesn't seem the Extension object currently has any support for such make-like chaining, and I'm not sure how to add it. One way would be to allow non-strings in the sources argument, and do something smart there. A similar mod could be used for libraries and extra_objects to allow chaining there too. Another way would be to add a "dependencies" argument, where those dependencies are objects that get run early, and can add their results to sources, libraries and extra_objects. I think this latter solution is probably better, as such a dependency object could modify multiple arguments of Extension in one fell swoop. As a somewhat contrived example, an "OptionalJPEGSupport" dependency could check whether the relevant libraries and include files are available to enable JPEG support in an imaging package, and then add the right source files, libraries, defines, library paths and include paths to the relevant Extension arguments.
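To make the "dependencies" idea concrete, here is a minimal sketch. Nothing in it is existing distutils API: ExtensionSpec is a stand-in that only borrows the attribute names of Extension, the apply() protocol and resolve_dependencies are hypothetical, and the availability check is simplified to looking for one header file (the source file name src/jpeg_support.c is made up as well).

```python
import os
from dataclasses import dataclass, field


@dataclass
class ExtensionSpec:
    """Stand-in for distutils' Extension, reduced to the fields used here."""
    name: str
    sources: list = field(default_factory=list)
    libraries: list = field(default_factory=list)
    define_macros: list = field(default_factory=list)
    include_dirs: list = field(default_factory=list)
    library_dirs: list = field(default_factory=list)


class OptionalJPEGSupport:
    """Hypothetical dependency object that enables JPEG support if possible."""

    def __init__(self, include_dir, library_dir, header="jpeglib.h"):
        self.include_dir = include_dir
        self.library_dir = library_dir
        self.header = header

    def available(self):
        # Simplified probe: only checks that the header exists.
        return os.path.isfile(os.path.join(self.include_dir, self.header))

    def apply(self, ext):
        """Modify several Extension arguments in one go; return True if applied."""
        if not self.available():
            return False
        ext.sources.append("src/jpeg_support.c")
        ext.libraries.append("jpeg")
        ext.define_macros.append(("HAVE_JPEG", "1"))
        ext.include_dirs.append(self.include_dir)
        ext.library_dirs.append(self.library_dir)
        return True


def resolve_dependencies(ext, dependencies):
    """Run every dependency early and return the ones that took effect."""
    return [dep for dep in dependencies if dep.apply(ext)]
```

A bgen dependency would follow the same protocol, with apply() running the generator and appending the resulting C file to ext.sources.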
But all of this is made quite a bit more difficult (I think) by the fact that Extension doesn't really do anything: it's only a container and all the logic is in build_ext. Maybe I should follow the paradigm set by "build_clib", and add a "build_bgen" command with build_ext picking up the results? And maybe there are better solutions that I haven't thought of yet?
I'd like full-blown support for more compilation stages in distutils. I tried to come up with something unobtrusive and failed. Also, the current stated goal of distutils maintenance seems to be: don't break existing installers. That flatly rules out any sort of rewrite.

- Lars
Hi Jack,

Jack Jansen wrote:
I'm currently looking at integrating bgen with distutils. ...
But now I'm at a loss as to how to proceed. I had a look at how swig is integrated into distutils, and I don't really like it, it smells like a hack. And, according to the comments in the source and the manual, the author agrees with me:-) Swig support is basically done in the build_ext command, by filtering out all ".i" files in the source file list very early in the process, running swig on them, and replacing them by the .c or .cpp equivalents.
I can see various ways of adding bgen support, but I'm not sure which one is the best one, and/or whether there are other options. So I'd be interested in hearing what other people think, and how other packages have added a preprocessor to distutils.
Due to the nature of distutils, it is easy to add a few more stages to the build process here and there. If you just want to do some extra processing before building an extension, the simplest way to hook into the process is by extending build_ext. If you want your command to be checked and run automatically, you have to subclass the build class itself and add the command as a sub-command.

For examples of how this can be done, have a look at mxSetup.py, which you can find in egenix-mx-base. It has support for auto-configuration, building Unix libraries and various other things we needed in distutils. Works great, and distutils made it easy to add the new features to our setup.py files.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jul 18 2005)
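A minimal sketch of the sub-command approach might look like the following. It is not taken from mxSetup.py: the command name build_bgen, the bgen_specs attribute on the distribution and the body of run() are all hypothetical, and only the Command/sub_commands machinery itself is standard distutils. The fallback import assumes setuptools is installed on Pythons where distutils is no longer in the standard library.

```python
try:
    from distutils.cmd import Command
    from distutils.command.build import build as _build
except ImportError:
    # distutils left the standard library in Python 3.12; importing
    # setuptools makes its bundled copy importable under the same name.
    import setuptools  # noqa: F401
    from distutils.cmd import Command
    from distutils.command.build import build as _build


class build_bgen(Command):
    """Hypothetical command that would generate C sources with bgen."""

    description = "generate C sources with bgen (sketch)"
    user_options = [("force", "f", "regenerate even if outputs are up to date")]
    boolean_options = ["force"]

    def initialize_options(self):
        self.force = None

    def finalize_options(self):
        # Inherit --force from the parent build command, as build_clib does.
        self.set_undefined_options("build", ("force", "force"))

    def run(self):
        # bgen_specs is an assumed, non-standard attribute of the distribution.
        for spec in getattr(self.distribution, "bgen_specs", None) or []:
            self.announce("would run bgen for %r" % (spec,), level=2)


class build(_build):
    """build command with build_bgen scheduled before the standard steps."""

    def has_bgen_specs(self):
        return bool(getattr(self.distribution, "bgen_specs", None))

    # Each entry is (command name, predicate); the predicate is called with
    # the build command instance and decides whether the sub-command runs.
    sub_commands = [("build_bgen", has_bgen_specs)] + _build.sub_commands
```

In a setup.py both classes would be registered through the cmdclass argument, e.g. setup(..., cmdclass={"build": build, "build_bgen": build_bgen}), so that a plain "python setup.py build" runs the generator first whenever there is something to generate.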
Jack Jansen wrote:
I'm currently looking at integrating bgen with distutils.
Take a look at scipy_distutils. One thing that you can do is pass in a function as the sources argument of Extension. For example, to build the C source from a Pyrex source:

    import os
    from distutils.dep_util import newer_group
    from Pyrex.Compiler import Main

    def generate_c_from_pyrex(extension, build_dir):
        # Name the generated file after the last component of the module name.
        name = extension.name.split('.')[-1]
        source = extension.depends[0]
        target = os.path.join(build_dir, name+'.c')

        # Only recompile if any dependency is newer than the target.
        if newer_group(extension.depends, target):
            options = Main.CompilationOptions(
                defaults=Main.default_options,
                output_file=target)
            pyrex_result = Main.compile(source, options=options)
            if pyrex_result.num_errors != 0:
                raise RuntimeError("%d errors in Pyrex compile" %
                                   pyrex_result.num_errors)
        return target

scipy_distutils adds a build_src command that precedes build_ext and executes these functions. It's a bit of a hack, and sometimes is a bit finicky, but it works.

http://www.scipy.net/cgi-bin/viewcvsx.cgi/scipy_core/scipy_distutils/

--
Robert Kern
rkern@ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
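The way such a pre-build step can turn callable entries into file names can be sketched like this. It is a simplification written for illustration, not the scipy_distutils build_src code: the function name resolve_sources is hypothetical, and the handling of None and list return values is an assumption of the sketch.

```python
def resolve_sources(extension, build_dir):
    """Return extension.sources with every callable replaced by its output.

    Each callable is invoked as func(extension, build_dir) and may return
    one file name, a list of file names, or None to contribute nothing.
    """
    resolved = []
    for source in extension.sources:
        if callable(source):
            result = source(extension, build_dir)
            if result is None:
                continue
            if isinstance(result, str):
                result = [result]
            resolved.extend(result)
        else:
            # Plain file names are kept as they are.
            resolved.append(source)
    return resolved
```

After this step the source list contains only strings again, so an unmodified build_ext can take over.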
participants (4)
- Jack Jansen
- Lars Immisch
- M.-A. Lemburg
- Robert Kern