[Distutils] Compiler abstraction model
Greg Ward
gward@cnri.reston.va.us
Mon, 29 Mar 1999 21:35:13 -0500
Hi all --
I've finally done some thinking and scribbling on how to build
extensions -- well, C/C++ extensions for CPython. Java extensions for
JPython will have to wait, but they are definitely looming on the
horizon as something Distutils will have to handle.
Anyways, here are the conclusions I've arrived at.
* Stick with C/C++ for now; don't worry about other languages (yet).
That way we can be smart about C/C++ things like preprocessor
tokens and macros, include directories, shared vs static
libraries, source and object files, etc.
* At the highest level, we should just be able to say "I know nothing,
just give me a compiler object". This implies a factory function
returning instances of concrete classes derived from an abstract
CCompiler class. These compiler objects must know how to:
- compile .c -> .o (or local equivalent)
- compile multiple .c's to matching .o's
- be able to define/undefine preprocessor macros/tokens
- be able to supply preprocessor search directories
- link multiple .o's to static library (libfoo.a, or local equiv.)
- link multiple .o's to shared library (libfoo.so, or local equiv.)
- link multiple .o's to shared object (foo.so, or local equiv.)
- for all link steps:
+ be able to supply explicit libraries (/foo/bar/libbaz.a)
+ be able to supply implicit libraries (-lbaz)
+ be able to supply search directories for implicit libraries
- do all this with timestamp-based dependency analysis
(non-trivial because it requires analyzing header dependencies!)
Linking to static/shared libraries and dependency analysis are
optional for now; everything else is required to build C/C++
extensions for Python. (At least that's my impression!)
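To make the timestamp idea concrete, here is a minimal sketch of the simplest possible check. The function name 'needs_rebuild' is hypothetical, and the sketch deliberately ignores header dependencies, which is the non-trivial part mentioned above.

```python
import os


def needs_rebuild(target, sources):
    """Return True if 'target' is missing or older than any of 'sources'.

    Hypothetical helper: it compares file modification times only and
    does not look inside the sources for #include'd headers.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

A fuller version would add each source file's headers to 'sources' before comparing.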
"Local equivalent" is meant to encompass different filenames for C++
(eg. .C -> .o) and different operating systems/compilers (eg. .c ->
.obj, multiple .obj's to foo.dll or foo.lib)
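The "just give me a compiler object" idea might look roughly like the sketch below. Everything except the CCompiler name is an assumption made for illustration: the factory name 'make_compiler', the subclass names, the 'obj_extension' attribute, and the use of 'os.name' to pick a class. The 'compile()' methods only compute the object filename; they do not run a compiler.

```python
import os
from abc import ABC, abstractmethod


class CCompiler(ABC):
    """Abstract base class; subclasses supply the platform specifics."""

    obj_extension = None  # the "local equivalent" of .o

    @abstractmethod
    def compile(self, source):
        """Compile one source file and return the object filename."""

    def object_filename(self, source):
        # foo.c -> foo.o (or foo.obj), whatever the source extension is
        base, _ext = os.path.splitext(source)
        return base + self.obj_extension


class UnixCCompiler(CCompiler):  # hypothetical concrete class
    obj_extension = ".o"

    def compile(self, source):
        # A real implementation would invoke the compiler here.
        return self.object_filename(source)


class WindowsCCompiler(CCompiler):  # hypothetical concrete class
    obj_extension = ".obj"

    def compile(self, source):
        return self.object_filename(source)


_COMPILER_CLASSES = {"posix": UnixCCompiler, "nt": WindowsCCompiler}


def make_compiler(platform=None):
    """Hypothetical factory: return a compiler object for 'platform'."""
    if platform is None:
        platform = os.name
    try:
        return _COMPILER_CLASSES[platform]()
    except KeyError:
        raise ValueError("no compiler class for platform %r" % platform)
```

The caller never names a concrete class: 'make_compiler().compile("foo.c")' gives back 'foo.o' on Unix and 'foo.obj' under the Windows class.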
BIG QUESTION: I know this will work on Unix, and from my distant
recollections of past work on other systems, it should work on MS-DOS
and VMS too. I gather that Windows is pretty derivative of MS-DOS, so
will this model work for Windows compilers too? Do we have to worry
about Windows compilers other than VC++? But I have *no clue* about
Macintosh compilers -- presumably somebody "out there" (not necessarily
on this SIG, but I hope so!) knows how to compile Python on the Mac, so
hopefully it's possible to compile Python extensions on the Mac. But
will this compiler abstraction model work there?
Brushing that moment of self-doubt aside, here's a proposed interface
for CCompiler and derived classes.
define_macro (name [, value])
define a preprocessor macro or token; this will affect all
invocations of the 'compile()' method
undefine_macro (name)
undefine a preprocessor macro or token
add_include_dir (dir)
add 'dir' to the list of directories that will be searched by
the preprocessor for header files
set_include_dirs ([dirs])
reset the list of preprocessor search directories; 'dirs' should
be a list or tuple of directory names; if not supplied, the list
is cleared
compile (source, define=macro_list, undef=names, include_dirs=dirs)
compile source file(s). 'source' may be a sequence of source
filenames, all of which will be compiled, or a single filename to
compile. The optional 'define', 'undef', and 'include_dirs'
named parameters all augment the lists set up by the above four
methods. 'macro_list' is a list of either 2-tuples
(macro_name, value) or bare macro names. 'names' is a list of
macro names, and 'dirs' a list of directories.
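As a rough sketch of how the per-call parameters could augment the per-object settings, the class below builds a Unix-style command line instead of running anything. The class name 'SketchCompiler', the method 'compile_command()', the helper 'gen_preprocess_flags()', the 'cc -c' command, and the -D/-U/-I flag spellings are all assumptions for illustration. It also assumes that 'undefine_macro()' emits a -U flag rather than cancelling an earlier definition, which the proposal leaves open.

```python
def gen_preprocess_flags(define=(), undef=(), include_dirs=()):
    """Turn macro and directory lists into Unix-style compiler flags."""
    flags = []
    for macro in define:
        if isinstance(macro, tuple):      # (macro_name, value)
            name, value = macro
            flags.append("-D%s=%s" % (name, value))
        else:                             # bare macro name
            flags.append("-D%s" % macro)
    flags.extend("-U%s" % name for name in undef)
    flags.extend("-I%s" % d for d in include_dirs)
    return flags


class SketchCompiler:
    """Hypothetical holder for the preprocessor state described above."""

    def __init__(self):
        self.macros = []
        self.undefs = []
        self.include_dirs = []

    def define_macro(self, name, value=None):
        self.macros.append(name if value is None else (name, value))

    def undefine_macro(self, name):
        self.undefs.append(name)

    def add_include_dir(self, dir):
        self.include_dirs.append(dir)

    def compile_command(self, source, define=(), undef=(), include_dirs=()):
        # Per-call values are appended to, not substituted for, the
        # values stored on the object.
        flags = gen_preprocess_flags(
            self.macros + list(define),
            self.undefs + list(undef),
            self.include_dirs + list(include_dirs),
        )
        return ["cc", "-c"] + flags + [source]
```

Returning the argument list rather than executing it keeps the flag-generation logic easy to inspect on any platform.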
add_lib (libname)
add a library name to the list of implicit libraries ("-lfoo")
to link with
set_libs ([libnames])
reset the list of implicit libraries (or clear if 'libnames'
not supplied)
add_lib_dir (dir)
add a directory to the list of library search directories
("-L/foo/bar/baz") used when we link
set_lib_dirs ([dirs])
reset (or clear) the list of library search directories
link_shared_object (objects, shared_object,
libs=libnames, lib_dirs=dirs)
link a set of object files together to create a shared object file.
The optional 'libs' and 'lib_dirs' parameters only augment the
lists set up by the previous four methods.
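The link side can be sketched the same way. The class name 'LinkSettings', the method 'shared_object_command()', the 'cc -shared' command line, and the '.so' suffix are assumptions that fit a typical Unix compiler driver; other platforms would need a different concrete class.

```python
class LinkSettings:
    """Hypothetical holder for the link-related state described above."""

    def __init__(self):
        self.libs = []
        self.lib_dirs = []

    def add_lib(self, libname):
        self.libs.append(libname)

    def set_libs(self, libnames=None):
        # Reset the list, or clear it if 'libnames' is not supplied.
        self.libs = list(libnames or [])

    def add_lib_dir(self, dir):
        self.lib_dirs.append(dir)

    def set_lib_dirs(self, dirs=None):
        self.lib_dirs = list(dirs or [])

    def shared_object_command(self, objects, shared_object,
                              libs=(), lib_dirs=()):
        # Per-call 'libs' and 'lib_dirs' augment the stored lists.
        all_dirs = self.lib_dirs + list(lib_dirs)
        all_libs = self.libs + list(libs)
        return (["cc", "-shared"] + list(objects)
                + ["-L" + d for d in all_dirs]
                + ["-l" + name for name in all_libs]
                + ["-o", shared_object + ".so"])
```

Explicit libraries such as /foo/bar/libbaz.a are not handled here.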
Things to think about: should there be explicit support for "explicit
libraries" (eg. where you put "/foo/bar/libbaz.a" on the command line
instead of trusting "-lbaz" to figure it out)? I don't think we can
expect the caller to put them in the 'objects' list, because the
filenames are too system-dependent. My inclination, as you could
probably guess, would be to add methods 'add_explicit_lib()' and
'set_explicit_libs()', and a named parameter 'explicit_libs' to
'link_shared_object()'.
Also, there would have to be methods to support creating static and
shared libraries: I would call them 'link_static_lib()' and
'link_shared_lib()'. They would have the same interface as
'link_shared_object()', except the output filename would of course have
to be handled differently. (To illustrate: on Unix-y systems,
passing shared_object='foo' to 'link_shared_object()' would result in an
output file 'foo.so'. But passing output_lib='foo' to
'link_shared_lib()' would result in 'libfoo.so', and passing it to
'link_static_lib()' would result in 'libfoo.a'.)
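The filename handling in that illustration can be written down as a small table of prefixes and suffixes. The names 'UNIX_CONVENTIONS' and 'output_filename()' are hypothetical; only the three Unix results come from the example above, and a table for another platform would have to be filled in by someone who knows its conventions.

```python
# kind of output -> (prefix, suffix), per the Unix-y example above
UNIX_CONVENTIONS = {
    "shared_object": ("", ".so"),
    "shared_lib": ("lib", ".so"),
    "static_lib": ("lib", ".a"),
}


def output_filename(name, kind, conventions=UNIX_CONVENTIONS):
    """Hypothetical helper: build the output filename for 'name'."""
    prefix, suffix = conventions[kind]
    return prefix + name + suffix
```

Each concrete compiler class could carry its own table, so callers only ever pass the bare name 'foo'.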
So, to all the Windows and Mac experts out there: will this cover it?
Can the variations in filename conventions and compilation/link schemes
all be shoved under this umbrella? Or is it back to the drawing board?
Thanks for your comments!
Greg
--
Greg Ward - software developer gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990 x287
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913