[PYTHON C++-SIG] LLNL Python/C++ integration: current status

Geoffrey Furnish furnish at laura.llnl.gov
Thu Feb 13 00:36:55 CET 1997


Greetings to all again.  This message is being reposted because a
configuration problem with the list was uncovered.  Please excuse the
duplicate.

This message is to report on the current status of the work being done
on Python/C++ integration in my research group at LLNL.  There are
surely others working in this general area, and I hope you will all
post a summary note of your current situation so we can all take
stock, and benefit from having our perspectives widened.

To get things rolling, I will provide some comment directly on the
points listed in the SIG charter, and then probably ramble some more
after that as well.

 > -------------------------------------------------------------------
 > C++-SIG, SIG for Development of a C++ Binding to Python
 > 
 > This list exists in order to discuss the design and development of a
 > C++ binding for the compiled interface to Python, and associated
 > classes and tools for the construction of Python extension modules in
 > C++.
 > 
 > Major issues to be discussed/resolved, include, but are not limited
 > to:
 > 
 > 1) Autoconf support for enabling C++ support in Python.  This must be
 >    managed in a way which does not change the C API, or the behavior
 >    of Python if not configured for C++ support, in any (observable?)
 >    way.  In other words, conventional C extension modules must
 >    continue to work, whether Python is configured with C++ support or
 >    not.

One of the first things which has to be faced is how to build a Python
which can support C++.  On most systems, this means in particular,
that "main" must be compiled with the C++ compiler.  Moreover, lacking
platform ABI's for C++, python must be built with a particular C++
compiler--the one you will use for your extension work.  You can't mix
and match .o's from C++ compilers the way you (often) can with .o's
from C compilers.

To cope with this, I modified our configure to support a new option,
		--with-cxx[=compiler_name]
If you just say, for instance,
	./configure --with-cxx
it will find whatever C++ compiler the autoconf macro comes up with,
which may or may not be the one you want.  If it is not the one you
want, then add the "=<compiler>" part.  For instance, if you are using
the Kuck and Associates C++ compiler, you would say:
	./configure --with-cxx=KCC

This will then produce a makefile setup which will compile Python and
link it in such a way that the main is compiled with C++, and the link
is handled by C++.

It is important to understand that the link must be done by C++, since
not only do we need to get global constructors run (the usual and
fairly well appreciated excuse, for which there exist some
platform-specific workarounds), but also because we will need to let
the compiler do template closure, for which there is really no
substitute for just letting the compiler do the job.
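
As a small illustration of the global constructor point, here is a
sketch of a namespace-scope object whose constructor has to run before
main() does.  The names (registered_modules, ModuleRegistrar) are made
up for this example and are not part of Python or of our patch set.

```cpp
#include <string>
#include <vector>

// Hypothetical registry.  The function-local static sidesteps ordering
// problems between globals in different translation units.
std::vector<std::string>& registered_modules()
{
    static std::vector<std::string> names;
    return names;
}

// Hypothetical helper whose constructor records a module name.
struct ModuleRegistrar
{
    explicit ModuleRegistrar( const std::string& name )
    {
        registered_modules().push_back( name );
    }
};

// A global object.  Its constructor is run by the C++ runtime startup
// code, which is normally arranged for when the C++ compiler drives
// the final link.
static ModuleRegistrar cxxtst_registrar( "cxxtst" );
```

By the time main() starts, registered_modules() should already contain
"cxxtst", provided the startup code for global constructors was linked
in.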

While I was at it, I added a --with-debug flag to configure.  I
observed that there was a way to get approximately this effect using
variable specification at make time under Guido's original autoconf
stuff, but it was not really adequate to handle the case of an
executable which is built with multiple compilers, since they may not
all take the same flags to turn on debugging.  For example, most
compilers take -g, but KCC takes "+K0".  If you configure
--with-debug, then you get debugging support in both C and C++
compilations.  It may be necessary to tune this autoconf stuff for
popular compilers.

Anyway, the result of this is that autoconf now allows building a
python, either with or without C++ support.  Without it, you get
exactly the python you would've gotten in the first place.  With it,
you get a better python, one that can support C++ extension modules.

I should add at this point that in our early efforts, we also modified
portions of the build to support the compilation of additional parts
of the python core with C++ because of the desire for 5) below.
However, we backed that out, and are using an extension module for
those capabilities instead.  More on that later.

 > 2) Type safety in the Python API.  

Not a tremendous amount of work in this direction yet.

 > 3) Introducing C++ classes for the various major Python abstractions,
 >    such as dictionaries, lists, tuples, modules, etc.

PyXXX_YYY(...) ??? How about instead, 
	XXX v;
	v->YYY(...);

Essentially we are wrapping the Python API with a set of objects which
will have methods which correspond to the various portions of the C
API.  This will generally eliminate the first PyObject * argument to
most of these C API functions, since they can be identified
immediately as the "this" object, and the API call is most naturally
considered a method invocation on that object.  Our goal is to make
Python really look this way to the C++ programmer.  Planned classes at
this time include:

PyDict
PyList
PyTuple

and possibly also PyArgs, although this is still under discussion
since arglists are always tuples now.  I am personally inclined to
keep PyArgs as a separate class (perhaps derived from PyTuple, not
sure), just for the sake of specificity in callback function
signatures. 

We have not sat down and specifically carved up the entire Python C
API function universe, but we expect that the above classes will get
us well down the road.  We would welcome suggestions about what else
is needed.
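
To make the "first argument becomes this" idea concrete, here is a
minimal sketch.  It does not use the real Python headers: CList and
the CList_* functions are stand-ins for a C-style API, and the List
wrapper is a hypothetical shape, not the actual PyList class.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a C-style API in which every call takes the object
// as its first argument.
struct CList { std::vector<int> items; };

inline void CList_Append( CList *l, int v ) { l->items.push_back( v ); }
inline std::size_t CList_Size( const CList *l ) { return l->items.size(); }
inline int CList_GetItem( const CList *l, std::size_t i ) { return l->items[i]; }

// Hypothetical wrapper: each method forwards to the C-style function,
// supplying the wrapped object where the first argument used to be.
class List
{
  public:
    void append( int v ) { CList_Append( &impl_, v ); }
    std::size_t size() const { return CList_Size( &impl_ ); }
    int get_item( std::size_t i ) const { return CList_GetItem( &impl_, i ); }

    // Only here to mirror the "v->YYY(...)" spelling used in the text;
    // plain "v.append(...)" works just as well.
    List *operator->() { return this; }

  private:
    CList impl_;
};
```

With this, CList_Append( &l, 3 ) becomes v->append( 3 ), and the
caller never handles the underlying object pointer.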

 > 4) Providing semantics for these classes which are natural both to the
 >    Python programmer, as well as the C++ programmer.  For example, the
 >    PyDict class should have an operator[], and should also support STL
 >    style iterators.  Many issues of this form.

This is just starting, nothing quite report-worthy yet.
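
Purely to illustrate what the charter item is asking for, and not as a
statement of what PyDict will look like, here is a sketch of a
dictionary-like class with operator[] and STL-style iterators.  The
Dict class is hypothetical and is backed by std::map rather than by a
Python dictionary.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a dictionary wrapper.
class Dict
{
  public:
    typedef std::map<std::string, int>::const_iterator const_iterator;

    // Python-style d["key"] access; creates the entry if it is missing.
    int& operator[]( const std::string& key ) { return items_[key]; }

    // STL-style iteration over (key, value) pairs.
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }

    std::size_t size() const { return items_.size(); }

  private:
    std::map<std::string, int> items_;
};

// Example client written only against the iterator interface.
int sum_values( const Dict& d )
{
    int total = 0;
    for( Dict::const_iterator it = d.begin(); it != d.end(); ++it )
        total += it->second;
    return total;
}
```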

 > 5) Method invocation.  How to make the invocation of C++ member
 >    functions as natural and/or painless as possible.

Our use of Python involves providing a script interface to object
oriented class libraries written in C++.  Integration/wrapper
technologies which are geared toward wrapping APIs as Python modules
are not overly useful to us, since for the most part we do not even
have global functions in our compiled subsystems.  We have
classes/objects.  Our Python extension modules, coded in C++, need to
be able to reflect this structure.  Building classes which wrap the
real ones, but have Python-esque signatures, seems to be one
productive way to proceed.  However, Python can only register
callbacks for global C functions.  So then you wind up writing C style
global functions which call C++ member functions which call the C++
services in the ultimate compiled assets.  This is clumsy and we would
like to avoid the extra layer if possible.

To accomplish this, we have built a C++ extension module which
provides the capability to invoke C++ member functions directly from
Python.  This is accomplished by mimicking the "function object"
support in Python, with a new "member function object" type, which has
the internal data necessary to allow a C++ method invocation.  The
Py_FindMethod() function has been succeeded by a new getattr assister
named FindMemberFunc().  This function builds a C++ member function
object corresponding to the requested member function for a C++ python
extension class.  When Python calls this member function object, a
trampoline function is called which extracts the necessary data from
the member function object, which allows it to invoke the C++ member
function on the object.

This eliminates the need for the self parameter in the call back,
since it is replaced by C++'s implicit "this".  So, you derive your
shadow object from PyObject, and this allows us to abolish the self
parameter.  The args all come in as a PyArgs variable, which you can
pick apart with the usual ParseTuple business.
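
The mechanism can be sketched without any Python headers.  In the
sketch below, CallArgs, Counter, BoundMember and call_bound are all
hypothetical stand-ins (for PyArgs, an extension class, the member
function object and the trampoline respectively); the return type is
int only to keep the example short.

```cpp
// Stand-in for the argument list handed to a callback.
struct CallArgs { int value; };

// Stand-in for an extension class.  Note the callback signature has
// no explicit self parameter.
struct Counter
{
    int total;
    Counter() : total( 0 ) {}
    int add( CallArgs *args ) { total += args->value; return total; }
};

// Hypothetical "member function object": the target object plus a
// pointer to the member function to invoke on it.
template<class T>
struct BoundMember
{
    T *self;
    int (T::*method)( CallArgs * );
};

// Hypothetical trampoline: an ordinary function that the caller can
// register, which unpacks the member function object and performs the
// C++ method invocation.
template<class T>
int call_bound( BoundMember<T> *bound, CallArgs *args )
{
    return (bound->self->*(bound->method))( args );
}
```

The caller only ever sees call_bound and an opaque BoundMember; the
"self" of a C-style callback has turned into the implicit "this" of
Counter::add.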

 > 6) Error handling.  How to eradicate NULL checking in favor of C++
 >    style exceptions.

Finally, our C++ wrapper over the Python C API allows the abolition of
all NULL checking.  Instead, Python exceptions raised by Python C API
functions are converted directly into C++ exceptions, which begin
bouncing up the call stack until they are handled by a matching catch
clause.  If they are not caught at all by the C++ python extension
module, then they are trapped by the trampoline described in 5) which
called the member function in the first place.  At this point they are
returned to the python core just as they would've been if you'd done
all that hideous NULL checking in the first place.  Complete with
python traceback. 

The result of this is that you can now write very straightforward
python extension code in C++.  If you /want/ to do speculative calls,
in which you are prepared to test for and cope with the results of
exceptions in the Python C API, then you merely wrap such code in a
try/catch clause.  If you are /not/ willing to do such checking (the
usual case), then you can simply proceed along your merry way, writing
code, calling the Python C API (which you will probably do through the
wrapper objects), ignoring return codes, confident that any real
errors which result will propagate immediately back to the call site
and be returned as the familiar stack trace, just as they should.  We
feel this makes for much more lucid/robust coding.
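
Here is a minimal sketch of the pattern, again with stand-ins rather
than the real Python C API.  The names c_api_lookup, pending_error,
ApiError, lookup, extension_body and call_extension are all invented
for this example.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the interpreter's "an error has been set" state.
static std::string pending_error;

// Stand-in for a C API call which reports failure by recording an
// error and returning NULL.
const char *c_api_lookup( const char *key )
{
    if( std::string( key ) == "known" )
        return "value";
    pending_error = std::string( "KeyError: " ) + key;
    return 0;
}

// Hypothetical exception meaning "the error is already recorded".
struct ApiError : std::runtime_error
{
    ApiError() : std::runtime_error( "error already set" ) {}
};

// Wrapper layer: the one place where NULL is checked.
const char *lookup( const char *key )
{
    const char *r = c_api_lookup( key );
    if( !r ) throw ApiError();
    return r;
}

// Extension code: no NULL checks, no return codes.
std::string extension_body( const char *key )
{
    return std::string( lookup( key ) ) + "!";
}

// Trampoline-like boundary: converts the exception back into the
// C-style failure indication expected by the caller.
bool call_extension( const char *key, std::string& out )
{
    try {
        out = extension_body( key );
        return true;
    }
    catch( const ApiError& ) {
        return false;   // error state was already recorded
    }
}
```

Code that does want to handle a failure speculatively would put its
own try/catch around the lookup() call instead of letting the
exception reach the boundary.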

Note that the trampoline not only catches the exceptions thrown by the
Python C API wrappers, but it will also catch /any/ C++ exception
which may percolate out of a C++ python extension module.  Thus, any
failure (if registered as a C++ exception) in C++ extension code, will
be converted to a python exception with accompanying stack trace.
This represents a dramatic improvement in the state of affairs of
Python steering of computational engines, from our perspective.

 > -------------------------------------------------------------------


To see how some of this fits together, here is the cxxtstmodule.cc
file which we are using to exercise these features.

//----------------------------------*-C++-*----------------------------------//
// cxxtstmodule.cc
//---------------------------------------------------------------------------//

#include <iostream.h>

#include "PythonX.hh"

using namespace Py;

static char cxxtst_alive__doc__[] =
"alive(): Verify that cxxtst is alive.";

static PyObject *
cxxtst_alive( PyObject *self, PyArgs *args )
{
    args->ParseTuple();

    cout << "Yes, the C++ extension binding test module is alive!\n" << flush;

    Py_INCREF(Py_None);
    return Py_None;
}

static char cxxtst_f1__doc__[] =
"f1(): Function taking one arg.";

static PyObject *
cxxtst_f1( PyObject *self, PyArgs *args )
{
    int i;
    args->ParseTuple( i );

    cout << "Yes, the user passed a single int argument: " << i << endl;

    Py_INCREF(Py_None);
    return Py_None;
}

static char cxxtst_toss_cookies__doc__[] =
"toss_cookies(): Function throwing a C++ exception.";

static PyObject *
cxxtst_toss_cookies( PyObject *self, PyArgs *args )
{
    cout << "toss_cookies is heaving.\n" << flush;

    throw "Oops, better clean up the mess!";

    Py_INCREF(Py_None);
    return Py_None;
}

//---------------------------------------------------------------------------//
// Now stuff for a type Foo.

class Foo : public PyObject
{
  public:
    Foo();
    PyObject *munge( PyArgs *args );
    PyObject *crash( PyArgs *args );
};

static void Foo_dealloc( Foo *pf );
static PyObject *Foo_getattr( Foo *pf, char *name );

statichere PyTypeObject Footype = {
    PyObject_HEAD_INIT(&PyType_Type)
    0,
    "Foo",
    sizeof(Foo),
    0,
    (destructor) Foo_dealloc,        /*tp_dealloc*/
    0,                               /*tp_print*/
    (getattrfunc) Foo_getattr,       /*tp_getattr*/
    0,                               /*tp_setattr*/
    0,                               /*tp_compare*/
    0,                               /*tp_repr*/
    0,                               /*tp_as_number*/
    0,                               /*tp_as_sequence*/
    0,                               /*tp_as_mapping*/
    0,                               /*tp_hash*/
};

#define is_Fooobject(op) ((op)->ob_type == &Footype)

Foo::Foo()
{
    ob_type = &Footype;
}

PyObject *Foo::munge( PyArgs *args )
{
    cout << "In Foo::munge\n" << flush;

    Py_INCREF( Py_None );
    return Py_None;
}

PyObject *Foo::crash( PyArgs *args )
{
    cout << "In Foo::crash\n" << flush;

    throw "flaming oblivion";

    Py_INCREF( Py_None );
    return Py_None;
}

X_MethodDef<Foo> XFoo_methods[] = {
    { "munge", &Foo::munge, 1, "do some work" },
    { "crash", &Foo::crash, 1, "die horribly" },
    { NULL, NULL, NULL, NULL }
};

//---------------------------------------------------------------------------//
// "Methods" of the C extension type.

static
void Foo_dealloc( Foo *pf )
{
    delete pf;
}

/* THE STANDARD GETATTR FOR NAMED METHODS */

static
PyObject *Foo_getattr( Foo *pf, char *name )
{
    return FindMemberFunc( XFoo_methods, pf, name );
}

PyObject *new_Foo( PyObject *self, PyArgs *args )
{
    return new Foo;
}

Instantiate_helpers(Foo)

//---------------------------------------------------------------------------//

static char cxxtst_module_documentation[] = 
"Python extension module for testing the C++ bindings for extensions.";

// List of methods defined in the module

static struct PyMethodDef cxxtst_methods[] = {

    {"alive", (PyCFunction) cxxtst_alive, 1, cxxtst_alive__doc__},
    {"f1",    (PyCFunction) cxxtst_f1,    1, cxxtst_f1__doc__},
    {"toss_cookies",    (PyCFunction) cxxtst_toss_cookies,    1, cxxtst_toss_cookies__doc__},

    {"new_Foo",    (PyCFunction) new_Foo,    1, },

    {NULL,		NULL, NULL, NULL}		/* sentinel */
};

extern "C"
void initcxxtst()
{
// Create the module and add the functions.

    Module *m = XInitmodule4( "cxxtst", cxxtst_methods,
			      cxxtst_module_documentation,
			      NULL, PYTHON_API_VERSION );

// Check for errors

    if (PyErr_Occurred())
	Py_FatalError("can't initialize module cxxtst");
}

//---------------------------------------------------------------------------//
//                              end of cxxtstmodule.cc
//---------------------------------------------------------------------------//

Here is what I get when I try various possible operations with it:

mousey[3] ./python 
>>> import cxxtst
>>> cxxtst.alive()
Yes, the C++ extension binding test module is alive!
>>> cxxtst.alive(-1)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: function requires exactly 0 arguments; 1 given
>>> f = cxxtst.new_Foo()
>>> f.munge()
In Foo::munge
>>> f.crash()
In Foo::crash
Traceback (innermost last):
  File "<stdin>", line 1, in ?
RuntimeError: a C++ exception has occurred: flaming oblivion
>>> 

To understand how this works, you need to see the invocation
trampolines.  The one for normal C++ global functions is:

static
PyObject *cxx_trampoline_va( cxxmethodobject *self, PyObject *args )
{
    try {
	return (*self->m_ml->ml_meth)( self->m_self, args );
    }
    catch( PyException& pyx ) {
    // This happens when the Python C API has already set an exception
    // condition, which is indicated by a return value of NULL.
	return NULL;
    }
    catch( const char *msg ) {
	char buf[ 1024 ];
	sprintf( buf, "a C++ exception has occurred: %s", msg );
	PyErr_SetString( PyExc_RuntimeError, buf );
	return NULL;
    }
    catch(...) {
	PyErr_SetString( PyExc_RuntimeError,
			 "an unrecognized C++ exception has occurred." );
	return NULL;
    }
}

And the one for C++ member functions is:

namespace Py {
PyObject *cxx_trampoline_mbrfnc_va( cxxmemberfuncobject *self, PyObject *args )
{
    PyObject *s = self->m_self;
    PyArgs *arglist = static_cast<PyArgs *>( args );
    try {
	return (s->*(self->m_ml->ml_meth))( arglist );
    }
    catch( PyException& pyx ) {
    // This happens when the Python C API has already set an exception
    // condition, which is indicated by a return value of NULL.
	return NULL;
    }
    catch( const char *msg ) {
	char buf[ 1024 ];
	sprintf( buf, "a C++ exception has occurred: %s", msg );
	PyErr_SetString( PyExc_RuntimeError, buf );
	return NULL;
    }
    catch(...) {
	PyErr_SetString( PyExc_RuntimeError,
			 "an unrecognized C++ exception has occurred." );
	return NULL;
    }
}
} // namespace Py

You can compare these to the functions in the Python core to see the
correspondence.  Basically we just use a try/catch wrapper clause, and
provide special handling for method invocation using a method pointer.

Another thing worth calling attention to is the code:

X_MethodDef<Foo> XFoo_methods[] = {
    { "munge", &Foo::munge, 1, "do some work" },
    { "crash", &Foo::crash, 1, "die horribly" },
    { NULL, NULL, NULL, NULL }
};

static
PyObject *Foo_getattr( Foo *pf, char *name )
{
    return FindMemberFunc( XFoo_methods, pf, name );
}

This is the replacement for Py_FindMethod().  From this you can see
that FindMemberFunc is templated, so that it is possible to construct
the method search engine for each type of Python extension object,
which avoids ugly casts.  
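
To show why templating the lookup avoids casts, here is a stripped
down sketch.  MemberEntry, find_member, ArgList and Widget are
hypothetical names for this example only; the real X_MethodDef and
FindMemberFunc carry more information (flags, doc strings) and return
a Python object rather than a table entry.

```cpp
#include <cstring>

// Stand-in for the argument list type.
struct ArgList {};

// Hypothetical table entry, typed on the extension class T, so the
// member function pointer keeps its real type.
template<class T>
struct MemberEntry
{
    const char *name;
    int (T::*method)( ArgList * );
};

// Hypothetical lookup.  T is deduced from the table, so the caller
// never casts the member function pointer.  Returns null if no entry
// matches; the table must end with a { 0, 0 } sentinel.
template<class T>
const MemberEntry<T> *find_member( const MemberEntry<T> *table,
                                   const char *name )
{
    for( const MemberEntry<T> *e = table; e->name != 0; ++e )
        if( std::strcmp( e->name, name ) == 0 )
            return e;
    return 0;
}

// Example extension class and its method table.
struct Widget
{
    int munge( ArgList * ) { return 1; }
    int crash( ArgList * ) { return 2; }
};

static const MemberEntry<Widget> widget_methods[] = {
    { "munge", &Widget::munge },
    { "crash", &Widget::crash },
    { 0, 0 }
};
```

A table of MemberEntry<Widget> can only hold Widget member functions,
so a mismatched entry is a compile-time error rather than a bad cast
discovered at run time.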

Final comments.

By this point, people should be getting a pretty clear picture that
these extensions exercise the C++ language at a level very close to
the draft standard.  This is without apology.  We are coding under the
assumption that all of these are true:
	compiler has "real" template support
	compiler has member templates
	compiler has exceptions
	compiler has rtti
	compiler has namespaces

This is the C++ language as we know it, and we are not going to
compromise on the quality of the interface on account of compiler
vendors who don't support the language as the C++ committee defines
it.  This may well
mean that certain C++ compilers in the market place will not compile
this code.  If you have such a compiler, you will need to wait until
your compiler is upgraded to a level conformant with the language
spec, before you will be able to use this.

Availability of the LLNL C++ patch set:
The code described in this note will be available to others.  We have
not made clear release plans/schedules yet.  It would be useful to
get a reading on how much interest there is in this.  Certainly it is
an ongoing development effort, with all the cautionary statements
about instability that go along with that.  Nevertheless, we are
highly motivated to get it into a stable form for our own purposes,
and are prepared to cut occasional patch sets for external use.

Cheers to all.  Let's hear what others are doing.

-- 
Geoffrey Furnish		email: furnish at llnl.gov
LLNL X/ICF			phone: 510-424-4227	fax: 510-423-0925
