Barking up the wrong tree with extension types and modules?

Robert Kern robert.kern at
Thu Mar 23 21:43:15 CET 2006

Steve Juranich wrote:
> We have a large suite of "legacy" tools (based on stand-alone executables
> and shell scripts) that we would like to integrate into a suite of Python
> modules.  I'm having a little trouble getting my head around the best way
> to implement the whole thing, though.
> Here's (roughly) the situation:
> We have TypeA which is used *everywhere* in the tools.  I have completed a
> wrapper for TypeA and can now create Python versions of the TypeA object. 
> I'd now like to write wrappers for ToolSet1 and ToolSet2 which make
> extensive use of TypeA.
> I'd also like the resulting C code to be maintainable.  I know that I can
> write bighonkingmodule.c that incorporates the wrappers for TypeA,
> ToolSet1, and ToolSet2, but this could easily lead to a maintenance
> nightmare.  I would like to have TypeA_wrap.c, ToolSet1_wrap.c and
> ToolSet2_wrap.c that would generate 3 separate modules:
>     TypeA    : The compatibility layer for the TypeA object.
>     ToolSet1 : Compatibility layer for tools in ToolSet1
>                (This ingests and generates TypeA objects.)
>     ToolSet2 : Compatibility layer for tools in ToolSet2
>                (This also ingests and generates TypeA objects.)
> I also know that I could put a definition for TypeA in both ToolSet1 and
> ToolSet2, but then (as far as I understand), outputs from ToolSet1 wouldn't
> be readily available for processing in ToolSet2 because each module would
> have its own idea about what a TypeA actually is.

Correct. Type comparisons are usually done by pointer comparisons.
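To see why duplicating the type definition breaks interoperability, here is a minimal sketch. The struct and names are invented for illustration, not CPython's actual definitions, but the check is the moral equivalent of `Py_TYPE(obj) == &TypeA_Type`: two separately compiled copies of the same type live at different addresses, so they never compare equal even though their contents are identical.

```c
#include <assert.h>

/* Illustrative stand-in for PyTypeObject (not CPython's real struct). */
typedef struct {
    const char *tp_name;
} FakeTypeObject;

/* If ToolSet1 and ToolSet2 each compiled their own TypeA definition,
 * each module would get its own static copy at its own address. */
static FakeTypeObject TypeA_in_ToolSet1 = { "TypeA" };
static FakeTypeObject TypeA_in_ToolSet2 = { "TypeA" };

/* Type checks compare pointers, not contents. */
static int same_type(const FakeTypeObject *a, const FakeTypeObject *b)
{
    return a == b;
}
```

So an object built by ToolSet1 would fail ToolSet2's type check, even though both modules agree on what a TypeA looks like; the type must be defined in exactly one place.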

> Basically, the type of package structure I want is something like:
> OurTools/
>   ...
> Has anybody tackled anything like this before?  Are there any examples that
> I could look at?

Yes! numpy does something like this and, through its predecessor Numeric, has
been doing this for the past ten years or so.

The main array type and its API are defined in a single module,
numpy.core.multiarray. All of the relevant function pointers are collected in a
void** array, which is wrapped in a PyCObject and assigned to
numpy.core.multiarray._ARRAY_API. Extension modules using the numpy C API call
the function import_array() before using any of those functions. I've excerpted
the appropriate segments from the header file below:

static void **PyArray_API = NULL;

#define PyArray_GetNDArrayCVersion (*(unsigned int (*)(void)) PyArray_API[0])
#define PyBigArray_Type (*(PyTypeObject *)PyArray_API[1])

static int
_import_array(void)
{
  PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray");
  PyObject *c_api = NULL;
  if (numpy == NULL) return -1;
  c_api = PyObject_GetAttrString(numpy, "_ARRAY_API");
  if (c_api == NULL) {Py_DECREF(numpy); return -1;}
  if (PyCObject_Check(c_api)) {
      PyArray_API = (void **)PyCObject_AsVoidPtr(c_api);
  }
  Py_DECREF(c_api);
  Py_DECREF(numpy);
  if (PyArray_API == NULL) return -1;
  /* Perform runtime check of C API version */
  if (NDARRAY_VERSION != PyArray_GetNDArrayCVersion()) {
    PyErr_Format(PyExc_RuntimeError, "module compiled against "\
        "version %x of C-API but this version of numpy is %x", \
        (int) NDARRAY_VERSION, (int) PyArray_GetNDArrayCVersion());
    return -1;
  }
  return 0;
}

There are a bunch of #ifdef statements that control these blocks, so I recommend
looking at the numpy source itself. Unfortunately, we use a lot of code
generation to manage all of this, so you will have to build numpy once to
generate the code. These excerpts came from
build/src/numpy/core/__multiarray_api.h .

If you have any more questions about this approach, we would be happy to answer
them on numpy-discussion at .

Robert Kern
robert.kern at

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
