[Numpy-discussion] Exported symbols and code reorganization.

Charles R Harris charlesr.harris at gmail.com
Wed Jan 10 18:24:38 EST 2007


On 1/10/07, David M. Cooke <cookedm at physics.mcmaster.ca> wrote:
>
> On Jan 10, 2007, at 13:52 , Charles R Harris wrote:
> > On 1/10/07, David M. Cooke <cookedm at physics.mcmaster.ca> wrote:
> > On Jan 7, 2007, at 00:16 , Charles R Harris wrote:
> > >
> > > That brings up the main question I have about how to break up the C
> > > files. I note that most of the functions in multiarraymodule.c, for
> > > instance, are part of the C-API, and are tagged as belonging to
> > > either the MULTIARRAY_API or the OBJECT_API. Apparently the build
> > > system scans for these tags and extracts the files somewhere. So,
> > > what is this API? Is it available somewhere, or is the code just
> > > copied somewhere convenient? As to breaking up the files, the scan
> > > only covers the code in the two current files; included code from
> > > broken-out parts is not seen. This strikes me as a bit of a kludge,
> > > but I am sure there is a reason for it. Anyway, I assume the build
> > > system can be fixed, so that brings up the question of how to break
> > > up the files. The maximal strategy is to make every API function,
> > > with its helper functions, a separate file. This adds a *lot* of
> > > files, but it is straightforward and modular. A less drastic
> > > approach is to start by breaking multiarraymodule into four files:
> > > the converters, the two apis, and the module functions. My own
> > > preference is for the bunch of files, but I suspect some will
> > > object.
> >
> > The code for pulling out the ``MULTIARRAY_API`` and ``OBJECT_API``
> > (also ``UFUNC_API``) is in ``numpy/core/code_generators``. Taking
> > ``MULTIARRAY_API`` as an example, ``generate_array_api.py`` is run
> > by the ``numpy/core/setup.py`` file to generate the multiarray (and
> > object) API. The file
> > ``numpy/core/code_generators/array_api_order.txt`` gives the order
> > in which the API functions are added to the ``PyArray_API`` array;
> > this is our guarantee that the binary API doesn't change when
> > functions are added. The files scanned are listed in
> > ``numpy/core/code_generators/genapi.py``, which is also the module
> > that does the heavy lifting in extracting the tagged functions.
> >
> > Looked to me like the order could change without causing problems.
> > The include file was also written by the code generator and for
> > extension modules was just a bunch of macros assigning the proper
> > function pointer to the correct name. That brings up another bit,
> > however. At some point I would like to break the include file into
> > two parts, one for inclusion in the other numpy modules and another
> > for inclusion in extension modules; the big #ifdef in the current
> > file offends my sense of esthetics. It should also be possible to
> > attach the function pointers to real function-prototype-like
> > declarations, which would help extension modules check the code at
> > compile time.
>
> No, the order is necessary for binary compatibility. If
> PyArray_API[3] points to function 'A', and PyArray_API[4] points to
> function 'B', then, if A and B are reversed in a newer version, any
> extension module compiled with the previous version will now call
> function 'B' instead of 'A', and vice versa. Adding functions to the
> end is ok, though.
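
The binary-compatibility point above can be illustrated with a small, self-contained sketch. It is not NumPy code: the functions, tables, slot numbers (0 and 1 rather than 3 and 4), and the `call_slot` helper are all hypothetical stand-ins, and the table holds typed function pointers rather than the `void *` entries of the real `PyArray_API`. It only shows what an already-compiled extension, which indexes the table by a fixed slot number, would end up calling under each layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for C-API functions; each returns a tag so a
   caller can tell which function actually ran. */
static int func_a(void) { return 'A'; }
static int func_b(void) { return 'B'; }
static int func_c(void) { return 'C'; }

typedef int (*api_fn)(void);

/* Slot numbers as compiled into an extension built against version 1. */
enum { SLOT_A = 0, SLOT_B = 1 };

/* Layout of the version the extension was compiled against. */
static const api_fn table_v1[] = { func_a, func_b };

/* Newer version, new function appended: the old slots are unchanged. */
static const api_fn table_v2_appended[] = { func_a, func_b, func_c };

/* Newer version with A and B swapped: the old slot numbers now refer
   to different functions. */
static const api_fn table_v2_swapped[] = { func_b, func_a, func_c };

/* What the compiled extension effectively does: index the table by a
   slot number fixed at its own compile time. */
static int call_slot(const api_fn *table, size_t len, size_t slot)
{
    assert(slot < len);
    return table[slot]();
}
```

With these definitions, `call_slot(table_v2_appended, 3, SLOT_A)` still reaches `func_a`, while `call_slot(table_v2_swapped, 3, SLOT_A)` reaches `func_b`, which is the silent mix-up described above.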


Well, that is true if we don't want extension modules to need recompiling
against new releases. However, I would like to replace the macros, which are
compiled to offsets in the API table, with actual function pointers,
something like

PyObject *(*apifunc)(PyObject *, blah, blah);

Then the initialization function, instead of just making a copy of the API
table pointer, could actually initialize the functions, i.e.,

apifunc = c_api[10];

and there would be something like a function prototype. I would also like to
replace the generated include file for numpy modules with an actual
permanent include so that the interface would be specified by function
prototypes rather than generated from function definitions, but I suppose
the current method isn't all that different. I have a question about the
_import_array function:

static int
_import_array(void)
{
  PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray");
  PyObject *c_api = NULL;
  if (numpy == NULL) return -1;
  c_api = PyObject_GetAttrString(numpy, "_ARRAY_API");
  if (c_api == NULL) {Py_DECREF(numpy); return -1;}
  if (PyCObject_Check(c_api)) {
      PyArray_API = (void **)PyCObject_AsVoidPtr(c_api);
  }
  Py_DECREF(c_api);
  Py_DECREF(numpy);
  /* ... rest of the function omitted ... */
}

Is there a reason that the pointer PyArray_API remains valid if the imported
numpy module is garbage collected? Or is the imported module persistent even
though the returned object goes away? I assume that Python did a dlopen
somewhere.
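
The typed-pointer idea proposed earlier in this message (a real function-pointer declaration that is filled in from the API table at import time) might look roughly like the sketch below. Everything in it is hypothetical: `add_longs`, `SLOT_ADD_LONGS`, `exported_table`, `api_add_longs` and `init_api` are illustration names, not part of NumPy. The table is also declared with a generic function-pointer type rather than `void **`, an assumption made here only to keep the example within ISO C.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function-pointer type for table entries (assumption: the
   real table is a void **; this type avoids void*-to-function casts). */
typedef void (*generic_fn)(void);

/* Hypothetical API function and its slot number in the table. */
static long add_longs(long a, long b) { return a + b; }
enum { SLOT_ADD_LONGS = 0, API_TABLE_LEN = 1 };

/* Exporting side: the table the library would publish. */
static const generic_fn exported_table[API_TABLE_LEN] = {
    (generic_fn)add_longs
};

/* Importing side: a typed pointer whose declaration plays the role of
   a prototype, so calls through it are checked at compile time. */
static long (*api_add_longs)(long, long) = NULL;

/* Hypothetical init routine: instead of only saving the table pointer,
   bind each slot to its typed pointer once.
   Returns 0 on success and -1 on failure. */
static int init_api(const generic_fn *table, size_t len)
{
    if (table == NULL || len <= (size_t)SLOT_ADD_LONGS) return -1;
    api_add_longs = (long (*)(long, long))table[SLOT_ADD_LONGS];
    return api_add_longs != NULL ? 0 : -1;
}
```

After `init_api(exported_table, API_TABLE_LEN)` succeeds, a call such as `api_add_longs(2, 3)` goes through a pointer with a full signature, so passing the wrong argument types is caught by the compiler. The slot number is still fixed at compile time, so the ordering constraint on the table remains.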

Chuck