Design questions for C++ support for Python extensions (cppy)

Mon Jul 12 04:06:59 EDT 2010

Hi.

With the current cppy code the Python 3.1.1 doc's spam example extension module 
looks like this (actual working code):

<code>
     #include <progrock/cppx/devsupport/better_experience.h>
     #include <progrock/cppy/Module.h>
     using namespace progrock;

     namespace {

         class Spam: public cppy::Module
         {
         public:
             Spam(): cppy::Module( L"spam", L"blåbærsyltetøy er blått" )
             {}

             PyObject* system( PyObject* args )
             {
                 const char *command;
                 if( !PyArg_ParseTuple( args, "s", &command ) )
                 {
                     return NULL;
                 }
                 int const sts = ::system( command );
                 return Py_BuildValue( "i", sts );
             }
         };

     }    // namespace <anon>

     CPPY_MODULE_CROUTINE( Spam, system, L"Execute a shell command" )

     PyMODINIT_FUNC PyInit_spam()
     {
         return cppy::safeInit< Spam >();
     }
</code>

Issues:

   1. Wide string literals OK?
      The basic Python API often requires UTF-8 encoded byte strings. With C++
      source code encoded using e.g. Windows ANSI Western, string literals with
      national characters such as Norwegian ÆØÅ then become gobbledegook or cause
      outright failure. I weighed the hypothetical need for string literals with
      national characters against the perceived unnaturalness of wide string
      literals for *nix developers, and came down in favor of the former, i.e.
      L"wide string literals".

      Related issue here: on Windows, MinGW g++ cannot compile UTF-8 encoded
      source with a BOM, while MSVC requires a BOM in order to detect the encoding.

      Would L"this" be an acceptable decision if you were to use something like
      cppy?

   2. Exception translation OK?
      The code within the 'system' member routine could conceivably be reduced to
      a single short line by adding some C++ support, but that would require use
      of exceptions to report errors. Translating C++ exceptions to Python
      exceptions does however add some overhead to every Python -> C++ call.
      Currently I do this only for the module initialization code, but should it
      be done for every exported method? Or perhaps as a user choice? The benefit
      of translation is e.g. reducing 'system' above to a sweet single line. The
      cost is the setup of a try-block (negligible) plus the exception
      translation itself (inefficient, though it only runs when something is
      actually thrown).

   3. Some unsafety OK?
      In 'PyInit_spam' there may be a window of opportunity for client code to
      Mess Things Up. Within cppy::safeInit the C++ module object is created,
      and it in turn creates a Python module object; if anything fails, the C++
      side frees the Python object. After PyInit_spam has returned to Python, the
      cleanup responsibility resides with the Python interpreter: freeing the
      Python module object causes the C++ object to be destroyed. But if, say,
      the client code's equivalent of 'PyInit_spam' calls cppy::safeInit, just
      discards the result and returns 0 to Python, then it seems not to be
      documented whether Python will free the Python module, i.e. a possible leak.

   4. Threading?
      Is it necessary to make singletons/statics thread safe? Or does Python
      ensure that no other threads execute while PyInit_spam is called? Can it
      be called simultaneously by two or more threads?

   5. Reload of interpreter?
      My impression from the documentation is that finalization and reinit of the
      interpreter is something that an application shouldn't really do, and that
      an extension developer need not worry about that happening. Is that so?
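
To make issue 2 concrete, here is a minimal sketch of the kind of translation
wrapper in question. It is not the cppy code. So that it compiles without
Python.h, FakeObject and setError are stand-ins (assumptions of this sketch)
for PyObject and for PyErr_SetString with PyExc_MemoryError or
PyExc_RuntimeError; translated, succeeds and fails are likewise made-up names.

```cpp
#include <cassert>
#include <exception>
#include <new>
#include <stdexcept>
#include <string>

// Stand-ins so that the sketch compiles without Python.h. In a real extension
// these would be PyObject and PyErr_SetString( PyExc_..., text ).
struct FakeObject {};
static std::string lastErrorKind;
static std::string lastErrorText;

inline void setError( char const* kind, char const* text )
{
    assert( kind != 0 && text != 0 );
    lastErrorKind = kind;
    lastErrorText = text;
}

// Hypothetical wrapper: calls f( args ) and converts any C++ exception into
// a pending error plus a null result, which is the convention the Python
// C API uses for a failed call.
template< class Func >
FakeObject* translated( Func f, FakeObject* args )
{
    try
    {
        return f( args );           // Normal path: only the try-block setup.
    }
    catch( std::bad_alloc const& )
    {
        setError( "MemoryError", "out of memory" );
    }
    catch( std::exception const& x )
    {
        setError( "RuntimeError", x.what() );
    }
    catch( ... )
    {
        setError( "RuntimeError", "unknown C++ exception" );
    }
    return 0;                       // Failure path: error set, null returned.
}

// Two sample routines for trying out the wrapper.
inline FakeObject* succeeds( FakeObject* args ) { return args; }
inline FakeObject* fails( FakeObject* ) { throw std::runtime_error( "no such command" ); }
```

With such a wrapper around every exported routine, the catch handlers only do
any work when something is thrown, so the per-call cost on the success path is
whatever the compiler charges for entering a try-block.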

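As an aside to issue 1: whichever way that decision goes, wide literals have
to become UTF-8 byte strings before they reach the parts of the Python API
that want char const*. A minimal sketch of such a conversion follows, assuming
that wchar_t holds UTF-16 on Windows and UTF-32 elsewhere. The names
appendUtf8 and utf8From are made up for this sketch and are not the cppy code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends one Unicode code point to out as UTF-8.
inline void appendUtf8( std::string& out, unsigned long cp )
{
    assert( cp <= 0x10FFFFul );
    if( cp < 0x80 )
    {
        out += static_cast<char>( cp );
    }
    else if( cp < 0x800 )
    {
        out += static_cast<char>( 0xC0 | (cp >> 6) );
        out += static_cast<char>( 0x80 | (cp & 0x3F) );
    }
    else if( cp < 0x10000 )
    {
        out += static_cast<char>( 0xE0 | (cp >> 12) );
        out += static_cast<char>( 0x80 | ((cp >> 6) & 0x3F) );
        out += static_cast<char>( 0x80 | (cp & 0x3F) );
    }
    else
    {
        out += static_cast<char>( 0xF0 | (cp >> 18) );
        out += static_cast<char>( 0x80 | ((cp >> 12) & 0x3F) );
        out += static_cast<char>( 0x80 | ((cp >> 6) & 0x3F) );
        out += static_cast<char>( 0x80 | (cp & 0x3F) );
    }
}

// Hypothetical helper: converts a wide string to UTF-8. A UTF-16 surrogate
// pair is combined into one code point; an unpaired surrogate is passed
// through as-is, which a production version should reject instead.
inline std::string utf8From( std::wstring const& s )
{
    std::string result;
    for( std::size_t i = 0; i < s.size(); ++i )
    {
        unsigned long cp = static_cast<unsigned long>( s[i] );
        if( 0xD800 <= cp && cp <= 0xDBFF && i + 1 < s.size() )
        {
            unsigned long const lo = static_cast<unsigned long>( s[i + 1] );
            if( 0xDC00 <= lo && lo <= 0xDFFF )
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        appendUtf8( result, cp );
    }
    return result;
}
```

A call such as utf8From( L"spam" ) then yields a std::string whose c_str() can
be handed to the API. Where a Python string object is wanted rather than a
char const*, PyUnicode_FromWideChar avoids the detour via UTF-8 altogether.
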
Cheers,

- Alf

-- 
blog at <url: http://alfps.wordpress.com>