[C++-sig] Weave Scipy Inline C++

Sun Sep 15 16:57:49 CEST 2002

From: "eric jones" <eric at enthought.com>

> >   object f(boost::python::object x)
> >   {
> >     x[1] = "hello";
> >     return x(1,2,3,4,5,6,x); // __call__
> >     ...
>
> This is definitely visually cleaner, and I like it better.  Maybe a few
> overloads in SCXX would make
>
> x[1] = "hello";
>
> work though for ints, floats, strings, etc.  I'll look at this and
> report back...  Yep, worked fine.
>
> Also, I understand the first line, but not what the second is doing.  Is
> x like a UserList that has a __call__ method defined?

Hypothetically, yes.

> Note also that boost has more work to do in this case than weave does.
> Boost::python::object can be pretty much anything I'm guessing.  When we
> get to the 'x[1] = "hello";' line in weave, the code is explicitly
> compiled again each time for a list or a tuple or a dict.

That sounds like a lot more work than what Boost.Python is doing!

> The following
> happens at the command line:
>
> >>> a = [1]
> >>> weave.inline('a[0] = "hello";',['a'])
> <compile a function that takes 'a' as a list>
> >>> a[0]
> "hello"
> >>> a = {}
> >>> weave.inline('a[0] = "hello";',['a'])
> <compile a function that takes 'a' as a dict>
>
> So I'm guessing the cleverness you probably had to go through to get
> things working in C++ is handled by the dynamic typing mechanism in
> weave.

Wow, that could result in a huge amount of bloat. What if you /want/ some
runtime polymorphism in your C++ code?
Furthermore, without special knowledge of the internals of every type that
might be passed, you can't generate code that's any more-optimized for T1
than for T2:

    >>> class T1(dict): pass
    >>> class T2(dict): pass
    >>> a = T1()
    >>> weave.inline('a[0] = "hello";', ['a'])
    >>> a = T2()
    >>> weave.inline('a[0] = "hello";', ['a'])

We can generate any number of distinct types for which x[0] = "hello" is a
valid expression...

> >
> > not
> >
> >     x[Py::Int(1)] = Py::Str("hello");
> >     // ??? what does __call__ look like?
>
> Currently I just use the Python API for calling functions -- although
> SCXX does have a callable class that could be used.  Also, nothing
> special is done to convert instances that flow from Python into C++
> unless a special converter has been written for them (such as for
> wxPython objects).  Things weave doesn't explicitly know about are left
> as PyObject* which can be manipulated in C code.

Boost.Python is designed with the idea in mind that users never touch a
PyObject*.

> > or whatever. Getting this code to work everywhere was one of the
> > harder porting jobs I've ever faced. Compilers seem to have lots of
> > bugs in the areas I was exercising.
>
> The porting comment scares me.  Is it still ticklish?

Not too ticklish with modern compilers (though one recent release had a
codegen bug this stimulated). The big problem is that there are lots of old
compilers out there that people still use. VC6, for example.

> C++ bugs pop in
> areas where they shouldn't -- even in the same compiler but on separate
> platforms.  There is currently some very silly code in weave explicitly
> to work around exception handling bugs on Mandrake with gcc.  Since
> spending a couple of days on this single issue, I've tried to avoid
> anything potentially fragile (hence the move to SCXX).  CXX compile
> issues also pushed me that direction.  The compilers I really care about
> are gcc 2.95.x (mingw and linux), gcc 3.x, MSVC, MIPSPro, SunCC, DEC,
> and xlc.  How is boost doing on this set?

Fine on gcc 2.95.x, 3.1, 3.2, msvc6/7, MIPSPro, Dec CXX 6.5 and a whole
bunch of others.
I haven't tested xlc recently. I anticipate some issues with SunCC.

> Weave isn't tested in all these
> places yet, but needs to run on all of them eventually (and shouldn't
> have a problem now).

No conforming code "should have a problem". But you know how in theory
there's no difference between theory and practice...

> > However, you may still be right that it's not an appropriate solution
> > for weave.
>
> I think boost would work fine -- maybe even better.  I really like that
> the boost project is active -- SCXX and CXX aren't very.  The beauty of SCXX
> is it takes about 20 minutes to understand its entire code base.  The
> worries I have with boost are:
>
> 1) How much of boost do I have to carry around for the simple
> functionality I mentioned?

Boost.Python depends on quite a few of the other boost libraries:

    type_traits
    bind
    function
    mpl - currently in prerelease
    smart_ptr

possibly a few others. These are all in header files.

> 2) How persnickety is compiling the code on new platforms?

Fairly persnickety, unless you're a C++ generic/metaprogramming expert.

> 3) Would people have to understand much of boost to use this
> functionality?

They wouldn't have to understand much of Boost as a whole. They'd only need
to understand the components of Boost.Python that they're using. There is
also a template called extract<> which would be useful to know about.

> 4) How ugly are the error reports from the compiler when code is
> malformed?  Blitz++ reports are incomprehensible to anyone except
> template gurus.

You get a long instantiation backtrace as usual. However, we've applied
some tricks which cause error messages to contain a plain English
description of what the user did wrong in many cases.

> 5) How steep is my learning curve on the code? (I know, only I can
> answer that by looking at it for a while which I haven't yet.)

I have no idea how to answer that.

> Note that I'm really looking for the prettiest and fastest solution
> *with the least possible headaches*.  For weave, least headaches trumps
> pretty and fast in a major way.

Then use the Python "C" API.

> I've even considered moving weave back
> to generating pure C code to make sure it works everywhere and leaving
> the user to wrestle with refcounts and the Python API.  I think C++ is
> getting far enough along though that this shouldn't be necessary (and
> allows the "pretty").  Note though, that I was extremely disappointed
> with CXX's speed when manipulating lists, etc.  It was *much* slower
> than calling the raw Python API.  For computationally intense stuff on
> lists, etc., you had to revert to API calls.  I haven't benchmarked SCXX
> yet, but I'm betting the story is the same.  Most things I care about
> are in Numeric arrays, but that isn't true for everyone else.

Boost.Python's object wrappers are not generally designed for maximum
speed. For example, the list class knows that it might hold a class derived
from list, so it does PyList_CheckExact() before calling any PyList_
functions directly. If it's been subclassed it goes through the general
Python API. Also, none of the operators such as [] have been set up to use
the PyList_xxx functions in this way. It could be done; it's just a lot of
work.
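
The exact-type check can be illustrated from the Python side. This is a rough sketch, not Boost.Python code: `type(x) is list` plays the role of PyList_CheckExact(), and the hypothetical helper `set_item` takes the type-specific path only when the object is exactly a list. A subclass that overrides `__setitem__` (the made-up `LoggingList` here) goes through the general protocol so its override is respected.

```python
import operator

def set_item(container, index, value):
    """Hypothetical helper: fast path for exact lists, general path otherwise."""
    if type(container) is list:  # analogous to PyList_CheckExact()
        list.__setitem__(container, index, value)  # direct, list-specific call
    else:
        operator.setitem(container, index, value)  # general protocol, honors overrides

class LoggingList(list):
    """A list subclass whose __setitem__ records which indices were written."""
    def __init__(self, *args):
        super().__init__(*args)
        self.log = []

    def __setitem__(self, index, value):
        self.log.append(index)
        super().__setitem__(index, value)

plain = [0, 0]
logged = LoggingList([0, 0])
set_item(plain, 1, "hello")
set_item(logged, 1, "hello")
```

Had the fast path been taken unconditionally, `logged.log` would have stayed empty, which is why the check is needed before calling list-specific functions directly.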

> One other thought is that once we understand each other's technologies
> better, we may see other places where synergy is beneficial.

Hopefully, yes.

> >
> > > If you need the other 99.7% of boost's capabilities, then you
> > > probably need to be using boost instead of weave anyhow.  They serve
> > > different purposes.  Weave is generally suited for light weight
> > > wrapping and speeding up computational kernels with minimum hassle --
> > > especially in numeric codes where Numeric isn't fast enough.
> > >
> > > Oh, and I'm happy to except patches that allow for boost type
> > > converters
> ^^^^^^ err... accept :-|
>
> > > in weave (they should, after all, be easy to write).  Then you can
> > > use boost instead of SCXX.
> >
> > What did you have in mind?
>
> The code for a new type converter class that handles translating Python
> code to C++ is rather trivial after the latest factoring of weave.  Here
> is an example of a weave expression and the underlying C++ code that is
> generated on the fly:
>
> #python
> >>> a = {}
> >>> weave.inline('a["hello"] = 1;',['a'])
>
> # underlying ext func
> static PyObject* compiled_func(PyObject*self, PyObject* args)
> {
>     PyObject *return_val = NULL;
>     int exception_occured = 0;
>     PyObject *py__locals = NULL;
>     PyObject *py__globals = NULL;
>     PyObject *py_a;
>     py_a = NULL;
>
>     if(!PyArg_ParseTuple(args,"OO:compiled_func",&py__locals,&py__globals))
>         return NULL;

Isn't there a way to check for dict/int args right here?^^^^^^^^^^^^^^^^

>     try
>     {
>         PyObject* raw_locals = py_to_raw_dict(py__locals,"_locals");
>         PyObject* raw_globals = py_to_raw_dict(py__globals,"_globals");
>         /* argument conversion code */
>         py_a = get_variable("a",raw_locals,raw_globals);
>         PWODict a = convert_to_dict(py_a,"a");
>         /* inline code */
>         a["hello"] = 1;
>     }
>     catch(...)
>     {
>         return_val =  NULL;
>         exception_occured = 1;
>     }
>     /* cleanup code */
>     if(!return_val && !exception_occured)
>     {
>         Py_INCREF(Py_None);
>         return_val = Py_None;
>     }
>     return return_val;
> }
>
> So the line that has to change is:
>
>         PWODict a = convert_to_dict(py_a,"a");
>
> and the convert_to_dict function -- but it is automatically generated by
> the converter class (although you could customize it if needed).

That would look something like this in Boost.Python:

    handle<> py_a = borrowed(get_variable("a", raw_locals, raw_globals));
    dict x = extract<dict>(object(py_a));

You could generate slightly more-efficient code using some of the
implementation details of Boost.Python, but I'd rather not expose those to
anyone.
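
For readers who don't want to trace the C++, the conversion step both versions perform (find `a`, then insist that it is a dict) can be mimicked in a few lines of Python. This is only an illustrative sketch. The functions below borrow the names `get_variable` and `convert_to_dict` from the generated code above, but their behavior is assumed: in particular, the locals-then-globals lookup order and the TypeError on a mismatch are guesses, not something taken from weave's source.

```python
def get_variable(name, local_vars, global_vars):
    """Assumed behavior: look `name` up in locals first, then globals."""
    if name in local_vars:
        return local_vars[name]
    if name in global_vars:
        return global_vars[name]
    raise NameError("variable %r not found" % name)

def convert_to_dict(obj, name):
    """Assumed behavior: reject anything that is not a dict (or dict subclass)."""
    if not isinstance(obj, dict):
        raise TypeError("variable %r must be a dict, got %s"
                        % (name, type(obj).__name__))
    return obj

a = {}
d = convert_to_dict(get_variable("a", {"a": a}, {}), "a")
d["hello"] = 1  # the "inline code" step

# A non-dict argument is rejected at conversion time.
try:
    convert_to_dict([], "a")
    rejected = False
except TypeError:
    rejected = True
```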

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave at boost-consulting.com * http://www.boost-consulting.com