[C++-sig] Need strategic advice designing Boost/Python program
Ralf W. Grosse-Kunstleve
rwgk at yahoo.com
Mon Apr 30 23:28:42 CEST 2007
> Also, thank you for your explanation of <extract>. It seems like I
> will still need to use extract, because I want the Python interface to
> use arrays in a Python-like manner. On Ralf's suggestion, I will look
> at scitbx to see how he handles it.
It all centers on the idea of storing repetitive data in C++ reference-counted
arrays, which are wrapped with a central facility (a huge template, flex_wrapper.h).
Custom data are stored in specific C++ classes which are wrapped with Boost.Python.
A simple example is scitbx/histogram.h, where the repetitive data are in
an array of floating point values, and the custom data in the histogram class.
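A minimal sketch of that split, in plain Python so it runs without scitbx. The
Histogram class, its attribute names, and the use of the standard-library
array module in place of a wrapped C++ array are all assumptions made for
illustration; this is not the interface of scitbx/histogram.h.

```python
from array import array


class Histogram:
    """Hypothetical stand-in for a wrapped C++ histogram class.

    The repetitive data (input values, per-slot counts) live in flat typed
    arrays; the custom data (range, slot width) are plain attributes.
    """

    def __init__(self, data, n_slots):
        if n_slots < 1 or len(data) == 0:
            raise ValueError("need at least one slot and one data value")
        # Custom data: a handful of scalars describing the binning.
        self.data_min = min(data)
        self.data_max = max(data)
        self.slot_width = (self.data_max - self.data_min) / n_slots
        # Repetitive data: one unsigned counter per slot.
        self.slots = array("L", [0] * n_slots)
        for value in data:
            if self.slot_width == 0:
                i = 0  # all values identical: everything goes in slot 0
            else:
                # Clamp so that data_max falls into the last slot.
                i = min(int((value - self.data_min) / self.slot_width),
                        n_slots - 1)
            self.slots[i] += 1


# Assumption: array("d") plays the role of an array of doubles.
values = array("d", [0.0, 0.1, 0.4, 0.5, 0.9, 1.0])
hist = Histogram(values, n_slots=2)
```

In a real setup the loop would sit in the C++ class and only the finished
object would be exposed to Python.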
The interesting point after a few years of working experience is:
Most people are covered by the existing array types
(element types double, size_t, std::complex<double>, etc.). It is fairly rare that
someone needs other array types. I'm also trying to keep the number of wrapped array
types low because the compile-time overhead and resulting .so file sizes
are quite significant. This consideration leads to designs where we have
two independent arrays of type A and B, instead of one array of a type C
which composes A and B. That's often a compromise, but, again from working
experience, most of the time it really doesn't matter all that much, and you
can hide it, e.g. behind a thin Python layer. If it becomes a problem, you
can still use that huge template, and in a couple of lines define a new array type
with element type C, with a complete Python interface modeled
after Python's builtin list.
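The two-arrays-instead-of-one compromise, hidden behind a thin Python layer,
could look roughly like the sketch below. The class name Observations, its
fields, and the standard-library array module standing in for the wrapped C++
arrays are hypothetical choices for illustration only.

```python
from array import array


class Observations:
    """Hypothetical thin layer over two independent arrays.

    Rather than one array of (value, multiplicity) records (type C), keep
    one array per field (types A and B) and present them together.
    """

    def __init__(self):
        self._values = array("d")          # array of type A: doubles
        self._multiplicities = array("L")  # array of type B: unsigned ints

    def append(self, value, multiplicity):
        # Keep both arrays the same length so index i means the same record.
        self._values.append(value)
        self._multiplicities.append(multiplicity)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        # Callers see one record, even though the storage is split.
        return (self._values[i], self._multiplicities[i])

    def weighted_mean(self):
        # Stand-in for a core calculation that takes both arrays as inputs.
        total = sum(self._multiplicities)
        weighted = sum(v * m for v, m in
                       zip(self._values, self._multiplicities))
        return weighted / total


obs = Observations()
obs.append(1.0, 1)
obs.append(3.0, 3)
```

Code using Observations never needs to know that the storage is two separate
arrays, which is the point of the thin layer.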
So when I approach a new problem I typically try to fit it into the system
by designing a C++ class for the core calculation, reusing our existing
Python-exposed C++ arrays (flex arrays) as inputs. If this solution is too raw
for general consumption, I stick a Python layer on top that makes it look truly
pythonic.
It is very similar to the idea behind Numeric/Numpy/numarray, except that the
scitbx array types are C++ arrays with a "nice" std::vector-like interface and
fully automatic life-time management, and that you can easily make arrays of
user-defined types if you feel it is necessary.
Not everything fits into this scheme of using C++ as the "vector unit" and Python
as the slower but more versatile "scalar unit", but we have come a long way with this
approach. The only time it didn't work out for me was when I had to handle a
very deeply nested hierarchical data structure (six levels!), with two-way
parent-child references. But numerical data usually come as arrays anyway,
so often it is a no-brainer to design the C++ class processing the data.
> Also--Ralf, thanks for the information about scitbx, and I may have
> more questions once I've studied it. I'll take a look at it and try to
> learn from its design, but currently the server
> http://cci.lbl.gov/ seems to be unreachable.
Sorry, that was an accident. It is back online now. (The unusual instability
is because we just moved to new hardware.)
Ralf