[Numpy-discussion] How a transition to C++ could work

Mark Wiebe mwwiebe at gmail.com
Sun Feb 19 03:49:49 EST 2012


On Sun, Feb 19, 2012 at 2:32 AM, Matthew Brett <matthew.brett at gmail.com> wrote:

> Hi,
>
> Thanks for this - it's very helpful.
>
> On Sat, Feb 18, 2012 at 11:18 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> > The suggestion of transitioning the NumPy core code from C to C++ has
> > sparked a vigorous debate, and I thought I'd start a new thread to give
> > my perspective on some of the issues raised, and describe how such a
> > transition could occur.
> >
> > First, I'd like to reiterate the gcc rationale for their choice to
> > switch:
> > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
> >
> > In particular, these points deserve emphasis:
> >
> > * The C subset of C++ is just as efficient as C.
> > * C++ supports cleaner code in several significant cases.
> > * C++ makes it easier to write cleaner interfaces by making it harder
> >   to break interface boundaries.
> > * C++ never requires uglier code.
> >
> > Some people have pointed out that the Python templating preprocessor
> > used in NumPy is suggestive of C++ templates. A nice advantage of using
> > C++ templates instead of this preprocessor is that third party tools to
> > improve software quality, like static analysis tools, will be able to
> > run directly on the NumPy source code. Additionally, IDEs like XCode and
> > Visual C++ will be able to provide the full suite of
> > tab-completion/intellisense features that programmers working in those
> > environments are accustomed to.
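
To make the template point concrete, here is a minimal sketch. The function
name sum_strided and its signature are hypothetical and are not taken from the
NumPy sources; the stride is assumed to be counted in elements, not bytes. One
template stands in for the per-type copies that a text preprocessor would
otherwise generate, and because it is ordinary C++, a static analyzer or IDE
can parse it directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example, not NumPy code: a single definition that the
// compiler instantiates once per element type.
template <typename T>
T sum_strided(const T *data, std::ptrdiff_t stride, std::size_t count)
{
    // An empty range may pass a null pointer; anything else may not.
    assert(data != nullptr || count == 0);

    T total = T(0);
    for (std::size_t i = 0; i < count; ++i) {
        // `stride` is measured in elements (an assumption of this sketch),
        // so a negative stride walks the buffer backwards.
        total += data[static_cast<std::ptrdiff_t>(i) * stride];
    }
    return total;
}

// Thin per-type entry points, in case fixed names are wanted for a C API.
inline std::int32_t sum_int32(const std::int32_t *d, std::ptrdiff_t s,
                              std::size_t n)
{
    return sum_strided<std::int32_t>(d, s, n);
}

inline double sum_float64(const double *d, std::ptrdiff_t s, std::size_t n)
{
    return sum_strided<double>(d, s, n);
}
```

Calling sum_int32 on a six-element buffer with a stride of 2 and a count of 3
adds every other element, using the same source as the double version.
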
> >
> > There are concerns about ABI/API interoperability and interactions with
> > C++ exceptions. I've dealt with these types of issues on enough
> > platforms to know that while they're important, they're a lot easier to
> > handle than the issues with Fortran, BLAS, and LAPACK in SciPy. My
> > experience has been that providing a C API from a C++ library is no
> > harder than providing a C API from a C library.
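
As an illustration of the exception concern, here is a rough sketch of one
common pattern, not a description of how NumPy does or would do it. The names
example_fill_ramp, make_ramp, and the EXAMPLE_* error codes are all made up for
this example. The idea is that every function exported with C linkage catches
all C++ exceptions and converts them to error codes, so nothing ever unwinds
into C callers.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical error codes for this sketch only.
enum {
    EXAMPLE_OK = 0,
    EXAMPLE_BAD_ARG = -1,
    EXAMPLE_NO_MEMORY = -2,
    EXAMPLE_UNKNOWN = -3
};

namespace {

// Internal C++ helper. It may throw std::invalid_argument or
// std::bad_alloc, which is fine as long as callers stay in C++.
std::vector<double> make_ramp(std::ptrdiff_t n)
{
    if (n < 0) {
        throw std::invalid_argument("negative length");
    }
    std::vector<double> v(static_cast<std::size_t>(n));
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] = static_cast<double>(i);
    }
    return v;
}

} // namespace

// The exported function has C linkage and a plain C signature. The
// try/catch keeps every exception on the C++ side of the boundary.
extern "C" int example_fill_ramp(double *out, std::ptrdiff_t n)
{
    try {
        if (out == nullptr && n > 0) {
            return EXAMPLE_BAD_ARG;
        }
        std::vector<double> v = make_ramp(n);
        assert(v.size() == static_cast<std::size_t>(n));
        for (std::size_t i = 0; i < v.size(); ++i) {
            out[i] = v[i];
        }
        return EXAMPLE_OK;
    } catch (const std::invalid_argument &) {
        return EXAMPLE_BAD_ARG;
    } catch (const std::bad_alloc &) {
        return EXAMPLE_NO_MEMORY;
    } catch (...) {
        return EXAMPLE_UNKNOWN;
    }
}
```

In a CPython extension, the catch blocks would presumably set a Python
exception before returning the error value, but that is left out here to keep
the sketch free of Python.h.
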
> >
> > It's worth comparing the possibility of C++ versus the possibility of
> > other languages, and the ones that have been suggested for consideration
> > are D, Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target
> > language has to interact naturally with the CPython API. It needs to
> > provide direct access to all the various sizes of signed int, unsigned
> > int, and float. It needs to have mature compiler support wherever we
> > want to deploy NumPy. Taken together, these requirements eliminate a
> > majority of these possibilities. From these criteria, the only languages
> > which seem to have a clear possibility for the implementation of NumPy
> > are C, C++, and D.
>
> On which criteria did you eliminate Cython?


The "mature compiler support" one. As glue between C/C++ and Python, it
looks great, but Dag's evaluation of Cython's maturity for implementing the
style of functionality in NumPy seems pretty authoritative. So people don't
have to dig through the giant email thread, here's the specific message
content from Dag, and its context:

On 02/18/2012 12:35 PM, Charles R Harris wrote:
>
> No one in their right mind would build a large performance library using
> Cython, it just isn't the right tool. For what it was designed for -
> wrapping existing c code or writing small and simple things close to
> Python - it does very well, but it was never designed for making core
> C/C++ libraries and in that role it just gets in the way.

+1. Even I who have contributed to Cython realize this; last autumn I
implemented a library by writing it in C and wrapping it in Cython.



> > The biggest question for any of these possibilities is how do you get
> > the code from its current state to a state which fully utilizes the
> > target language. C++, being nearly a superset of C, offers a strategy to
> > gradually absorb C++ features. Any of the other language choices
> > requires a rewrite, which would be quite disruptive. Because of all
> > these reasons taken together, I believe the only realistic language to
> > use, other than sticking with C, is C++.
> >
> > Finally, here's what I think is the best strategy for transitioning to
> > C++. First, let's consider what we do if 1.7 becomes an LTS release.
> >
> > 1) Immediately after branching for 1.7, we minimally patch all the .c
> >    files so that they can build with a C++ compiler and with a C
> >    compiler at the same time. Then we rename all .c -> .cpp, and update
> >    the build systems for C++.
> > 2) During the 1.8 development cycle, we heavily restrict C++ feature
> >    usage. But, where a feature implementation would be arguably easier
> >    and less error-prone with C++, we allow it. This is a period for
> >    learning about C++ and how it can benefit NumPy.
> > 3) After the 1.8 release, the community will have developed more
> >    experience with C++, and will be in a better position to discuss a
> >    way forward.
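
For a sense of what the "minimal patch" in step 1 tends to involve, here is a
small sketch of code written in the common subset of C and C++. The
example_buffer type and its functions are hypothetical, not taken from NumPy.
The two adjustments shown are typical of such a patch: an explicit cast on the
result of malloc, which C allows implicitly but C++ does not, and avoiding
identifiers such as `new` or `class` that are C++ keywords.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type for this sketch; builds as C and as C++. */
typedef struct {
    double *values;
    size_t length;
} example_buffer;

/* The parameter is named new_length because plain `new` is a keyword
 * in C++ and would stop the file from compiling there. */
static int example_buffer_init(example_buffer *buf, size_t new_length)
{
    assert(buf != NULL);
    buf->values = NULL;
    buf->length = 0;

    /* Refuse sizes whose byte count would overflow size_t. */
    if (new_length > (size_t)-1 / sizeof(double)) {
        return -1;
    }

    /* C converts void* implicitly; C++ needs the explicit cast. */
    buf->values = (double *)malloc(new_length * sizeof(double));
    if (buf->values == NULL && new_length != 0) {
        return -1;
    }
    buf->length = new_length;
    return 0;
}

static void example_buffer_free(example_buffer *buf)
{
    assert(buf != NULL);
    free(buf->values); /* free(NULL) is a no-op in both languages */
    buf->values = NULL;
    buf->length = 0;
}
```

Neither change affects how the code behaves under a C compiler, which is what
makes this kind of patch low-risk.
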
> >
> > If, for some reason, a 1.7 LTS is unacceptable, it might be a good idea
> > to restrict the 1.8 release to the subset of both C and C++. I would
> > much prefer using the 1.8 development cycle to dip our toes into the C++
> > world to get some of the low-hanging benefits without doing anything
> > disruptive.
> >
> > A really important point to emphasize is that C++ allows for a strategy
> > where we gradually evolve the codebase to better incorporate its
> > language features. This is what I'm advocating. No massive rewrite, no
> > disruptive changes. Gradual code evolution, with ABI and API
> > compatibility comparable to what we've delivered in 1.6 and the upcoming
> > 1.7 releases.
>
> Do you have any comment on the need for coding standards when using
> C++?  I saw the warning in:
>
> http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
>
> about using C++ unwisely.
>

Yes, coding standards are very important. I think they are important for C
as well, and it's a problem that NumPy hasn't had any standards written
down yet. Chuck is presently the most rigorous enforcer of standards within
the current C codebase, so I would nominate him to take a first pass at
writing them down. The same applies to Python, and that's what PEP 8 is for.

Cheers,
Mark


>
> See you,
>
> Matthew