[Numpy-discussion] How a transition to C++ could work

Matthew Brett matthew.brett at gmail.com
Sun Feb 19 03:32:40 EST 2012


Hi,

Thanks for this - it's very helpful.

On Sat, Feb 18, 2012 at 11:18 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> The suggestion of transitioning the NumPy core code from C to C++ has
> sparked a vigorous debate, and I thought I'd start a new thread to give my
> perspective on some of the issues raised, and describe how such a transition
> could occur.
>
> First, I'd like to reiterate the gcc rationale for their choice to switch:
> http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
>
> In particular, these points deserve emphasis:
>
> - The C subset of C++ is just as efficient as C.
> - C++ supports cleaner code in several significant cases.
> - C++ makes it easier to write cleaner interfaces by making it harder to
>   break interface boundaries.
> - C++ never requires uglier code.
>
> Some people have pointed out that the Python templating preprocessor used in
> NumPy is suggestive of C++ templates. A nice advantage of using C++
> templates instead of this preprocessor is that third party tools to improve
> software quality, like static analysis tools, will be able to run directly
> on the NumPy source code. Additionally, IDEs like Xcode and Visual C++ will
> be able to provide the full suite of tab-completion/IntelliSense features
> that programmers working in those environments are accustomed to.
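To make the templating point concrete, here is a minimal sketch of how a single C++ template can stand in for a family of per-type loops that a preprocessor would otherwise generate. This is not NumPy code: the name add_arrays and the choice of element types are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical element-wise add, not an actual NumPy function.
// One definition serves every element type it is instantiated for.
template <typename T>
void add_arrays(const T* a, const T* b, T* out, std::size_t n)
{
    // Pointers may only be null when there is nothing to process.
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        // Small integer types are promoted to int by '+', so cast back to T.
        out[i] = static_cast<T>(a[i] + b[i]);
    }
}

// Explicit instantiations play the role of the preprocessor's type list.
// The fixed-width typedefs from <cstdint> give direct access to each size.
template void add_arrays<std::int8_t>(const std::int8_t*, const std::int8_t*,
                                      std::int8_t*, std::size_t);
template void add_arrays<std::uint64_t>(const std::uint64_t*, const std::uint64_t*,
                                        std::uint64_t*, std::size_t);
template void add_arrays<float>(const float*, const float*, float*, std::size_t);
template void add_arrays<double>(const double*, const double*, double*, std::size_t);
```

Because this is ordinary C++ source, a static analyzer or an IDE sees the loop body directly instead of a template file that must be expanded first.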
>
> There are concerns about ABI/API interoperability and interactions with C++
> exceptions. I've dealt with these types of issues on enough platforms to
> know that while they're important, they're a lot easier to handle than the
> issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been that
> providing a C API from a C++ library is no harder than providing a C API
> from a C library.
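One common way to handle the exception concern is to catch everything at the C boundary and translate it into an error code, so that no C++ exception ever reaches a C caller. The sketch below assumes that pattern. The names demo_mean and mean_or_throw and the DEMO_* codes are hypothetical and do not come from NumPy.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical error codes for the C-facing function.
enum { DEMO_OK = 0, DEMO_ERR_INVALID = -1, DEMO_ERR_NOMEM = -2, DEMO_ERR_UNKNOWN = -3 };

namespace {

// Internal C++ implementation: free to use the standard library and to throw.
double mean_or_throw(const double* data, std::size_t n)
{
    assert(data != nullptr || n == 0);  // guaranteed by the wrapper below
    if (n == 0) {
        throw std::invalid_argument("mean of an empty array");
    }
    std::vector<double> copy(data, data + n);  // may throw std::bad_alloc
    double sum = 0.0;
    for (std::size_t i = 0; i < copy.size(); ++i) {
        sum += copy[i];
    }
    return sum / static_cast<double>(n);
}

}  // namespace

// C-callable entry point: C linkage, plain C types, and no exception escapes.
extern "C" int demo_mean(const double* data, std::size_t n, double* result)
{
    if (result == nullptr || (data == nullptr && n != 0)) {
        return DEMO_ERR_INVALID;
    }
    try {
        *result = mean_or_throw(data, n);
        return DEMO_OK;
    } catch (const std::bad_alloc&) {
        return DEMO_ERR_NOMEM;
    } catch (const std::exception&) {
        return DEMO_ERR_INVALID;
    } catch (...) {
        return DEMO_ERR_UNKNOWN;
    }
}
```

From the caller's side, demo_mean looks like any C function that returns a status code. In an extension module, the wrapper would presumably set a Python error before returning, but that part is left out here.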
>
> It's worth comparing the possibility of C++ versus the possibility of other
> languages, and the ones that have been suggested for consideration are D,
> Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target language
> has to interact naturally with the CPython API. It needs to provide direct
> access to all the various sizes of signed int, unsigned int, and float. It
> needs to have mature compiler support wherever we want to deploy NumPy.
> Taken together, these requirements eliminate most of these
> possibilities. By these criteria, the only languages that seem clearly
> viable for implementing NumPy are C, C++, and D.

On which criteria did you eliminate Cython?

> The biggest question for any of these possibilities is how do you get the
> code from its current state to a state which fully utilizes the target
> language. C++, being nearly a superset of C, offers a strategy to gradually
> absorb C++ features. Any of the other language choices requires a rewrite,
> which would be quite disruptive. For all of these reasons, I believe the
> only realistic language to use, other than sticking with C, is C++.
>
> Finally, here's what I think is the best strategy for transitioning to C++.
> First, let's consider what we do if 1.7 becomes an LTS release.
>
> 1) Immediately after branching for 1.7, we minimally patch all the .c files
> so that they build with both a C compiler and a C++ compiler. Then we
> rename all .c -> .cpp, and update the build systems for C++.
> 2) During the 1.8 development cycle, we heavily restrict C++ feature usage.
> But, where a feature implementation would be arguably easier and less
> error-prone with C++, we allow it. This is a period for learning about C++
> and how it can benefit NumPy.
> 3) After the 1.8 release, the community will have developed more experience
> with C++, and will be in a better position to discuss a way forward.
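As a rough illustration of what the "minimal patching" in step 1 might involve, the sketch below is written in the common subset of C and C++, so it should compile unchanged with either kind of compiler. The helper names demo_alloc_zeros and demo_grow are hypothetical and are not taken from the NumPy sources. The two adjustments shown (casting the result of malloc, and avoiding C++ keywords as identifiers) are typical examples, not a complete list.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef __cplusplus
extern "C" {   /* keep C linkage when built as C++ */
#endif

/* Hypothetical helper: allocate n doubles set to zero, or return NULL. */
double *demo_alloc_zeros(size_t n)
{
    size_t i;
    /* C converts void* implicitly, C++ does not. The explicit cast is
       valid in both languages. */
    double *buf = (double *)malloc(n * sizeof(double));
    if (buf == NULL) {
        return NULL;
    }
    for (i = 0; i < n; ++i) {
        buf[i] = 0.0;
    }
    return buf;
}

/* Hypothetical helper: grow *buf from old_size to new_size elements.
   The parameter is called new_size because 'new' is a legal C identifier
   but a keyword in C++ (likewise 'class', 'template', 'delete').
   Returns 0 on success, -1 on failure with *buf left untouched. */
int demo_grow(double **buf, size_t old_size, size_t new_size)
{
    size_t i;
    double *tmp;
    assert(buf != NULL);
    if (new_size == 0) {
        return -1;
    }
    tmp = (double *)realloc(*buf, new_size * sizeof(double));
    if (tmp == NULL) {
        return -1;
    }
    for (i = old_size; i < new_size; ++i) {
        tmp[i] = 0.0;
    }
    *buf = tmp;
    return 0;
}

#ifdef __cplusplus
}
#endif
```

Code in this style does not use any C++ features yet. It only removes the obstacles to renaming the file and switching compilers.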
>
> If, for some reason, a 1.7 LTS is unacceptable, it might be a good idea to
> restrict the 1.8 release to the subset of both C and C++. I would much
> prefer using the 1.8 development cycle to dip our toes into the C++ world to
> get some of the low-hanging benefits without doing anything disruptive.
>
> A really important point to emphasize is that C++ allows for a strategy
> where we gradually evolve the codebase to better incorporate its language
> features. This is what I'm advocating. No massive rewrite, no disruptive
> changes. Gradual code evolution, with ABI and API compatibility comparable
> to what we've delivered in 1.6 and the upcoming 1.7 releases.

Do you have any comment on the need for coding standards when using
C++?  I saw the warning in:

http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale

about using C++ unwisely.

See you,

Matthew
