Add pybind11 to docs about writing binding code

Dear all,
I just joined the numpy mailing list to suggest an enhancement of the docs about writing binding code:
https://docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html
I hope this is the right place to discuss this.
This page is one of the first hits in Google when you search for "python bindings numpy". It is an important page for orientation. What I miss is a longer article about pybind11:
http://pybind11.readthedocs.io/en/stable/
https://github.com/pybind/pybind11
pybind11 is currently the best tool on the market to wrap C++ code to Python. This is my professional opinion. When you look at the facts, it is hard to disagree.
pybind11 is based on the approach of Boost.Python, but it is a compact project that doesn't require Boost and is developed independently. If you use it in a Python package, you can add it as a requirement and pip will happily install it. It doesn't require you to learn a new language; the bindings are generated using C++ meta-programming techniques under the hood. pybind11 has outstanding documentation and is extremely popular on GitHub (3800+ stars). Refcounting is done automatically, which is why I would even use it to wrap C code. Naturally, it has excellent support for numpy arrays. pybind11 is FOSS (BSD-style license).
I have used Cython, Boost.Python, SWIG, and pybind11 in small to large projects, and pybind11 is by far the most pleasant and the most powerful. You can do really sophisticated things in pybind11, which I cannot imagine doing with other binding tools, and most importantly, it never chokes on your C++ code. Cython and SWIG both have trouble with certain C++ idioms, which is not surprising because C++ is notoriously difficult to parse and these tools were primarily developed to wrap C (which is much easier to parse).
For C++, it is much better to not add a custom parser to the toolchain and just let the C++ compiler generate the low-level binding code. This is what pybind11 does. So for all these reasons and more, it should be mentioned and even highlighted here:
https://docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html
I am happy to write a section about it.
Disclaimer: I am not at all affiliated with the pybind11 developers, just a thankful user.
Best regards, Hans
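The idea of letting the C++ compiler generate the binding code can be illustrated with a toy sketch in plain C++17. This is not pybind11 and does not use its API: `Value`, `Module`, `call_by_name` and `invoke` are hypothetical names made up for this illustration, and `Value` is only a stand-in for a dynamically typed Python object. Only the method name `def` echoes the name pybind11 uses for registering functions. The point is that the argument and return types are deduced from the function pointer by templates, so no separate parser or interface file is involved.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a dynamically typed object on the Python side.
using Value = std::variant<long, double, std::string>;
using Args = std::vector<Value>;

// Hypothetical registry that plays the role of an extension module.
class Module {
public:
    // R and A... are deduced from the function pointer; the compiler then
    // generates a type-erased wrapper that unpacks and converts the arguments.
    template <typename R, typename... A>
    void def(const std::string& name, R (*f)(A...)) {
        table_[name] = [f](const Args& args) -> Value {
            if (args.size() != sizeof...(A))
                throw std::invalid_argument("wrong number of arguments");
            return invoke(f, args, std::index_sequence_for<A...>{});
        };
    }

    // Dynamic call by name, as an interpreter would perform it.
    Value call_by_name(const std::string& name, const Args& args) const {
        return table_.at(name)(args);
    }

private:
    // Expands to f(std::get<T0>(args[0]), std::get<T1>(args[1]), ...).
    // std::get throws std::bad_variant_access if a dynamic type does not match.
    template <typename R, typename... A, std::size_t... I>
    static Value invoke(R (*f)(A...), const Args& args, std::index_sequence<I...>) {
        return Value(f(std::get<std::decay_t<A>>(args[I])...));
    }

    std::map<std::string, std::function<Value(const Args&)>> table_;
};

// Ordinary C++ functions with no knowledge of the binding layer.
long add(long a, long b) { return a + b; }
std::string greet(const std::string& name) { return "Hello, " + name; }
```

With this in place, `Module m; m.def("add", &add);` registers the function, and `m.call_by_name("add", {Value(2L), Value(3L)})` calls it with dynamically typed arguments. A real binding library additionally has to handle reference counting, overloads, classes and exception translation, all of which this sketch leaves out.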

Hi,
On Wed, 2018-08-15 at 11:40 +0200, Hans Dembinski wrote: [clip: pybind11]
Please go ahead; the relevant source file is here: https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.python-as-...
It does not mention CFFI either, mostly because the original text predates the project.
-- Pauli Virtanen

Hi Hans, Pauli,
If `pybind11` is included, it could be interesting to also include `xtensor` and `xtensor-python`.
- Xtensor is a C++ dynamic N-d array library that offers numpy-like features including broadcasting and universal functions. It is also lazily evaluated and continuously benchmarked against numpy, eigen, pythran and numba. You can check out the numpy to xtensor cheat sheet: https://xtensor.readthedocs.io/en/latest/numpy.html.
- Xtensor-python makes it possible to operate on numpy arrays in place using the xtensor API, so that e.g. an xtensor reshape will result in a reshape on the python side (using the numpy C API under the hood). Xtensor-python is built upon pybind11, but brings it much closer to feature parity with NumPy.
There is a vibrant community of users and developers, actively working to make xtensor faster and cover more of the numpy APIs. I would argue that xtensor-python is one of the easiest ways to make use of numpy arrays from a C++ program, given the similar high-level API and the tools to make ufuncs and bindings with one-liners.
Resources:
- xtensor: https://github.com/QuantStack/xtensor (documentation: https://xtensor.readthedocs.io/)
- xtensor-python: https://github.com/QuantStack/xtensor-python (documentation: https://xtensor-python.readthedocs.io/)
- xtensor-blas: https://github.com/QuantStack/xtensor-blas (documentation: https://xtensor-blas.readthedocs.io)
- xtensor-io: https://github.com/QuantStack/xtensor-io (documentation: https://xtensor-io.readthedocs.io) for reading and writing various file formats
Other language bindings:
- xtensor-julia: https://github.com/QuantStack/xtensor-julia (documentation: https://xtensor-julia.readthedocs.io/en/latest/)
- xtensor-r: https://github.com/QuantStack/xtensor-r (documentation: https://xtensor-r.readthedocs.io/en/latest/)
Best, Sylvain
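For readers unfamiliar with lazy evaluation in array libraries, the following is a minimal expression-template sketch in plain C++17. It is not xtensor's implementation and uses none of its API: the namespace `lazy`, the types `Array` and `Sum`, the function `eval` and the counter `additions` are all hypothetical names for this illustration. It shows the general technique: `a + b` only builds a small object describing the operation, and the element-wise work happens when an element is requested.

```cpp
#include <cstddef>
#include <vector>

namespace lazy {

// Counts element-wise additions actually performed (for demonstration only).
inline std::size_t additions = 0;

// A plain 1-d array of doubles.
struct Array {
    std::vector<double> data;
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Lazy node describing lhs + rhs. It stores references only, so the operands
// must outlive the expression object.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const {
        ++additions;
        return lhs[i] + rhs[i];
    }
    std::size_t size() const { return lhs.size(); }
};

// Building an expression performs no arithmetic. The operator is found by
// argument-dependent lookup for types defined in this namespace.
template <typename L, typename R>
Sum<L, R> operator+(const L& lhs, const R& rhs) {
    return Sum<L, R>{lhs, rhs};
}

// Evaluates an expression in a single pass, without intermediate arrays.
template <typename E>
Array eval(const E& expr) {
    Array out;
    out.data.reserve(expr.size());
    for (std::size_t i = 0; i < expr.size(); ++i) {
        out.data.push_back(expr[i]);
    }
    return out;
}

}  // namespace lazy
```

Given three arrays `a`, `b` and `c`, the call `lazy::eval(a + b + c)` makes one pass over the data and never materialises the intermediate `a + b`. Because the nodes hold references, an expression built from temporaries should be evaluated within the same statement rather than stored in a variable.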

Hi Sylvain,
Sounds good, I think it should be mentioned in the pybind11 part. I just stumbled over xtensor yesterday. Based on your post I read a bit more about it. I like the expression engine and lazy evaluation; the concept is similar to Eigen. xtensor itself has nothing to do with binding, but it makes working with numpy arrays on the C++ side easier, especially when you are familiar with the numpy API.
The docs say: "Xtensor operations are continuously benchmarked, and are significantly improved at each new version. Current performances on statically dimensioned tensors match those of the Eigen library. Dynamically dimension tensors for which the shape is heap allocated come at a small additional cost."
I couldn't find these benchmark results online, though. Could you point me to the right page? Google only produced an outdated SO post where numpy performed better than xtensor.
Best regards, Hans
PS: A bit of nitpicking: you use the term "tensor" for an n-dimensional block of numbers, a generalisation of "matrix", but the term "tensor" in mathematics and physics is more specific. A tensor has well-defined transformation properties when you change the basis of your vector space, just like a "vector" (a vector is a tensor of rank one), while a general block of numbers does not. https://en.wikipedia.org/wiki/Tensor
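The transformation property mentioned in the PS can be written out as follows. This is the standard textbook form, added here for illustration; the symbols (the basis-change matrix A, the vector components v and the components T of a bilinear form) are notation chosen for this note and do not come from the thread.

```latex
\begin{align}
  % Change of basis: each new basis vector is a combination of the old ones.
  \mathbf{e}'_j &= \sum_i A^{i}{}_{j}\, \mathbf{e}_i \\
  % Components of a vector (rank one) transform with the inverse matrix.
  v'^{\,i} &= \sum_j (A^{-1})^{i}{}_{j}\, v^{j} \\
  % Components of a bilinear form (rank two) pick up one factor of A per index.
  T'_{ij} &= \sum_{k,l} A^{k}{}_{i}\, A^{l}{}_{j}\, T_{kl}
\end{align}
```

A general n-dimensional block of numbers carries no such rule, which is the distinction being drawn.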

Hi Hans, On Thu, Aug 16, 2018 at 10:51 AM, Hans Dembinski <hans.dembinski@gmail.com> wrote:
Actually, xtensor-python does a lot more in terms of numpy bindings, as it uses the C APIs of numpy directly for a number of things. Plus, the integration into the xtensor expression system lets you do things such as view / broadcasting / newaxis / ufuncs directly from the C++ side (and all that is in the cheat sheets).
That is because we run the benchmarks on our own hardware. Since xtensor is explicitly SIMD accelerated for a variety of architectures including e.g. avx512, it is hard to have a consistent environment to run the benchmarks. We have an i9 machine that runs the benchmarks with various options, and we manually run them on Raspberry Pis for the NEON acceleration benchmarks (the NEON instruction sets are continuously tested with an emulator on Travis CI in the xsimd project).
[tensor] is clearly a very overloaded term.
Cheers, Sylvain

Hi Sylvain,
ok, good, but my point was different. The page in question is about Python as a glue language. The other solutions on that site are general purpose binding solutions for any kind of C++ code, while xtensor-python is xtensor-specific. xtensor in turn is a library that mimics the numpy API in C++.
Ok, but you can still put the results on a web page for everyone to see and judge for themselves. Just state on what kind of machine you ran the code. It is ok if the results on my machine differ; I am still interested in the results that you get on your machines, and since you generate them anyway, I don't see why not.
[tensor] is clearly a very overloaded term.
I agree that vector is a very overloaded term (the STL vector is particularly to blame). But until recently, tensor used to be a well-defined technical term which exclusively referred to a specific mathematical concept:
https://en.wikipedia.org/wiki/Tensor_(disambiguation)
https://medium.com/@quantumsteinke/whats-the-difference-between-a-matrix-and...
https://www.quora.com/What-is-a-tensor
Then Google started to popularise the term wrongly in TensorFlow, and now another well-defined technical term gets watered down. xtensor is going with the tensorflow, sad.
Best regards, Hans

On Fri, Aug 17, 2018 at 2:00 AM Hans Dembinski <hans.dembinski@gmail.com> wrote:
Even if you don't use the numpy-mimicking parts of the xtensor API, xtensor-python is probably a net improvement over pybind11 for communicating arrays back and forth across the C++/Python boundary. Even if the rest of your C++ code doesn't use xtensor, you could profitably use xtensor-python at the interface.
Also, though the article is generally framed as using Python as a glue language (i.e. communicating with existing C/C++/Fortran code), it is also relevant for the use case where you are writing the C/C++/Fortran code from scratch (perhaps just accelerating small kernels or whatever). Talking about the available options for that use case is perfectly on-topic for that article. You don't have to be the one that writes it, though, if you just want to cover pybind11.
-- Robert Kern

Dear Robert,
On 17. Aug 2018, at 23:44, Robert Kern <robert.kern@gmail.com> wrote:
Even if you don't use the numpy-mimicking parts of the xtensor API, xtensor-python is probably a net improvement over pybind11 for communicating arrays back and forth across the C++/Python boundary. Even if the rest of your C++ code doesn't use xtensor, you could profitably use xtensor-python at the interface. Also, though the article is generally framed as using Python as a glue language (i.e. communicating with existing C/C++/Fortran code), it is also relevant for the use case where you are writing the C/C++/Fortran code from scratch (perhaps just accelerating small kernels or whatever). Talking about the available options for that use case is perfectly on-topic for that article.
No objections here; xtensor should be highlighted in the pybind11 part for these reasons. I just think it should not be a separate section.
Best regards, Hans

On Mon, Aug 20, 2018 at 8:57 AM, Neal Becker <ndbecker2@gmail.com> wrote:
I'm confused, do you have a link or example showing how to use xtensor-python without pybind11?
I think you may have it backwards:
"""
The Python bindings for xtensor are based on the pybind11 C++ library, which enables seamless interoperability between C++ and Python.
"""
So no, you can't use xtensor-python without pybind11. I think what was suggested was that you *could* use xtensor-python without using xtensor on the C++ side, i.e. xtensor-python is a higher-level binding system than pybind11 alone, rather than just bindings for xtensor. And thus it belongs in the docs about binding tools.
Which makes me want to take a closer look at it...
-CHB
-- Christopher Barker, Ph.D., Oceanographer, Emergency Response Division, NOAA/NOS/OR&R

participants (6)
- Chris Barker
- Hans Dembinski
- Neal Becker
- Pauli Virtanen
- Robert Kern
- Sylvain Corlay