Hello,

I posted a response to your blog post on C++ library bindings, and wanted to continue the discussion via email if anyone's interested. I just signed up for the mailing list, so apologies if I missed a lot of previous discussion. I'll say up front that it's unlikely I'll be able to devote any actual coding effort to this, so feel free to tell me to get lost if you have plenty of ideas and not enough manpower. :)

I started out writing C++ bindings using Boost.Python, and was very happy with it for a long time. Its strongest point is the ability to wrap libraries that were never designed with python in mind, specifically code with poor and inflexible ownership semantics. Internally, this means that C++ objects are exposed indirectly through a holder object containing either an inline copy of the C++ object or any type of pointer holding the object. Every access to the object has to go through runtime dispatch in order to work with any possible holder type. The holder also contains the logic for ownership and finalization. For example, Boost.Python can return a reference to a field inside another object, in which case the holder will keep a reference to the parent object to keep it alive as long as the field reference lives.

The problem with this generality is that it produces a huge amount of object code (wrapping a single function in Boost.Python can add 10k to the object file) and adds a lot of runtime indirection. Assuming that one is writing C++ bindings because of speed issues, it'd be nice if this extra layer of memory indirection and runtime dispatch were exposed to the (eventual) JIT. In order to do that, pypy would have to be capable of handling pointers to raw memory containing non-python objects (is this already true due to the ctypes stuff?), with separate information about type and ownership.
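The holder idea described above can be sketched in a few lines of Python. This is a hypothetical illustration, not Boost.Python's actual (C++) implementation: the `Holder` and `field_reference` names are invented, and the integer "pointer" arithmetic stands in for real raw-memory addresses.

```python
# Hypothetical sketch of a Boost.Python-style "holder": a wrapper that can
# expose a field inside another object while keeping the owner alive.
class Holder:
    """Holds a (simulated) raw pointer, plus an optional owner reference."""
    def __init__(self, ptr, owner=None):
        self.ptr = ptr      # stand-in for the raw pointer to the C++ object
        self.owner = owner  # parent kept alive as long as this holder lives

def field_reference(parent, offset):
    # A reference to a field inside another object: point into the parent's
    # storage, and keep the parent alive by storing it in the new holder.
    return Holder(parent.ptr + offset, owner=parent)
```

As long as the returned holder is reachable, its `owner` attribute keeps the parent from being collected, which is the ownership logic the holder encapsulates.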
For example, if you have bindings for a C++ vector class and a C++ array containing the vectors, a "reference" to an individual vector in the array is really three different pieces:

1. The actual pointer to the vector.
2. A type structure containing functions to be called with the pointer (1) as an argument.
3. A list of references to other objects that need to stay alive while this reference lives.

If pypy and the JIT end up able to treat these pieces separately, it'd be a significant performance win over libraries wrapped for CPython.

The other main source of slowness and complexity in Boost.Python is overloading support, but I think that part is fairly straightforward to handle at the python level. All Boost.Python does internally is loop over the set of functions registered for a given name, and for each one loop over the arguments, calling into its converter registry to see if each python object can be converted to the C++ type.

As I mentioned in the blog comment, a lot of these issues come up in contexts outside C++, like numpy. Internally numpy represents operations like addition as a big list of optimized routines to call depending on the stored data type. Functions in these tables are called on raw pointers to memory, which is fundamental since numpy arrays can refer to memory inside objects from C++, Fortran, mmap, etc. It'd be really awesome if the type dispatch step could be written in python but still call into optimized C code for the final arithmetic.

The other major issue is safety: if a lot of overloading and dispatch code is going to be written in python, it'd be nice to shield that code from segfaults. I think you can get a long way there just by having a consistent scheme for boxing the three components above (pointer, type, and reference info), a way to label C function pointers with type information, and a small RPython layer that does simple type-checked calls (with no support for overloading or type conversion).
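The overload-resolution loop described above can be sketched as follows. This is a minimal illustration, not Boost.Python's real registry (which lives in C++); the function names and the `(argtypes, func)` overload representation are assumptions made for the example.

```python
# Hedged sketch of Boost.Python-style overload resolution: try each
# registered overload in turn; an overload matches if every argument
# can be converted to the corresponding C++ type.
def resolve_overload(overloads, args, convert):
    """overloads: list of (argtypes, func) pairs registered under one name.
    convert(value, cpp_type) returns the converted value, or None if the
    python value cannot be converted to that C++ type."""
    for argtypes, func in overloads:
        if len(argtypes) != len(args):
            continue  # arity mismatch: try the next overload
        converted = []
        for a, t in zip(args, argtypes):
            c = convert(a, t)
            if c is None:
                break  # this argument doesn't convert: overload rejected
            converted.append(c)
        else:
            return func, converted  # every argument converted: overload wins
    raise TypeError("no matching overload")
```

The interesting property for a JIT is that for fixed argument types the whole loop is constant and could in principle be folded away, leaving only the final call.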
I just wrote a C++ analogue to this last part as a minimal replacement for Boost.Python, so I could try to formulate what I mean in pseudocode if there's interest. There'd be some amount of duplicate type checking if higher-level layers such as overload resolution were written in application-level python, but that duplication should be amenable to elimination by the JIT.

That's enough for now. I'll look forward to the discussion. Most of my uses of python revolve heavily around C++ bindings, so it's exciting to see that you're starting to think about it even if it's a long way off.

Geoffrey
Hey. First, sorry for the late response; we're kind of busy with other things right now (i.e. working on the 2.5-compatible release). That doesn't mean we don't appreciate input about our problems. On Fri, Oct 17, 2008 at 5:50 AM, Geoffrey Irving <irving@naml.us> wrote:
Hello,
I posted a response to your blog post on C++ library bindings, and wanted to continue the discussion further via email if anyone's interested. I just signed up for the mailing list, so apologies if I missed a lot of previous discussion. I'll say up front that it's unlikely that I'll be able to devote any actual coding effort to this, so feel free to tell me to get lost if you have plenty of ideas and not enough manpower. :)
That's fine. We don't have enough manpower to work on this now, but knowing what people do in this area is very valuable once we get to it.
I started out writing C++ bindings using Boost.Python, and was very happy with it for a long time. Its strongest point is the ability to wrap libraries that were never designed with python in mind, specifically code with poor and inflexible ownership semantics. Internally, this means that C++ objects are exposed indirectly through a holder object containing either an inline copy of the C++ object or any type of pointer holding the object. Every access to the object has to go through runtime dispatch in order to work with any possible holder type. The holder also contains the logic for ownership and finalization. For example, Boost.Python can return a reference to a field inside another object, in which case the holder will keep a reference to the parent object to keep it alive as long as the field reference lives.
The problem with this generality is that it produces a huge amount of object code (wrapping a single function in Boost.Python can add 10k to the object file), and adds a lot of runtime indirection.
Assuming that one is writing C++ bindings because of speed issues, it'd be nice if this extra layer of memory indirection and runtime dispatch were exposed to the (eventual) JIT. In order to do that, pypy would have to be capable of handling pointers to raw memory containing non-python objects (is this already true due to the ctypes stuff?)
That's true. PyPy is able to handle pointers to any C place.
.. with separate information about type and ownership.
We don't provide this, since C has no notion of that at all.
For example, if you have bindings for a C++ vector class and a C++ array containing the vectors, a "reference" to an individual vector in the array is really three different pieces:
1. The actual pointer to the vector.
2. A type structure containing functions to be called with the pointer (1) as an argument.
3. A list of references to other objects that need to stay alive while this reference lives.
If pypy and the JIT end up able to treat these pieces separately, it'd be a significant performance win over libraries wrapped for CPython.
The other main source of slowness and complexity in Boost.Python is overloading support, but I think that part is fairly straightforward to handle in the python level. All Boost.Python does internally is loop over the set of functions registered for a given name, and for each one loop over the arguments calling into its converter registry to see if the python object can be converted to the C++ type.
As I mentioned in the blog comment, a lot of these issues come up in contexts outside C++, like numpy. Internally numpy represents operations like addition as a big list of optimized routines to call depending on the stored data type. Functions in these tables are called on raw pointers to memory, which is fundamental since numpy arrays can refer to memory inside objects from C++, Fortran, mmap, etc. It'd be really awesome if the type dispatch step could be written in python but still call into optimized C code for the final arithmetic.
That's the goal. Well, not exactly: the point is that you write this code in Python/RPython and the JIT is able to generate efficient assembler out of it. It's a very far-reaching goal, though, to have nice integration between a yet-nonexistent JIT and PyPy's yet-nonexistent numpy :-)
The other major issue is safety: if a lot of overloading and dispatch code is going to be written in python, it'd be nice to shield that code from segfaults. I think you can get a long way there just by having a consistent scheme for boxing the three components above (pointer, type, and reference info), a way to label C function pointers with type information, a small RPython layer that did simple type-checked calls (with no support for overloading or type conversion). I just wrote a C++ analogue to this last part as a minimal replacement for Boost.Python, so I could try to formulate what I mean in pseudocode if there's interest. There'd be some amount of duplicate type checking if higher level layers such as overload resolution were written in application level python, but that duplication should be amenable to elimination by the JIT.
I think for now we're happy with extra overhead. We would like to have *any* working C++ bindings first and then eventually think about speeding it up.
That's enough for now. I'll look forward to the discussion. Most of my uses of python revolve heavily around C++ bindings, so it's exciting to see that you're starting to think about it even if it's a long way off.
Thank you :) Cheers, fijal
On Thu, Oct 23, 2008 at 5:25 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:
Hey.
First sorry for late response, we're kind of busy doing other things now (ie working on 2.5-compatible release). That doesn't mean we don't appreciate input about our problems.
On Fri, Oct 17, 2008 at 5:50 AM, Geoffrey Irving <irving@naml.us> wrote: <snip>
That's true. PyPy is able to handle pointers to any C place.
.. with
separate information about type and ownership.
We don't provide this, since C has no notion of that at all.
At the lowest level the type is just a hashable identifier object, so it can probably be implemented at the RPython level. E.g.:

```python
# RPython type-safety layer
class CppObject:
    def __init__(self, ptr, type):
        self.ptr = ptr    # pointer to the actual C++ instance
        self.type = type  # represents the C++ type
        self.destructor = type.destructor  # function pointer to destructor

    def __traverse__(self):
        ...  # traverse through list of contained python object pointers

    def __del__(self):
        CCall(self.destructor, self.ptr)

class CppFunc:
    def __init__(self, ptr, resulttype, argtypes):
        self.ptr = ptr
        self.resulttype = resulttype
        self.argtypes = argtypes

    def __call__(self, *args):
        if len(args) != len(self.argtypes):
            raise TypeError(...)
        argptrs = []
        for a, t in zip(args, self.argtypes):
            if not isinstance(a, CppObject) or a.type != t:
                raise TypeError(...)
            argptrs.append(a.ptr)
        resultptr = Alloc(self.resulttype.size)
        try:
            CppCall(self.ptr, resultptr, *argptrs)  # assumes a specific calling convention
        except CppException, e:  # CppCall would have to generate this
            Dealloc(resultptr)
            raise CppToPythonException(e)
        return CppObject(resultptr, self.resulttype)
```

If this layer is written in RPython, features like overload resolution and C++ methods can be written in application-level python without worrying about safety.
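To make the division of labor concrete, here is a hypothetical application-level layer built on top of a CppFunc-style callable. The `CppOverloadSet` name and the "raise TypeError on mismatch" protocol are assumptions of this sketch; the point is only that overload resolution needs no special safety machinery if the underlying type-checked call raises instead of crashing.

```python
# Hypothetical application-level overloading built on type-checked calls.
# Each entry behaves like CppFunc above: it raises TypeError if the
# arguments don't match its signature, and never touches bad memory.
class CppOverloadSet:
    def __init__(self, funcs):
        self.funcs = funcs  # list of CppFunc-like callables

    def __call__(self, *args):
        for f in self.funcs:
            try:
                return f(*args)  # underlying layer type-checks the call
            except TypeError:
                continue  # signature mismatch: try the next overload
        raise TypeError("no overload matched the given arguments")
```

The worst a buggy version of this class can do is raise the wrong Python exception, which is exactly the safety property argued for above.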
<snip>
As I mentioned in the blog comment, a lot of these issues come up in contexts outside C++, like numpy. Internally numpy represents operations like addition as a big list of optimized routines to call depending on the stored data type. Functions in these tables are called on raw pointers to memory, which is fundamental since numpy arrays can refer to memory inside objects from C++, Fortran, mmap, etc. It'd be really awesome if the type dispatch step could be written in python but still call into optimized C code for the final arithmetic.
That's the goal. Well, not exactly: the point is that you write this code in Python/RPython and the JIT is able to generate efficient assembler out of it. It's a very far-reaching goal, though, to have nice integration between a yet-nonexistent JIT and PyPy's yet-nonexistent numpy :-)
Asking the JIT to generate efficient code might be sufficient in this case, but in terms of this discussion it just removes numpy as a useful thought experiment towards C++ bindings. :) Also, for maximum speed I doubt the JIT will be able to match custom code such as BLAS, given that C++ compilers usually don't get there either.
The other major issue is safety: if a lot of overloading and dispatch code is going to be written in python, it'd be nice to shield that code from segfaults. I think you can get a long way there just by having a consistent scheme for boxing the three components above (pointer, type, and reference info), a way to label C function pointers with type information, a small RPython layer that did simple type-checked calls (with no support for overloading or type conversion). I just wrote a C++ analogue to this last part as a minimal replacement for Boost.Python, so I could try to formulate what I mean in pseudocode if there's interest. There'd be some amount of duplicate type checking if higher level layers such as overload resolution were written in application level python, but that duplication should be amenable to elimination by the JIT.
I think for now we're happy with extra overhead. We would like to have *any* working C++ bindings first and then eventually think about speeding it up.
Another advantage of splitting the code into an RPython type-safety layer and application-level code is that the latter could be shared between pypy and cpython. I haven't looked at Reflex at all, but in Boost.Python most of the complexity goes into code that could exist at the application level.

Geoffrey
Hi Geoffrey, On Thu, Oct 23, 2008 at 10:30:33AM -0700, Geoffrey Irving wrote:
If this layer is written in RPython, features like overload resolution and C++ methods can be written in application-level python without worrying about safety.
I think that in this area, any worries about safety went out of the window when CPython officially adopted ctypes. So it's completely fine if the application-level dispatching layer has access to unsafe features, as long as user applications that only use the dispatching layer are shielded from crashes.

A bientot, Armin.
participants (3)
- Armin Rigo
- Geoffrey Irving
- Maciej Fijalkowski