
Hi all,

So I've been working on developing PyPy bindings for the Bullet physics library (http://bulletphysics.org/) using cppyy. So far, after a bit of an initial learning curve, I was able to put together a configuration that maps almost all of the standard Bullet classes and functions and works surprisingly well "out of the box" (with no fixup code or anything), and I have even successfully ported a basic "hello world" Bullet example from one of the tutorials to Python. (I have to say, I'm pretty impressed with how smoothly it all works!)

The one main issue I'm left with at this point, however, is that for any sort of real application, Bullet makes substantial use of callbacks (both in the form of some global C++ function pointers and some abstract "callback" classes with virtual methods intended to be subclassed and overridden). My understanding is that cppyy does not currently support any form of calling Python code from C++ (only the other way around), so this presents a bit of a problem.
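(For context, the ported "hello world" amounts to roughly the sketch below; the reflection dictionary name is a placeholder for whatever your genreflex build actually produces:)

    import cppyy

    # Hypothetical name of the Reflex dictionary produced by genreflex for Bullet.
    cppyy.load_reflection_info("libBulletDict.so")
    g = cppyy.gbl

    # The usual Bullet "hello world" setup, used directly through the bindings.
    conf       = g.btDefaultCollisionConfiguration()
    dispatcher = g.btCollisionDispatcher(conf)
    broadphase = g.btDbvtBroadphase()
    solver     = g.btSequentialImpulseConstraintSolver()
    world      = g.btDiscreteDynamicsWorld(dispatcher, broadphase, solver, conf)

    world.setGravity(g.btVector3(0, -10, 0))
    for i in range(60):
        world.stepSimulation(1.0 / 60)    # step one second of simulation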
I've been futzing around with a workaround option in the meantime which I think might work (a sketch of step 3 follows the list):

1. Wrap the C++ function pointers/virtual functions with stub code which calls a C function pointer instead, passing C++ objects as "void *".
2. Write helper C++ functions which can accept "void *" and return a result casted to the appropriate C++ type.
3. Use cffi or ctypes to have the C function pointers call back into a Python wrapper function, which then calls the helper conversion functions via cppyy to convert the "void *"s back into the correct cppyy objects, and then calls the actual Python callback function/method with those as arguments.

Obviously, this is kinda clunky and involves a fair bit of overhead, which I'd really like to avoid, so I'm also curious if anybody has any other suggestions for better ways to do this sort of thing? Any help would be much appreciated! --Alex
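(Here is a rough sketch of the Python side of that workaround; the stub library, the g_tick_cb pointer, the dictionary file name and the handler function are all made up for illustration, and cppyy.bind_object is used in place of the helper cast functions of step 2, since it can accept a plain integer address:)

    import cffi
    import cppyy

    cppyy.load_reflection_info("libBulletDict.so")     # hypothetical dictionary name

    def my_python_tick_handler(world, dt):
        pass    # whatever the application wants to do on each tick

    # Step 1 lives on the C++ side, compiled into a small stub library: a
    # C-callable function pointer that the C++ callback wrapper invokes, e.g.
    #     typedef void (*tick_cb_t)(void *world, double dt);
    #     tick_cb_t g_tick_cb;    /* set from Python, called from the C++ stub */
    ffi = cffi.FFI()
    ffi.cdef("""
        typedef void (*tick_cb_t)(void *world, double dt);
        tick_cb_t g_tick_cb;
    """)
    lib = ffi.dlopen("./libbulletstub.so")              # hypothetical stub library

    # Step 3: a cffi callback that turns the void* back into a cppyy proxy and
    # forwards to the real Python callback.
    @ffi.callback("void(void *, double)")
    def on_tick(world_ptr, dt):
        addr = int(ffi.cast("intptr_t", world_ptr))
        world = cppyy.bind_object(addr, cppyy.gbl.btDynamicsWorld)
        my_python_tick_handler(world, dt)

    lib.g_tick_cb = on_tick    # note: a reference to on_tick must be kept alive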

What about the CINT backend? https://bitbucket.org/pypy/pypy/src/52bbec630c1faec5ea0c1772872411de541d507f/pypy/module/cppyy/test/test_cint.py On Thu, Jan 9, 2014 at 12:14 PM, Alex Stewart <foogod@gmail.com> wrote:
-- Ryan When your hammer is C++, everything begins to look like a thumb.

I'd looked around a bit but could only find vague references to CINT, and it wasn't even clear to me whether a full CINT backend really existed or it was just a hack/experiment. Is it actually suitable for general-purpose use? If so, I'd certainly be happy to try it.. how would one go about switching to using the CINT backend instead of Reflex? --Alex On Thu, Jan 9, 2014 at 3:22 PM, Ryan Gonzalez <rymg19@gmail.com> wrote:

On Fri, Jan 10, 2014 at 1:58 AM, Alex Stewart <foogod@gmail.com> wrote:
Hey Alex. On a sidenote - can you please subscribe to pypy-dev so I don't have to authorize every single one of your mails? It's relatively painless and it's also very low traffic, so you won't get much uninteresting stuff. Obviously feel free to unsubscribe once you're done. Cheers, fijal

On a sidenote - can you please subscribe to pypy-dev so I don't have to authorize every single one of your mails?
I'm a bit confused.. I've been subscribed to pypy-dev (and receiving list mail) since some time in August. (I just went to the mailman page for the list and logged in with my password and it seems to know who I am too..) I just tried unsubscribing and resubscribing, so maybe that will help clear things up? --Alex

Hi Alex,
it's quite alive; in high energy physics, Reflex is only used by mapping Reflex information into CINT, then using it from CINT. Is historic, though, and not recommended in general.
Is it actually suitable for general-purpose use?
If you're willing to install all of ROOT? (Or a minimal version anyway?) On the one hand, I'd argue against that; on the other, ROOT is available in many science sections of Linux distros as well as in MacPorts, so it's not that big of a deal. But also the run-time dependencies increase. Anyway, the Reflex backend is the default precisely b/c it does not add any further dependencies. Also, CINT per se does not provide what you want (the code that allows compiling in extra parts is in ROOT proper).
If so, I'd certainly be happy to try it.. how would one go about switching to using the CINT backend instead of Reflex?
Is documented here, can only be done "builtin": http://doc.pypy.org/en/latest/cppyy_backend.html I never made a standalone libcppyy_backend library for CINT, as I don't expect there to be any use (physicists in HEP use by and large only releases provided by their experiments' software group; and CINT should be on its way out now that we have Cling largely working). Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hey Wim, Thanks for all the useful info on the status of things!
Yeah, I kinda figured this would be required regardless, and that bit isn't really that onerous.. it was mainly the extra cffi -> Python -> C++ -> Python round-trip I was hoping to be able to avoid somehow..
For global functions, I've found helper with void* + cffi (the way you've described in your mail) enough. Yes, that isn't satisfactory, but last year the priority has been Cling.
Entirely understandable (and the Cling stuff sounds like a better solution all round, so I agree with that priority). At least now I know I'm on the right track for an interim solution, and something better is actually on the way. :)
As such, it all took about a year longer, but since end of November last year, we now have a working bindings generator for Cling, but only on CPython.
Out of curiosity, how much work do you expect the cppyy part of things to be? (I'd offer to help, but I suspect it's all way over my head..) This does also beg another question that's sorta been at the back of my mind for a bit: since all of the underlying stuff gets done in CPython first, is there any chance that at some point we might see a version of cppyy for CPython as well (with similar usage and dependencies)? It's not something I really need for this particular project, but it would be really cool if the Bullet library bindings I'm making (and potentially other C++ libraries) could eventually be used from either PyPy or CPython interchangeably..
I never made a standalone libcppyy_backend library for CINT, as I don't expect there to be any use.
That's ok, by the sound of things the Cling-based stuff is moving along well and I can probably wait around a bit for that instead. :) Anyway, thanks for the diligent work! I'm really looking forward to seeing the cling-cppyy (and will be happy to test, etc.) when it's ready. --Alex

Hi Alex,
Out of curiosity, how much work do you expect the cppyy part of things to be? (I'd offer to help, but I suspect it's all way over my head..)
the work itself is mostly transliterating code. Cleaning it up and removing dependencies is a different matter.
Is there any chance that at some point we might see a version of cppyy for CPython as well (with similar usage and dependencies)?
That is the plan, as I have some folks here that are demanding it. As a complete aside ... I'm convinced that the version of cppyy with the loadable C-API could just as well be served with something that SWIG could produce. Not with the one that produces CPython extensions, but using the SWIG parser to generate something akin to Reflex dictionaries. That is not so far off as it may seem, as it does most of that already anyway, just cross-sprinkled with Python C-API calls that would need to be removed. The C-API that needs to be filled in for cppyy is really straightforward, and can be implemented incrementally. (I've added a dummy_backend.cxx to allow some tests to proceed even when genreflex is not installed and that proves the point sufficiently.) The API could also be modified, if need be. SWIG is not useful for us (it can't parse our headers), and some .py would still need to be generated (to provide the exact same interface as it does for CPython), but a 'swig -pypy' based on this should not be too hard. Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Ryan,
What about the CINT backend?
the way we "jit" code with CINT, is by generating a temporary file, then running the dictionary generator over it, and pulling everything through the C++ compiler. Works, but ... Also, the CINT backend carries more dependencies than strictly necessary, for historic reasons. Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Alex,
I have to say, I'm pretty impressed with how smoothly it all works!
good to hear.
Indeed. I've had two occasions (not with Reflex, but with CINT) where I needed to call the other way around: fitting of data distributions with python functions, and callbacks when C++ objects get deleted. I've used a custom piece of rpython for both. Overriding of virtual functions can only be done with a helper class deriving from the callback interface on the C++ side. There's no other (portable) way. For global functions, I've found helper with void* + cffi (the way you've described in your mail) enough. Yes, that isn't satisfactory, but last year the priority has been Cling. Which answers this:
The status is that we'd way underestimated the amount of work. When working with CINT or Reflex, it was always possible to have something functional, as parts that were not working yet, could be handled gracefully. Not so with Clang: if you don't handle something, it will exit one way or another (as it is supposed to do: a compiler stops on bad input). As such, it all took about a year longer, but since end of November last year, we now have a working bindings generator for Cling, but only on CPython. I've only just started (this week) with the groundwork of getting a back-end for cppyy into shape.
Calling from Cling into Python has always been part of it. From there, creating callable function pointers with a wrapper is not so much work. Derived classes is somewhat more convoluted, but also straightforward (has been done before, after all). (The bigger point of using Clang/LLVM was C++11, though, so that has some priority, although so far it seems to be in a rather good shape.) Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

For global functions, I've found helper with void* + cffi (the way you've described in your mail) enough.
Ok, so I think I'm really close to having something working now, except for one thing: I can't figure out how to access or pass a void pointer via cppyy. If I declare a C++ function/method as taking a "void *" argument, I can't figure out how to construct a Python object/value that it will accept; whatever I try (for example an int, such as what cppyy.addressof returns), I just get "TypeError: 'CPPInstance' object expected, got 'int' instead".. Is there some special way to construct a cppyy "void *" object or something I'm not aware of? I also noticed that when I tried defining a global "void *" variable, like so:

    void *voidp;

any attempt to access it from Python yields:
This feels like I must be missing something obvious here..? --Alex

So I actually worked around this problem by not using "void *" at all and passing around intptr_t values instead, which actually works better anyway since I realized I can pass them directly to/from cppyy.bind_object and cppyy.addressof (which makes the code come out much cleaner than I'd originally hoped, actually).. --Alex
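(For anyone following along, the round trip looks roughly like the sketch below; btCollisionObject is just an example class, and the dictionary name is a placeholder:)

    import cppyy
    cppyy.load_reflection_info("libBulletDict.so")      # hypothetical dictionary name

    obj  = cppyy.gbl.btCollisionObject()                # any bound class works the same way
    addr = cppyy.addressof(obj)                         # a plain integer, fits in an intptr_t

    # ... 'addr' travels through the C++ side as an intptr_t and comes back
    # later, e.g. as a callback argument ...

    same = cppyy.bind_object(addr, cppyy.gbl.btCollisionObject)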

Hi Alex, sorry for not responding earlier; had a bit of rough week at work.
So I actually worked around this problem by not using "void *" at all and passing around intptr_t values instead,
Yes, I was going to suggest that. :) But I'll first start implementing void* now. Later, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

sorry for not responding earlier; had a bit of rough week at work.
No worries.. I'm used to some lists where people can take a week or more to respond, so this is actually a refreshing change :)
But I'll first start implementing void* now.
That would be awesome, if it's not too much trouble.. Using intptr_t is working fairly well for the callback stuff, but there are some other places in the library where void pointers are passed around (for example, assigning "user data" to some types of objects), which I've been trying to figure out how I would deal with if I couldn't use "void*", so it would definitely be useful..

Sorry to be a pest, but I've got one more question: Currently, for the callbacks, I'm converting the C++ object pointer to an intptr_t, and then on the Python side using cppyy.bind_object to convert that intptr_t back to an object of the specified type. I've noticed, however, that if I (for example) call a function which returns an object of BaseClass, but the object being returned actually happens to be a DerivedClass object, cppyy (somehow) figures this out and it comes through as a DerivedClass automatically in Python (so you don't need to cast it). If, however, the same object gets passed back via a callback (and I use cppyy.bind_object(ptr, BaseClass)), it comes out as a BaseClass regardless (so the Python code needs to know it's actually a DerivedClass somehow, and manually cast it). Is there some way I could do something like bind_object, but have it do the automagical "figure out what derived class this actually is and return that instead" behavior that cppyy can apparently do in other places? --Alex

Hi Alex,
That would be awesome, if it's not too much trouble..
well, void* is trouble by definition, but it has to be dealt with. Doing something with it, is actually not that hard, but getting it consistent is. Some code is in, on the reflex-support branch (to be merged once it is documented): void* is represented as an array (I didn't see a PyCObject or PyCapsule outside of cpyext), both for returns and data members. There is a special case cppyy.gbl.nullptr (C++11 style), which is a unique object to represent NULL. ATM, both int(0) and None can also be used in place of nullptr, but the next step will be to remove that (None is void, not NULL; and 0 is archaic). Is also just a matter of being consistent. addressof() has been changed to accept an array, to return its start address as an integer; bind_object() can take both the result of addressof() as well as the array directly.
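(In usage terms, the above amounts to something like this sketch; get_user_ptr/set_user_ptr and SomeClass are hypothetical names used only for illustration:)

    import cppyy
    # get_user_ptr()/set_user_ptr() and SomeClass are hypothetical, standing in
    # for any C++ function returning/taking a void* and any bound class.
    g = cppyy.gbl

    p    = g.get_user_ptr()                  # a void* return comes back as an "array of void"
    addr = cppyy.addressof(p)                # start address of that array, as an integer
    obj1 = cppyy.bind_object(p, g.SomeClass)     # bind_object accepts the array directly...
    obj2 = cppyy.bind_object(addr, g.SomeClass)  # ...or the integer from addressof()

    g.set_user_ptr(g.nullptr)                # cppyy.gbl.nullptr: the unique NULL object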
Passing C++ instances through void* was already supported, and so that should be fine. There was an issue if the code started out with a void* (e.g. from a function call), but that now works as well. (That consistency thing again.)
Yes, this is based on C++'s (limited) RTTI: the dictionary stores class info both by (human readable) class name as well as by typeid. A pre-compiled (or generated, in the case of Cling) cast function to calculate the offset does the rest. (And if the inheritance is non-virtual, the JIT will elide all of that machinery.)
I've added a parameter 'cast' which can be set to True. Note that the call to bind_object already takes another boolean parameter, owns, so better use keyword args. Note that the class name and the pointer value still have to match (was already true before). There is one more caveat: if a void* pointer was first bound to a base class, and is later bound to a derived class (or vice versa), the memory regulator will not work, so any existing python object will not be reused. Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net
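(Put in code form, the behaviour described here looks roughly like this sketch; BaseClass and DerivedClass are hypothetical stand-ins for the real C++ classes:)

    import cppyy
    # BaseClass/DerivedClass are hypothetical; 'ptr' is e.g. what arrives through a callback.
    d   = cppyy.gbl.DerivedClass()
    ptr = cppyy.addressof(d)

    obj = cppyy.bind_object(ptr, cppyy.gbl.BaseClass)             # always a BaseClass proxy
    obj = cppyy.bind_object(ptr, cppyy.gbl.BaseClass, cast=True)  # downcasts to DerivedClass
    # bind_object also takes a boolean 'owns' parameter, hence keyword arguments:
    obj = cppyy.bind_object(ptr, cppyy.gbl.BaseClass, cast=True, owns=False)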

(I'm a little unclear what it means to represent "void*" as an array (array of what?), but the rest of cppyy seems to be pretty well thought out, so I'll trust you know what you're doing :) ) Out of curiosity, is there some reason you didn't name that cppyy.gbl.NULL instead? It seems to me that "nullptr" could potentially cause namespace collisions if some library decided to have a function/variable by that name, whereas "NULL" would be consistent with standard C++ naming..
Passing C++ instances through void* was already supported, and so that should be fine.
For clarification, the specific use case I'm looking at is objects/functions that use "void *" as an "opaque pointer to application data" that the library doesn't want/need to care about (just pass back to the application sometimes). In this context, it would be really nice if there was some easy way to use "void*" to represent a pure Python object as well (not just C++ instances). Obviously, in pypy objects don't have static addresses, so we can't just use a pointer to the Python object, but I was considering the possibility of taking an index or id() of the object and just casting it into "void*" form, and then converting it back and using it to look up objects in a list/dict somewhere.. (Alternately, it occurs to me now one could define a C++ "proxy" object which could store that sort of lookup information, and then the address of the proxy could be stored in the void*, which might be a little more kosher..) In any case, I wanted to make sure that use case was something that would be feasible in your plan for how void pointers will ultimately work.. --Alex

Hi Alex,
(I'm a little unclear what it means to represent "void*" as an array (array of what?),
exactly. A PyCObject is useless by design (but lacking in PyPy AFAICT, other than in cpyext, whose use would be slow), as is an array of void. It is the best equivalent I could come up with ... I figured that the only thing I ever wanted from a PyCObject, is to get the C-side address. That is why cppyy.addressof() does provide that for arrays of void now. Beyond that, PyCObjects serve to go from C -> Python -> C, with no work in Python. This void array can do that. The only part missing is an as_cobject (one that returns a cpyext-based, CPython PyCObject), to communicate with other extension libraries that use the CPython C-API. Such a function is needed for all other C++ instances also, so I'll add that (PyROOT has it to allow integration with PyQt, for example).
Is unlikely, as 'nullptr' is a C++11 keyword. I've been thinking of putting it under 'cppyy' rather than 'cppyy.gbl', but the latter seemed to make more sense (it's not a module feature, but a C++ feature).
Okay, I understand better now.
Obviously, in pypy objects don't have static addresses, so we can't just use a pointer to the Python object
That is solvable (cppyy can bind functions that take/provide PyObject* as it simply hands over the work to cpyext). See for example test_crossing.py in pypy/module/cppyy/test. The larger problem would be that any randomly given python object does not have a guaranteed layout.
Alex Pyattaev did this (is in the pypy-dev mail archives, somewhere around early Summer 2012).
In any case, I wanted to make sure that use case was something that would be feasible in your plan for how void pointers will ultimately work..
Let me know if you see anything missing. "void*" is an interesting puzzle, without many people (that I know of) interested in it. The much bigger pain that we have, is the use of "char*" where the intended usage is "byte*". (And I don't actually have a solution for that.) Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi, On Thu, Jan 23, 2014 at 10:36 PM, <wlavrijsen@lbl.gov> wrote:
This exists in CFFI (see ffi.new_handle()). More generally, wouldn't it make some sense to try to bring cppyy closer to cffi? It is a change of some of the basics, so only suitable for people without an existing code base. What I'm thinking of is if we could consider a cppyy-like interface as an extension of the cffi interface, like C++ is an extension of C, or if it doesn't really make sense. A bientôt, Armin.
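(For reference, the cffi handle mechanism mentioned here works roughly like this minimal sketch:)

    import cffi
    ffi = cffi.FFI()

    class AppData(object):
        def __init__(self, name):
            self.name = name

    data   = AppData("player1")
    handle = ffi.new_handle(data)      # an opaque void*-compatible cdata; keep it alive!

    # ... pass 'handle' into C/C++ as the void* "user data" pointer ...

    same = ffi.from_handle(handle)     # later, recover the original Python object
    assert same is data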

Hi Armin,
More generally, wouldn't it make some sense to try to bring cppyy closer to cffi?
yes, it does. Is high on my wish list (and not just the programmer-facing side, also the internals although there are several things that are close but don't quite fit). At the top is cling, though, and time is lacking. Of course, if you know a volunteer who can help here? :) Thanks, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Wim, On Fri, Jan 24, 2014 at 6:28 PM, <wlavrijsen@lbl.gov> wrote:
I meant specifically the way the interface is to be used by the end programmer, ignoring its implementation for now. This would mean moving to more cffi-like idioms: removing the implicit ownership-tracking logic, not guessing too hard about which overloaded function is meant to be called, and so on --- generally "explicit is better than implicit" and if the end programmer needs to expose some library in a simpler interface, he can use a Python layer on top of it (or write his own). So this would end up somewhere slightly different than where cppyy is currently. A bientôt, Armin.

Hi Armin,
that is all possible, and largely there, just the other way around from what you describe:
By default, there is memory tracking, auto-casting, overloading, template instantiation (with cling; partially with cint), etc. And if that is not desired, a Python layer can be written to do things differently. E.g. to select a specific overload, use the "__dispatch__" function. I don't think that that will be workable, though. Maybe I should have bitten on the "like C++ is an extension of C" comment. :) Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Wim, On Fri, Jan 24, 2014 at 6:56 PM, <wlavrijsen@lbl.gov> wrote:
So you're basically answering "no" :-) As you're describing it, cppyy is at the "ctypes" level of providing automatic everything and lots of options and ways to work around them if they are not wanted. CFFI is at the opposite end of the spectrum on this respect. But that is for C. I don't have enough practical knowledge of C++ to judge how viable a "CPPFFI" would be. A bientôt, Armin.

Hi Armin,
So you're basically answering "no" :-)
no, I'm not saying no. (Okay, now I did twice. :) ) I did say that it was high on my wish list, after all. I know that there are many people who like to work the explicit way, which is why such an interface needs to be provided. And it can be done, as it already (mostly) exists, just scattered. Further, module level stuff already lives on 'cppyy.' and C++ stuff lives on 'cppyy.gbl'. So I think that this can be totally supported and faithfully follow CFFI, without clashing. I just think that it's a historic artefact that folks want this, not that they have good reasons for it. CFFI has an additional advantage, by not having heavy dependencies. Good luck writing your own pycppparser, though. :/ (Separate from ABI concerns.) So if with C++, a full C++ compiler is pulled into the mix anyway, why be content with so much hand-coding?
As you're describing it, cppyy is at the "ctypes" level of providing automatic everything
There's nothing automatic about ctypes. :? What I mean is something like e.g. std::map<MyKey, MyValue>. Why would anyone want to re-invent the pythonizations, if one can just do:

    for key, value in my_cpp_map:
        # ... whatever

directly from the automatic bindings and pythonizations provided by cppyy?
and lots of options and ways to work around them if they are not wanted
Right, but it's work either way? If no automation is provided, then the work that could have been automated needs to be provided by hand. To me the equation seems to be that I'd rather have the automation get it 95% right, with 5% (possibly frustrating) fixup, than do 100% by hand even if that gives me (non-frustrating) full control. I know that some people do not agree with that, which is why I DO want to have a CPPFFI, so that I can simply point them to that and let them have at it. It's their own feet, after all. :)
I don't have enough practical knowledge of C++ to judge how viable a "CPPFFI" would be.
C++11 is the big deal. I'm not saying that all of a sudden folks are going to write better quality code (not to mention the installed base), but the standards committee seems to have finally gotten it in their heads that if intentions are not specified in the interface, it creates problems for humans. I expect a lot of cleanup there, so that automation can grow from 95% right to 99% right or so, becoming a total no-brainer.

For example, with move semantics, returning an std::vector from a function is almost as cheap as returning a pointer to an array, without any of the questions about object ownership. (And don't fall over that "almost": most likely the extra 8 bytes of the size data member that are copied over in the case of the vector sit on the same cache line anyway, with the highest cost being the pointer indirection in both cases.)

Of course, on top of all that, there will always be folks who think that such automation produces something that "feels" too much like C++ and too little like python. And they're highly likely to be right. But designing such a python-like API is arguably more pleasant on top of C++-like python than on C++ directly.

Anyway. :) As long as it is clear I didn't say "no", but said "after the cling backend is in." (And yes, given the delays, I know that reads a bit like "when pigs fly," but that's really not how I intend it.) Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

On Fri, Jan 24, 2014 at 12:41 PM, Armin Rigo <arigo@tunes.org> wrote:
Speaking as somebody who is currently using cppyy to build a Python binding for a large already-existing third-party C++ library, I have to say I think that moving cppyy in that direction would be a substantial step *backwards*, making cppyy substantially less valuable and useful. Please don't do this.

The CFFI way of approaching things works well for C, because C was already designed as a pretty explicit language to begin with. The problem, however, is large portions of the C++ language itself are designed with the assumption that lots of implicit stuff will be done behind the scenes by the compiler to make it something a human can actually use. If you take that away, C++ becomes much, much more cumbersome and onerous (verging on completely unusable), and I believe any Python binding that takes that away would also be similarly cumbersome. The end result would be something where everyone would *have* to code a (fairly complex) extra Python interface layer for every single class they wanted to expose, just to make it usable in Python in a way that approximates how it's intended to be used in C++ (which as a secondary issue would probably make it all slower, to boot).

In my opinion, the thing that currently makes cppyy one of the best cross-language-binding implementations I've ever seen is that it almost completely eliminates the need to do any of that. I was able to make a working binding (that works almost exactly the same way the C++ docs say it should) for somewhere over 90% of this roughly 150,000 line C++ library just by writing up a short 30-line XML file and running it through a couple of utilities to build it.. it's only the last 5-10% or so I'm having to work on now by hand and write extra interface logic for (and much of that promises to be fixed with cling). That is, frankly, amazingly valuable to me.

In general, I'm a big fan of "explicit is better than implicit", but I do not believe that that always translates to cross-language bindings. The entire point of good language bindings is to do as much of the translation work automatically as possible so you don't have to write it all yourself every single time. If the language you're translating to is designed to do lots of things implicitly, it's really a practical necessity that the binding do the same, or it's really only doing half its job.

(I agree that there are a few areas where cppyy could be better at *allowing* you to be explicit if you want to, but that does not sound like what you were proposing.. Apologies if I misunderstood anything.) --Alex

Hi Alex, On Fri, 24 Jan 2014, Alex Stewart wrote:
no, that's not the plan. The two approaches can live next to each other.
I agree that there are a few areas where cppyy could be better at *allowing* you to be explicit if you want to
Right, but I've never found an occasion where a C++ helper did not resolve such problems. And with Cling, these helpers can be embedded (CFFI-style), so that may be enough of a solution. (Dunno, is a matter of try-and-see.) Btw., just to show I have some code (and to allow anyone to comment if there are any takers), I've checked in some code that is the first step towards a Cling backend. Is still far from functional, but I have to start somewhere. Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net
