Hi Wim,

After reviewing your comments, I propose to check out rootcling. I initially had some trouble using pip3 to install the newer code, but that seems to have been resolved as of yesterday's 0.2.3 build. I did notice one message during the install which seems to be benign, so I mention it here merely in passing:

  Running command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-spz01kkp/cppyy-backend/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpe2h6yls0pip-wheel- --python-tag cp36
  running bdist_wheel
  running build
  running build_ext
  error: [Errno 2] No such file or directory: 'cling-config': 'cling-config'
  error
  Failed building wheel for cppyy-backend
  Running setup.py clean for cppyy-backend

I'll no doubt be back with questions :-).

Thanks for all the good work, Shaheed

On 23 September 2017 at 06:24, Shaheed Haque <srhaque@theiet.org> wrote:
Wim,
Thanks for the detailed and thoughtful reply. I will digest and respond when I am properly back in circulation.
On 15 September 2017 at 07:43, <wlavrijsen@lbl.gov> wrote:
Shaheed,
Ah, I had not realised rootcling existed. I've seen that I can invoke it using Python version-specific paths...is this the correct way to invoke it:
ROOTCLING=/usr/local/lib/python3.6/dist-packages/cppyy_backend LD_LIBRARY_PATH=$ROOTCLING/lib $ROOTCLING/bin/rootcling -h
Yes, and here's a description of the LinkDef.h format:
https://root.cern.ch/root/html/guides/users-guide/AddingaClass.html#the-link...
or is there a recommended wrapper?
No, but I'm going to add one for pip, same as I did for genreflex. I've been fleshing out the backend generation, taken over from Anto:
https://bitbucket.org/wlav/cppyy-backend
where all that can live. I'm told that I'll need rootcling anyway for use of modules (see below).
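Until such a wrapper ships, the manual invocation quoted above can be captured in a few lines of Python. This is only a sketch, not the planned pip wrapper: the helper name rootcling_command is hypothetical, and the install path is simply the one from the command line above and will differ per Python version.

```python
import os


def rootcling_command(backend_dir, args, base_env=None):
    """Return (argv, env) for running rootcling from a cppyy_backend install.

    Hypothetical helper: it only reproduces the manual
    LD_LIBRARY_PATH=$ROOTCLING/lib $ROOTCLING/bin/rootcling invocation.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(backend_dir, "lib")
    # Prepend the backend's lib directory, keeping any existing search path.
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    argv = [os.path.join(backend_dir, "bin", "rootcling")] + list(args)
    return argv, env


# Path taken from the command quoted above; adjust for the local installation.
argv, env = rootcling_command(
    "/usr/local/lib/python3.6/dist-packages/cppyy_backend", ["-h"], base_env={})
# To actually run it: subprocess.run(argv, env=env, check=True)
```

Returning the argument list and environment, rather than running the tool directly, keeps the sketch usable on machines where rootcling is not installed.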
I actually get some warnings and then the error:
Add this set of exclusions to the selection.xml:
<exclusion>
  <class pattern="*thread_mutex*" />
  <class pattern="*new_allocator*" />
  <class pattern="*Alloc_hider*" />
</exclusion>
Of course, the larger problem of pulling in these standard libs over and over again is that it is a waste of cpu and memory, so I do want to see the file_name attribute fixed. As it stands, I'd simply exclude:
  <class pattern="std::*" />
  <class pattern="__gnu_cxx::*" />
especially since they are already available by default. Note that those two rules cover the ones needed for new_allocator and Alloc_hider.
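Why the two namespace-wide rules subsume the narrower ones can be checked with a few lines of Python. This is a sketch only: it uses fnmatch as a stand-in for genreflex's own pattern matching (which may differ in detail), and the fully qualified class names are assumptions based on where libstdc++ usually places these types.

```python
from fnmatch import fnmatchcase

# The two namespace-wide exclusion patterns suggested above.
EXCLUSIONS = ["std::*", "__gnu_cxx::*"]


def is_excluded(qualified_name, patterns=EXCLUSIONS):
    """True if any shell-style pattern matches the qualified class name."""
    return any(fnmatchcase(qualified_name, p) for p in patterns)


# Assumed qualified names (typical for libstdc++), plus one KDE class.
names = [
    "__gnu_cxx::new_allocator<char>",
    "std::basic_string<char>::_Alloc_hider",
    "KJSObject",
]
excluded = [n for n in names if is_excluded(n)]
```

Under these assumptions, both standard library helpers end up in excluded, while KJSObject does not.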
However, there is a more efficient approach that is right around the corner (and has been right around the corner for a long time, so don't hold me to that). The next release now seems likely, though.
The long term goal has always been to use modules:
http://clang.llvm.org/docs/Modules.html
but the original drivers (Apple, Google, and the C++ standards committee) have been going back and forth on it. Now, things are finally falling into place. Here's Google:
https://www.youtube.com/watch?v=dHFNpBfemDI
And here's ROOT:
https://indico.cern.ch/event/643728/contributions/2612822/attachments/149407...
The big deal is that C++ developers have an incentive to deploy modules, so being able to patch into that should be a huge time saver (and where they don't, rootcling will soon be able to create modules from headers). Note that modules don't come for free: it will require some ambiguity resolution, but that is typically a Good Thing (code-quality wise).
Modules allow deserialization of only the piece of the AST that is actually being requested, saving memory. This is in contrast to header files (whether or not precompiled), which pull in everything that precedes them. See the status report above for the improvements in memory usage.
And with modules, of course, selection becomes unnecessary (markup for automatic streamers may still be useful, but that is not relevant for bindings generation).
I did wonder if I was missing some "-isystem" includes, and tried adding them but the --debug output from genreflex seemed to suggest they were being ignored.
Some flags are ignored as no-one was using them (so far). Some others are definitely obsolete by now.
What is interesting, and might possibly throw light on the selection filter issue, is that the file name for the classes in kjsinterpreter.h itself is always the empty string ''. Classes that come from included files return non-empty strings such as 'kjsobject.h' for 'KJSObject'.
That's after the fact (i.e. what is stored); I don't see the rule being respected/used at all.
BTW, the reason for doing this is that lots of KDE code has multiple classes and even namespaces in a single header file. Now, for discoverability of the loaded objects, I find the incremental "pop into cppyy.gbl on demand" somewhat limiting and I wanted to play about with that. I could also work around the filter issue if I precomputed the needed names in a precursor pass.
The issue here is the memory cost of loading things that won't get used in the end. This is why a functional dir() (which needs nothing but strings, after all), in conjunction with lazy loading/creation when a real access happens, works well. LLVM is fully lookup based, btw. There is a custom layer on top of Cling to make enumeration possible.
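The combination of a string-only dir() with creation on first access can be illustrated in plain Python. This is a minimal sketch and not how cppyy implements its namespaces; LazyNamespace and the loader function are hypothetical names introduced for illustration.

```python
class LazyNamespace:
    """Lists names cheaply; creates the real object only on first access."""

    def __init__(self, names, loader):
        self._names = set(names)   # nothing but strings
        self._loader = loader      # called once per name actually used

    def __dir__(self):
        # Enumeration costs no more than the list of names.
        return sorted(self._names)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)
        obj = self._loader(name)
        setattr(self, name, obj)   # cache, so the loader runs only once
        return obj


created = []


def loader(name):
    """Hypothetical stand-in for the expensive binding creation."""
    created.append(name)
    return type(name, (), {})


ns = LazyNamespace(["KJSObject", "KJSInterpreter"], loader)
listing = dir(ns)      # both names visible, nothing created yet
cls = ns.KJSObject     # first real access triggers creation
```

After these lines, created holds only "KJSObject": listing the namespace did not load anything, and KJSInterpreter costs nothing until it is touched.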
Finally, and most importantly given the fidelity with which cppyy renders the C++ code, I'm thinking about how Pythonisation customisation might be handled: e.g. a Python wrapper layer to allow a pointer-plus-size to render as a Python list/tuple, or generate a dict mapping for a QSet, and so on. (I'm dimly aware of the boost-recognition logic you have alluded to; this is specifically more about Qt-specific patterns and ad-hoc scenarios.)
In 2015, a GSoC student fleshed this out. I never put it into PyPy b/c of a lack of test coverage, but did put it in PyROOT. Here's an example of the "pointer-plus-size" pythonization (from ROOT.py):
  # python side pythonizations (should live in their own file, if we get many)
  def set_size(self, buf):
      buf.SetSize(self.GetN())
      return buf

  # TODO: add pythonization API to pypy-c
  if not PYPY_CPPYY_COMPATIBILITY_FIXME:
      cppyy.add_pythonization(
          cppyy.compose_method("^TGraph(2D)?$|^TGraph.*Errors$", "GetE?[XYZ]$", set_size))
The functions selected by the regexps return naked pointers, but the object can be queried for the size (all have a consistent GetN() function). So the method composer patches up the return value, making it a sized array, instead of an "open-ended" one.
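The idea of a method composer can be shown with a self-contained toy version. This is a sketch of the technique only, not the actual cppyy.compose_method implementation: the pythonizor signature, the use of re.match, and the Buffer and TGraph stand-in classes are all assumptions made for illustration.

```python
import functools
import re


def compose_method(class_pattern, method_pattern, post):
    """Build a pythonizor that post-processes results of matching methods."""
    class_re = re.compile(class_pattern)
    method_re = re.compile(method_pattern)

    def wrap(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            # Call the original method, then let 'post' patch up the result.
            return post(self, func(self, *args, **kwargs))
        return wrapper

    def pythonizor(cls, name):
        if not class_re.match(name):
            return
        for attr, func in list(vars(cls).items()):
            if callable(func) and method_re.match(attr):
                setattr(cls, attr, wrap(func))

    return pythonizor


class Buffer:
    """Stand-in for an open-ended low-level array."""
    def __init__(self, data):
        self.data, self.size = data, None

    def SetSize(self, n):
        self.size = n


class TGraph:
    """Stand-in for the bound C++ class."""
    def __init__(self, xs):
        self._xs = xs

    def GetN(self):
        return len(self._xs)

    def GetX(self):
        return Buffer(self._xs)

    def GetName(self):
        return "graph"


def set_size(self, buf):
    buf.SetSize(self.GetN())
    return buf


pythonizor = compose_method("^TGraph(2D)?$|^TGraph.*Errors$", "GetE?[XYZ]$", set_size)
pythonizor(TGraph, "TGraph")
g = TGraph([1.0, 2.0, 3.0])
```

With the pythonizor applied, g.GetX() comes back with its size set from GetN(), while GetName(), which the method regexp does not select, is left untouched.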
I'm sitting on some patches as I wanted to tweak his APIs a bit. There was some ordering that I felt didn't compose well, but that is minor.
Similarly, there's code to apply ownership rules, mapping exceptions, the new C++11 smartptrs, controlling auto-casting, handling the GIL, making properties, and adding overloads. All driven by regexp matching of patterns. See here:
https://bitbucket.org/wlav/cppyy/src/4d14ba325e494f13cc11f3f11cbb87b44048b25...
(plus further support inside the bindings layer itself).
Of course, one can hook up completely custom functions, and he made it so that that is per C++ namespace, so nicely self-contained.
Again, this is currently only partly available, as I need to write a lot more tests for PyPy (which are bound to unearth some problems along the way). And then there is documentation to be written ...
P.S. Please note that after today, I'll likely not have much Internet access for a couple of weeks, so any responses may be limited.
I'll make sure I have at least all my local changes pushed by then. :)
Best regards, Wim -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net