[pypy-dev] Automated binding generation (and maintenance)

Shaheed Haque srhaque at theiet.org
Wed Oct 11 05:29:38 EDT 2017


Hi Wim,

After reviewing your comments, I propose to check out rootcling. I
initially had some trouble using pip3 to install the newer code, but
that seems to have been resolved as of yesterday's 0.2.3 build. I did
notice one message during the install which seems to be benign, so I
mention it here merely in passing:

  Running command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-spz01kkp/cppyy-backend/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpe2h6yls0pip-wheel- --python-tag cp36
  running bdist_wheel
  running build
  running build_ext
  error: [Errno 2] No such file or directory: 'cling-config': 'cling-config'
error
  Failed building wheel for cppyy-backend
  Running setup.py clean for cppyy-backend

I'll no doubt be back with questions :-).

Thanks for all the good work, Shaheed



On 23 September 2017 at 06:24, Shaheed Haque <srhaque at theiet.org> wrote:
> Wim,
>
> Thanks for the detailed and thoughtful reply. I will digest and
> respond when I am properly back in circulation.
>
> On 15 September 2017 at 07:43,  <wlavrijsen at lbl.gov> wrote:
>> Shaheed,
>>
>>> Ah, I had not realised rootcling existed. I've seen that I can invoke
>>> it using Python version-specific paths...is this the correct way to
>>> invoke it:
>>>
>>> ROOTCLING=/usr/local/lib/python3.6/dist-packages/cppyy_backend
>>> LD_LIBRARY_PATH=$ROOTCLING/lib $ROOTCLING/bin/rootcling -h
>>
>>
>> Yes, and here's a description of the LinkDef.h format:
>>
>>
>> https://root.cern.ch/root/html/guides/users-guide/AddingaClass.html#the-linkdef.h-file
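The invocation quoted above can also be assembled from Python. This is a minimal sketch, assuming the bin/ and lib/ layout under cppyy_backend shown in the quoted command; rootcling_command and its base_env parameter are hypothetical names used for illustration and are not part of cppyy-backend. It only builds the argument list and environment, it does not run anything.

```python
import os
from pathlib import PurePosixPath


def rootcling_command(backend_dir, *args, base_env=None):
    """Build (argv, env) for the rootcling binary bundled under backend_dir.

    Hypothetical helper: assumes backend_dir contains bin/rootcling and lib/,
    as in the shell invocation quoted in the thread.
    """
    backend = PurePosixPath(backend_dir)
    env = dict(os.environ if base_env is None else base_env)

    # Prepend the bundled lib directory, keeping any existing search path.
    lib_dir = str(backend / "lib")
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")

    argv = [str(backend / "bin" / "rootcling"), *args]
    return argv, env
```

The result can be handed to subprocess.run(argv, env=env) with the dist-packages path from the message and "-h" as the only argument.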
>>
>>> or is there a recommended wrapper?
>>
>>
>> No, but I'm going to add one for pip, same as I did for genreflex. I've
>> been fleshing out the backend generation, taken over from Anto:
>>
>>   https://bitbucket.org/wlav/cppyy-backend
>>
>> where all that can live. I'm told that I'll need rootcling anyway for
>> use of modules (see below).
>>
>>> I actually get some warnings and then the error:
>>
>>
>> Add this set of exclusions to the selection.xml:
>>
>> <exclusion>
>>    <class pattern="*thread_mutex*" />
>>    <class pattern="*new_allocator*" />
>>    <class pattern="*Alloc_hider*" />
>> </exclusion>
>>
>> Of course, the larger problem of pulling in these standard libs over and
>> over again is that it is a waste of cpu and memory, so I do want to see
>> the file_name attribute fixed. As it stands, I'd simply exclude:
>>
>>    <class pattern="std::*" />
>>    <class pattern="__gnu_cxx::*" />
>>
>> especially since they are already available by default. Note that those two
>> rules cover the ones needed for new_allocator and Alloc_hider.
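To see why the two broad rules subsume the narrower new_allocator and Alloc_hider patterns, the matching can be approximated in Python. This is only a sketch: it assumes the selection patterns behave like shell-style wildcards (approximated here with fnmatch), and the fully qualified class names are illustrative libstdc++ names, not taken from the actual warnings.

```python
from fnmatch import fnmatchcase

# The two broad exclusion rules suggested in the message.
BROAD_RULES = ["std::*", "__gnu_cxx::*"]


def excluded(class_name, patterns=BROAD_RULES):
    """Return True if any wildcard pattern matches the qualified class name.

    Assumption: selection patterns are treated as shell-style globs.
    """
    return any(fnmatchcase(class_name, pattern) for pattern in patterns)


# Illustrative names only (assumed, not copied from the genreflex output).
for name in (
    "__gnu_cxx::new_allocator<char>",
    "std::__cxx11::basic_string<char>::_Alloc_hider",
    "KJSObject",
):
    print(name, excluded(name))
```

Under these assumptions both standard-library names are caught by the broad rules, while a user class such as KJSObject is not.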
>>
>> However, there is a more efficient approach that is right around the corner
>> (and has been right around the corner for a long time, so don't hold me to
>> that). It now seems likely for the next release, though.
>>
>> The long term goal has always been to use modules:
>>
>>   http://clang.llvm.org/docs/Modules.html
>>
>> but the original drivers (Apple, Google, and the C++ standards committee)
>> have been going back and forth on it. Now, things are finally falling into
>> place. Here's Google:
>>
>>   https://www.youtube.com/watch?v=dHFNpBfemDI
>>
>> And here's ROOT:
>>
>>
>> https://indico.cern.ch/event/643728/contributions/2612822/attachments/1494074/2323893/ROOTs_C_modules_status_report.pdf
>>
>> The big deal is that C++ developers have an incentive to deploy modules, so
>> being able to patch into that should be a huge time saver (and where they
>> don't, rootcling will soon be able to create modules from headers). Note
>> that modules don't come for free: it will require some ambiguity resolution,
>> but that is typically a Good Thing (code-quality wise).
>>
>> Modules allow deserialization of only the piece of the AST that is actually
>> being requested, saving memory. This is in contrast to header files (whether
>> or not precompiled), which pull in everything before them. See the status
>> report above for the improvements in memory usage.
>>
>> And with modules, of course, selection becomes unnecessary (markup for
>> automatic streamers may still be useful, but that is not relevant for
>> bindings generation).
>>
>>> I did wonder if I was missing some "-isystem" includes, and tried
>>> adding them but the --debug output from genreflex seemed to suggest
>>> they were being ignored.
>>
>>
>> Some flags are ignored as no-one was using them (so far). Some others
>> are definitely obsolete by now.
>>
>>> What is interesting, and might possibly throw light on the selection
>>> filter issue, is that the file name for the classes in
>>> kjsinterpreter.h itself is always the empty string ''. Classes that
>>> come from included files return non-empty strings such as
>>> 'kjsobject.h' for 'KJSObject'.
>>
>>
>> That's after the fact (i.e. what is stored); I don't see the rule being
>> respected/used at all.
>>
>>> BTW, the reason for doing this is that lots of KDE code has multiple
>>> classes and even namespaces in a single header file. Now, for
>>> discoverability of the loaded objects, I find the incremental "pop
>>> into cppyy,gbl on demand" somewhat limiting and I wanted to play about
>>> with that. I could also workaround the filter issue if I precomputed
>>> the needed names in a precursor pass.
>>
>>
>> The issue here is the memory cost of loading things that won't get used
>> in the end. This is why a functional dir() (which needs nothing but
>> strings, after all), in conjunction with lazy loading/creation when a
>> real access happens, works well. LLVM is fully lookup based, btw. There
>> is a custom layer on top of Cling to make enumeration possible.
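The combination of a string-only dir() with creation on first access can be sketched in plain Python. This is not how cppyy implements it; LazyNamespace and its factory argument are hypothetical names, and the factory stands in for whatever actually builds the bound class.

```python
class LazyNamespace:
    """Namespace that lists names cheaply and builds objects on first access."""

    def __init__(self, known_names, factory):
        self._known = frozenset(known_names)  # strings only, cheap to hold
        self._factory = factory               # called once per accessed name

    def __dir__(self):
        # Enumeration needs nothing but the names themselves.
        return sorted(self._known)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_") or name not in self._known:
            raise AttributeError(name)
        value = self._factory(name)
        setattr(self, name, value)  # cache, so the factory runs once per name
        return value
```

With names such as "KJSObject" and "KJSInterpreter" registered, dir() shows both without building either, and the factory runs only for the one that is actually touched.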
>>
>>> Finally, and most importantly given the fidelity with which cppyy
>>> renders the C++ code, I'm thinking about how Pythonisation customisation
>>> might be handled: e.g. a Python wrapper layer to allow a
>>> pointer-plus-size to render as a Python list/tuple, or generate a dict
>>> mapping for a QSet, and so on. (I'm dimly aware of the
>>> boost-recognition logic you have alluded to, this is specifically more
>>> about Qt-specific patterns and ad-hoc scenarios).
>>
>>
>> In 2015, a GSoC student fleshed this out. I never put it into PyPy b/c of
>> a lack of test coverage, but did put it in PyROOT. Here's an example of
>> the "pointer-plus-size" pythonization (from ROOT.py):
>>
>>     # python side pythonizations (should live in their own file, if we get many)
>>       def set_size(self, buf):
>>          buf.SetSize(self.GetN())
>>          return buf
>>
>>     # TODO: add pythonization API to pypy-c
>>       if not PYPY_CPPYY_COMPATIBILITY_FIXME:
>>          cppyy.add_pythonization(
>>             cppyy.compose_method("^TGraph(2D)?$|^TGraph.*Errors$", "GetE?[XYZ]$", set_size))
>>
>> The functions selected by the regexps return naked pointers, but the object
>> can be queried for the size (all have a consistent GetN() function). So the
>> method composer patches up the return value, making it a sized array,
>> instead of an "open-ended" one.
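The idea of a method composer driven by regexps can be illustrated without cppyy. The sketch below is a stand-in, not cppyy's implementation: this compose_method only mimics the shape of the call quoted above, the (class, name) calling convention of the returned function is an assumption, and FakeBuffer and TGraph are toy classes invented for the example.

```python
import functools
import re


def compose_method(class_pattern, method_pattern, post):
    """Return a function that wraps matching methods of matching classes.

    Stand-in for illustration only; matching uses re.match as an assumption.
    """
    class_re = re.compile(class_pattern)
    method_re = re.compile(method_pattern)

    def pythonizor(cls, class_name):
        if not class_re.match(class_name):
            return
        for attr, func in list(vars(cls).items()):
            if callable(func) and method_re.match(attr):
                setattr(cls, attr, _compose(func, post))

    return pythonizor


def _compose(func, post):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Let the post-processor patch up the original return value.
        return post(self, func(self, *args, **kwargs))
    return wrapper


class FakeBuffer:
    """Toy 'open-ended' buffer: it does not know its own size until told."""
    def __init__(self, data):
        self.data, self.size = data, None

    def SetSize(self, n):
        self.size = n


class TGraph:
    """Toy class with a consistent GetN() and a pointer-like GetX()."""
    def __init__(self, xs):
        self._xs = list(xs)

    def GetN(self):
        return len(self._xs)

    def GetX(self):
        return FakeBuffer(self._xs)


def set_size(self, buf):
    buf.SetSize(self.GetN())
    return buf


compose_method("^TGraph(2D)?$|^TGraph.*Errors$", "GetE?[XYZ]$", set_size)(TGraph, "TGraph")
```

After the last line, TGraph.GetX() returns a buffer whose size has been filled in from GetN(), while GetN() itself is left untouched because its name does not match the method pattern.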
>>
>> I'm sitting on some patches as I wanted to tweak his APIs a bit. There
>> was some ordering that I felt didn't compose well, but that is minor.
>>
>> Similarly, there's code to apply ownership rules, mapping exceptions,
>> the new C++11 smartptrs, controlling auto-casting, handling the GIL, making
>> properties, and adding overloads. All driven by regexp matching of patterns.
>> See here:
>>
>>
>> https://bitbucket.org/wlav/cppyy/src/4d14ba325e494f13cc11f3f11cbb87b44048b256/python/cppyy/_pythonization.py?at=master
>>
>> (plus further support inside the bindings layer itself).
>>
>> Of course, one can hook up completely custom functions, and he made it so
>> that that is per C++ namespace, so nicely self-contained.
>>
>> Again, this is currently only partly available, as I need to write a lot
>> more tests for PyPy (which are bound to unearth some problems along the
>> way). And then there is documentation to be written ...
>>
>>> P.S. Please note that after today, I'll likely not have much Internet
>>> access for a couple of weeks, so any responses may be limited.
>>
>>
>> I'll make sure I have at least all my local changes pushed by then. :)
>>
>>
>> Best regards,
>>            Wim
>> --
>> WLavrijsen at lbl.gov    --    +1 (510) 486 6411    --    www.lavrijsen.net

