[pypy-dev] Python FFI

wlavrijsen at lbl.gov wlavrijsen at lbl.gov
Wed May 16 19:32:07 CEST 2012


Hi Stefan,

> Hmm, are you sure? Pyrex is pretty old, too. Cython's C++ support certainly
> is a lot younger, though. Well, anyway...

we've been parsing C++ since C++ itself was a spring chicken. I don't know
when Pyrex started (and it doesn't seem to parse headers AFAICS?), but unless
you tell me it was the early nineties ... The Python bindings came later for
us (2002), and again, no other technology would have flown: we simply would
not be using Python today in HEP if I had followed any other route.

> My point is that there are too many ways to extract the declarations of a
> C/C++ API already and each of them is tied to a specific tool and thus hard
> to reuse elsewhere.

This is true, and most tools fall short of covering all tasks: C++ parsing
is hard, so everyone always settles on a subset. Clang finally changes this
picture, which is why we are betting on it. The problem is inertia, and the
fact that the reflection info is also used for I/O (that was its original
purpose). Transitioning is painful, as not everyone here is convinced that
letting the tech that reads our data be an external tool is acceptable, given
that we're thinking on timescales of the order of decades, and you never know
when anything not in-house goes dodo-bird or is taken private.

>> Now it's just being re-used and re-implemented for PyPy in a way that fits
>> PyPy best.
>
> Ok, my bad then. Get in line with SWIG. ;)

Not sure I'm following that one: back in the day, we tried SWIG, but that
wasn't up to the task (it's much better today, but not yet good enough).

The point I'm making is that I could get the original CPython extension code
to work on PyPy. However, then the Python side is fast (b/c of the JIT), and
the C++ side is fast (b/c it's C++), leaving the bindings as the bottleneck,
and we'd have achieved very little. The same issue holds for SWIG.

With cppyy, which uses libffi underneath and thus readily builds on the work
done by the core PyPy folks for ctypes, virtually all call overhead can be
removed. The only overhead left is a couple of guards. However, if the C++
body isn't completely trivial, then out-of-order execution and hyperthreading
happily munch those guards on the side. IOW, on a modern CPU that isn't
completely overcommitted, the cost of calling from Python into C++ can be as
low as the cost of calling from C++ into C++. (And it's only a fraction more
if overcommitted.)
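
To give a rough feel for the per-call cost of a bindings layer, here is a
minimal sketch (not cppyy itself) that times a libffi-based call through
ctypes against the equivalent Python built-in. It assumes a POSIX system
where ctypes.CDLL(None) exposes the C library's abs(); time_per_call is a
hypothetical helper written for this illustration, and the numbers it prints
will vary with machine and interpreter.

```python
import ctypes
import timeit

# Assumption: POSIX system. CDLL(None) gives access to the symbols already
# loaded into the running interpreter, which includes the C library.
libc = ctypes.CDLL(None)

# Declare the C signature of abs() so the arguments and the return value
# are marshalled as plain C ints.
c_abs = libc.abs
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int


def time_per_call(func, arg, number=100_000):
    """Return the average wall-clock time in seconds of one func(arg) call.

    Hypothetical helper for this sketch; the lambda adds a small, equal
    overhead to both measurements.
    """
    return timeit.timeit(lambda: func(arg), number=number) / number


# Compare a call that crosses the Python/C boundary via ctypes with the
# Python built-in that does the same work.
builtin_cost = time_per_call(abs, -42)
ffi_cost = time_per_call(c_abs, -42)

print(f"built-in abs: {builtin_cost * 1e9:8.1f} ns/call")
print(f"ctypes abs:   {ffi_cost * 1e9:8.1f} ns/call")
```

Since abs() itself does almost no work, whatever difference shows up between
the two lines is essentially the cost of crossing the boundary, which is the
overhead discussed above.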

That's the point of cppyy, not the parsing/extracting part. The parsing part
is always an external tool. Whether that is Reflex, CINT, Cling, or Cython++
(when that becomes available for use) makes no difference. Build it (to the
same level that any of those other tools are at today), and I'll add a
back-end.

Best regards,
            Wim
-- 
WLavrijsen at lbl.gov    --    +1 (510) 486 6411    --    www.lavrijsen.net

