GSoC 2015: cpyext project?

Hi all,

I've posted a couple of times here before: I maintain a Python extension for GPGPU linear algebra [1], but it uses Boost.Python. I do most of my scientific computing in Python, but I am often forced to use CPython where I would prefer PyPy, largely because of the availability of extensions.

I'm looking for an interesting Google Summer of Code project for next year, and would like to continue working on things that make high-performance computing in Python straightforward. In particular, I've had my eye on the 'optimising cpyext' [2] project for a while: might work in that area be available? I notice that it is described with difficulty 'hard', so I'm keen to enquire early, to get up to speed before making a potential application in the spring.

I would love to work on getting cpyext into good enough shape that both Cython and Boost.Python extensions are functional with minimal effort on the part of the user. Does anyone have any advice? Are there particular things I should familiarise myself with? I know there is the module/cpyext tree, but it is quite formidable for the uninitiated!

Of course, I recognise that cpyext is a much trickier proposition than things like cffi and cppyy. In particular, I'm very excited by cppyy and PyCling, but they seem quite bound up in CERN's ROOT infrastructure, which is a shame. It's also clear that very many useful extensions currently use the CPython API, and so -- as I have often found -- the apparent relative immaturity of cpyext keeps people away from PyPy, which is also a shame!

[1] https://pypi.python.org/pypi/pyviennacl
[2] https://bitbucket.org/pypy/pypy/wiki/GSOC%202014

Best,

Toby

--
Toby St Clere Smithe
http://tsmithe.net

Hello Toby,

Overall it's a nice goal, but I don't think that improving cpyext is easy. Its goal is to reproduce the CPython API, in all its details and caveats. I will list some of them to explain why I think it's a difficult task:

- First, PyPy objects have no fixed layout exposed to C code. For example, PyPy has multiple implementations of lists and dicts, which are chosen at runtime and can even change when the object is mutated, so all the concrete functions of the CPython API need to go through the abstract object interface (e.g. PyList_GET_ITEM is not a C macro, but a Python call to type(x).__getitem__, fetched from the class dictionary).

- Then, PyPy uses a moving garbage collector, which moves allocated objects when they survive the first collection. This is not what users of PyObject* pointers expect: the address has to stay the same for the life of the object. So cpyext allocates a PyObject struct at a fixed address, and uses a mapping (expensive!) each time the object crosses the boundary between the interpreter and the C extension. There is even an ob_refcount field, which keeps track of the number of references held in C code; borrowed references were a nightmare to implement correctly, and I'm sure we don't correctly handle circular references between PyObjects...

- Finally, there is a lot of code that directly accesses C struct members (very common: obj->ob_type->tp_name). So each time an object goes from Python to the C extension, cpyext needs to allocate a struct which contains all these fields, recursively, only to delete them when the call returns, even when the C code does not actually use them.

Even if cpyext can be made a bit faster, the issues above won't disappear if we want to support all the semantics implied by the CPython API. And believe me, all the features we implemented are needed by one extension or another.
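Amaury's first point can be illustrated in plain Python. The sketch below (the name `pylist_get_item` is mine; cpyext's real implementation is RPython and differs in detail) shows what a "concrete" API function has to do when the list layout is not fixed: fetch `__getitem__` from the class and make an ordinary call, rather than read a field out of a C struct.

```python
# Illustration only: what the text says PyList_GET_ITEM effectively becomes
# on PyPy -- a call through the abstract interface, not a struct-field read.

def pylist_get_item(obj, index):
    # Fetch __getitem__ from the class (mirroring "fetched from the class
    # dictionary" in the explanation above), then call it like any Python
    # method. No assumption is made about the object's memory layout.
    getitem = type(obj).__getitem__
    return getitem(obj, index)

items = ["a", "b", "c"]
assert pylist_get_item(items, 1) == "b"
```

The same dispatch works for any sequence type, which is exactly why it is correct for PyPy's multiple list implementations, and also why it is much slower than a C macro.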
I'd say that cpyext is quite mature: it provides all the infrastructure to support almost all extension modules, and went much farther than we initially expected. But I think it has gone as far as possible given the differences between CPython and PyPy.

There is a solution though, which is also a nice project: since "cffi" is the preferred way to access C code from PyPy, you could instead write a version of boost::python (maybe renamed boost::python_cffi) that uses cffi primitives to implement all the boost functions: class_(), def(), and so on.

I started on this idea some time ago, and I was able to support the "hello world" example of boost::python -- this one: http://www.boost.org/doc/libs/1_57_0/libs/python/doc/tutorial/doc/html/index...

I need to find the code I wrote so I can share it (around 250 lines); basically it's a rewrite of boost::python, but using a slightly different C API (to use Python features from C++), and a completely different way to manage memory (similar to JNI: there are Local and Global References <http://www.science.uva.nl/ict/ossdocs/java/tutorial/native1.1/implementing/r...>, and ffi.new_handle() to create references from objects). This method is much more friendly to PyPy and its JIT (mostly because references don't need to be memory addresses!)

Or maybe you'll find that boost::python is too complex to reimplement correctly (because it's boost), and you will decide to use the C API defined above directly. I remember there are functions like Object_SetAttrString and PyString_FromString, and it's easy to add new ones. Of course this requires rewriting all your bindings from scratch, but since all the code will be in Python (with snippets of C++), you will find that there are better ways than C++ templates to generate code from regular patterns.

I haven't yet seen any serious module that uses cffi to interface with C++, so any progress in this direction would be awesome.
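To make the handle idea above concrete, here is a minimal pure-Python model of the scheme Amaury references (cffi's ffi.new_handle()/ffi.from_handle(), JNI-style references). The HandleTable class and its method names are my illustration, patterned on cffi's API, not cffi's actual implementation: the key point is that C code holds an opaque token rather than a raw memory address, so a moving GC and the JIT are unconstrained.

```python
# Sketch (assumed names, not cffi internals) of a handle table: C code
# stores only the integer token, never a pointer into the Python heap.

import itertools

class HandleTable:
    def __init__(self):
        self._objects = {}               # token -> Python object
        self._counter = itertools.count(1)

    def new_handle(self, obj):
        # Hand out a stable token; only the token crosses into C/C++.
        handle = next(self._counter)
        self._objects[handle] = obj
        return handle

    def from_handle(self, handle):
        # Recover the object on re-entry; no fixed address is ever needed.
        return self._objects[handle]

    def release(self, handle):
        # Explicit release, like deleting a JNI Global Reference.
        del self._objects[handle]

table = HandleTable()
h = table.new_handle({"name": "example"})
assert table.from_handle(h) == {"name": "example"}
table.release(h)
```

Compare this with cpyext's scheme described earlier in the thread: there, the PyObject address itself is the identity, which forces fixed allocations and an expensive mapping at every boundary crossing.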
2014-11-28 20:13 GMT+01:00 Toby St Clere Smithe <mail@tsmithe.net>:
-- Amaury Forgeot d'Arc

Maciej Fijalkowski <fijall@gmail.com> writes:
Decoupling cppyy from the CERN ROOT infrastructure sounds like a very worthy goal. Does that sound exciting to you?
That does sound worthwhile -- and probably more viable than Amaury's project (sorry, Amaury!). I've actually put out enquiries to the CERN people about a very similar idea, just relating to PyCling -- which is a more general cousin of cppyy, from what I can tell -- and so perhaps I could combine your expertise with theirs.

I believe Wim is on this list; I've sent him and the CERN guys an e-mail this evening, but would like to hear back from them before expending much effort thinking about cppyy (since it is all so inter-related). Obviously, my preference would be to work on a project that would help both the CPython and PyPy worlds.

Best,

Toby

--
Toby St Clere Smithe
http://tsmithe.net

Maciej Fijalkowski <fijall@gmail.com> writes:
feel free to come to IRC and discuss it btw
Great -- I will pop in when I know more! Toby
-- Toby St Clere Smithe http://tsmithe.net

Toby,
I'll just quickly answer here first, and get into more detail later in the private e-mail (although that won't be for today anymore).

We actually have cppyy on CPython with Cling as a backend. The nice thing about a C++ interpreter is that you can do things like:

$ python
Python 2.7.7 (default, Jun 20 2014, 13:47:02) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
using my private settings ...
Clearly, it then also allows one to build a 'cppffi' as Armin has asked for. The catch is that there is a boatload of refactoring to be done. The heavy lifting in the above module is in libCore, libCling, and libPyROOT, for example, which are all part of ROOT. (cppyy in PyPy is properly factored.)

When we refer to 'PyCling', we mean the above, but refactored. To first order, that can be done by stripping all the ROOT bits out of PyROOT, but it would be better if it utilised the same backend as cppyy in PyPy does. (You can also use the AST directly, in theory, leaving only clang/llvm as a dependency, but we tried that and it doesn't work. I can give you all the gory details.)

There is more fun to be had than that, though. E.g. cppffi, as already mentioned. But beyond that, fully automatically generated bindings only get you 95% of the way. Yes, you get everything bound, but it smells like C++ and is sometimes clunky. Pythonizations get you to 99%; e.g. the above session can be continued like so:
because the PyROOT code recognizes the begin()/end() iterator paradigm. Smart, reflection-based pythonizations are a project in themselves. Then, getting to 100% requires some proper hooks for the programmer to fine-tune behavior; although PyROOT has some of that, it's rather ad hoc (e.g. settings for memory ownership and GIL handling), and I've never taken the time to think it through, so that could be another fun project.

As said, I'll get to the other e-mail tomorrow.

Best regards,
Wim

--
WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net
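The begin()/end() pythonization Wim describes can be sketched in a few lines. Everything below is a stand-in: `FakeVector` and its iterator protocol (`deref()`, `next()`) model a generated binding, and `pythonize_iteration` models the hook; PyROOT/cppyy's real machinery differs in detail.

```python
# Hypothetical pythonization: if a bound class exposes the C++-style
# begin()/end() pair, synthesise a Python __iter__ so for-loops work.

def pythonize_iteration(cls):
    if hasattr(cls, "begin") and hasattr(cls, "end") and not hasattr(cls, "__iter__"):
        def __iter__(self):
            it, end = self.begin(), self.end()
            while it != end:
                yield it.deref()      # assumed binding-level iterator API
                it = it.next()
        cls.__iter__ = __iter__
    return cls

# Stand-ins for an automatically generated C++ binding:
class _Iter:
    def __init__(self, data, pos):
        self.data, self.pos = data, pos
    def __eq__(self, other):
        return self.pos == other.pos
    def deref(self):
        return self.data[self.pos]
    def next(self):
        return _Iter(self.data, self.pos + 1)

class FakeVector:
    def __init__(self, data):
        self._data = list(data)
    def begin(self):
        return _Iter(self._data, 0)
    def end(self):
        return _Iter(self._data, len(self._data))

pythonize_iteration(FakeVector)
assert list(FakeVector([1, 2, 3])) == [1, 2, 3]
```

A reflection-based version would detect the begin()/end() pair from the C++ class description rather than with hasattr(), but the shape of the hook is the same.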

Dear Wim,

wlavrijsen@lbl.gov writes:
I'll just quickly answer here first, get into more detail later on the private e-mail (although that won't be for today anymore).
Sure. My comments here will also be mostly general (regarding what you've written below). I suspected that most of what you write is the case, so thanks for clarifying -- in particular regarding cppyy/PyCling/refactoring (this becomes fairly clear once you play with the various parts!). I also agree that it is very interesting, and I've also been wondering about the automated Pythonization bits; I agree that the automated bindings just give you something clunky.

From my own point of view, I would again be very keen to work on something like that -- not only because it sounds rather fun, but also because it would save me a lot of work on PyViennaCL in future! I'm quite interested in the gory details and the fine-tuning, and think I could make quite a good GSoC proposal about this. It's also convenient that there is scope to work on it under the aegis of both PyPy and CERN, since that maximises the chances of organisation acceptance (not that it seems likely to me that either would be rejected).

I look forward to your e-mail tomorrow!

Best,

Toby
-- Toby St Clere Smithe http://tsmithe.net
