your thoughts on low level optimizations

Dear all,

Following today's blog post [1] about wrapping C++ libraries, I would like to take the opportunity to get some opinions from the PyPy community on a related topic. I hope this mailing list is the appropriate place for this sort of discussion.

My name is Gertjan van Zwieten and I have been a Python enthusiast for a long time. I used it extensively in the context of my PhD project, which was related to Finite Element modelling, and I have recently taken steps together with a friend of mine towards continuing on this path commercially [2]. We intend to design our own Finite Element framework, which we aim to be open source, written largely in Python. It is the 'largely' that I would like to discuss here.

I have always argued that, yes, it is possible to do computationally intensive things in an interpreted language, as long as the majority of the work is done by compiled components. For me the obvious example is Numpy, which provides a rather specific data type and a set of vectorized operations, as well as an interface to optimized linear algebra libraries. This leaves Python as a convenient and very powerful glue language to connect these optimized components into something purposeful. With that conceptualization in mind, my preferred route for optimizing code was to identify critical components of a generic nature and re-implement them as a plain C module using the Python API.

So this is how I used to look at things, and I would say that things were nice and clear this way, until PyPy started throwing stones in the water. I have not so far used PyPy in any actual application, mainly (if not only) because of the lack of Numpy support. But I can see it holds great promise, especially for the computational community (I am also eagerly following the transactional memory developments), and I anticipate making the switch after it matures a little bit further. However, that does put me in the situation that, currently, I am no longer sure what to implement in C, if any, and in what form.

Right now I am prototyping our framework using only Numpy and pure Python. I avoid premature optimization as much as possible, and in fact at times find myself sacrificing speed for elegance, arguing that I will be able to bring back efficiency later on in a deeper layer. For example, rather than adopting a numbering scheme for the many computational elements, as is common in Finite Element implementations, I have moved to a more object-oriented approach that is more flexible and allows for more error checking, but that forces me to put in dictionaries what used to be Numpy arrays. Obviously dictionaries were not meant to be numeric data structures, and this is a typical component that I would eventually try to implement efficiently in C.

With PyPy on the horizon, I am not so sure anymore. For one, I'm not sure if PyPy will ever be able to use the C module, or use it efficiently; I could understand if support exists merely for the sake of compatibility. I am also not certain if I should perhaps forget about C altogether and rely on the JIT to compile the Python for loops that I have always tried to avoid. Would you go as far as to say that there will be no more reason for low-level programming whatsoever? Or would you advise writing the component in RPython and using the translator to compile it? With my poor overview of these things, there are very few arguments that I can formulate in favour of or against any of these options.
Regarding today's blog post, I have the feeling that this is meant more for wrapping existing C++ libraries than for starting new ones, is that correct? Or if not, and it is, in fact, an option to consider, will this be able to work in CPython, too? That would make the transition a bit easier, obviously. I am very interested in what your views are on this topic of optimizations.

Best regards, and thanks a lot for working on PyPy!

Gertjan

[1] http://morepypy.blogspot.com/2011/08/wrapping-c-libraries-with-reflection.ht...
[2] http://hvzengineering.nl

Hi Gertjan,
With PyPy on the horizon, I am not so sure anymore. For one, I'm not sure if PyPy will ever be able to use the C module, or use it efficiently
It's only the crossing into and out of a C extension module through cpyext that is less than optimal. If that crossing does not happen often, the penalty of the cross-language call overhead does not matter. Another option you have is to write an "extension" module in Python or RPython that just calls into the C code through ctypes or libffi. That will be efficient. Just keep the C module clean and simple: clear memory ownership rules, no use of CPython internals, callbacks only on the Python side, etc. That way, any extra glue code is simple to write and maintain.
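As a concrete sketch of that second option (the library name and C signature below are hypothetical, purely for illustration), a pure-Python ctypes wrapper around a clean C API could look roughly like this:

    import ctypes

    # Load the hypothetical shared library built from the plain C module.
    _lib = ctypes.CDLL("./libfem.so")

    # Declare the C signature of: double dot(const double* a, const double* b, int n);
    _lib.dot.restype = ctypes.c_double
    _lib.dot.argtypes = [ctypes.POINTER(ctypes.c_double),
                         ctypes.POINTER(ctypes.c_double),
                         ctypes.c_int]

    def dot(xs, ys):
        # Thin Python-side glue: build C arrays from the sequences and call into C.
        n = len(xs)
        ArrayN = ctypes.c_double * n
        return _lib.dot(ArrayN(*xs), ArrayN(*ys), n)

On PyPy the JIT can remove most of the per-call overhead of such a wrapper once it has warmed up; the heavy lifting stays inside the C library either way.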
Whatever works best; they all have their pros and cons. With C, you can share between PyPy and CPython, and true low-level coding is by far the easiest in it. RPython allows you to provide hints to the JIT, which may make a big difference in performance, but it isn't modular: RPython components become part of pypy-c after translation. Programming in Python is the cleanest and shares easily, but performance behaviour may be surprising, especially in tight loops. I realize that isn't an answer, but I don't think there is a single answer that is valid in all cases. Your original approach is most likely still the best: stick to Python until you find a performance problem, and only then look for and evaluate your other options.
Regarding today's blog post, I have the feeling that this is meant more for wrapping existing C++ libraries than for starting new ones, is that correct?
Well, of course you can still write a new C++ library and wrap that too. And in the process make it Python-bindings-friendly. :) But since your original code was C, ctypes is likely easier to use since you can code in the needed reflection info (you can also run Reflex over the C code and use that to generate the Python-side annotations in a build process).
Or if not, and it is, in fact, an option to consider, will this be able to work in CPython, too?
The code is based on an equivalent that exists for CPython. However, that is not a standalone package, but part of ROOT (http://root.cern.ch). It could be standalone, but I never saw the need: if you do not already have reflection information libraries ("dictionaries") as we did for I/O, then SWIG or an equivalent tool works just as well and is already available. Best regards, Wim P.S. I noticed on your web site that you're working on a project for Oce. I used to work there in times gone by. :) -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Wim,

Thanks for the quick reply, this is very helpful information and in some ways surprising. Let me just try to confirm that I got all this correctly so that I am sure to draw the right conclusions.

First of all, to clarify, I understand that the overhead of calling into C is not such a big deal if the time spent inside that call is orders of magnitude longer. For instance, components like iterative linear solvers would be of that kind, where the majority of work is done inside a single call. But if I needed to implement a more numpy-array-like data type, then I suppose the overhead of the many connected data manipulation calls would be a concern.

I was not actually aware that ctypes is considered that efficient. Does this apply to CPython as well? I always assumed that going by the Python API would be the most direct, least overhead interface possible. If ctypes provides an equally efficient interface for both CPython and PyPy then that is certainly something I would consider using. By the way, you mention ctypes *or* libffi as if they are two distinct options, but I believe ctypes was built on top of libffi. Is it then possible, and is there reason, to use libffi directly?

Perhaps too generic, but just to fire away all my questions for anyone to comment on: what would be the recommended way to raise exceptions when going through ctypes; special return values, or is there maybe a function call that can be intercepted? It's one of those things where I see advantages in using the Python API (even though that is also based on simply returning NULL, but with the additional option of setting an exception state; an intercepted function call would be *much* nicer, actually). #perfectworld

Back on topic, it surprised me, too, that RPython components are not modular. Do I understand correctly that this means that, after making modifications to the component, the entire PyPy interpreter needs to be rebuilt? Considering the time involved that sounds like a big drawback, although of course during development the same module could be left untranslated. Are there plans to allow for independently translated modules? Or is this somehow fundamentally impossible? I must also admit that it is still not entirely clear to me what the precise differences are between translated and non-translated code, as in both situations the JIT compiler appears to be active. (Right? After all, RPython is still dynamically typed.) Is there a good text that explains these PyPy fundamentals a little bit more entry-level than the RPython Toolchain [1] reference?

Lastly, you mention SWIG or equivalent (Boost?) as alternative options. But don't these tools generate Python API code, and thus (in PyPy) rely on cpyext? This 2008 sprint discussion [2] loosely suggests that there will be no direct PyPy-ish implementation of these tools, and instead argues for Reflex, leading to this week's post. So I think if anything I should consider that.

Again, if I demonstrate any misconceptions please do correct me. I am not necessarily bound to existing code, so I could decide to make the switch from C to C++, but I would do so only if it offers clear advantages. If Reflex offers a one-to-one translation of C++ classes to Python then that certainly sounds useful, but unless it is something that I could not equally achieve by manual ctypes annotations, I think I would prefer to keep things under manual control and keep the C library entirely independent.
My feelings are that that approach is the most future-proof, which is my primary concern before efficiency. Overall, not many direct questions, but I hope to be corrected if any of my assertions are false, and of course I would still like to learn additional arguments for or against possible approaches for low-level optimization.

Thanks,
Gertjan

[1] http://codespeak.net/pypy/dist/pypy/doc/translation.html
[2] http://morepypy.blogspot.com/2008/10/sprint-discussions-c-library-bindings.h...

PS @Wim, that's interesting. People tend to be a bit confused when I tell them I went from earthquake research to printer ink. Now I can explain that printer ink is just one step away from high energy particle physics.

Hi Gertjan, On Thu, Sep 1, 2011 at 1:59 PM, Gertjan van Zwieten <gertjanvanzwieten@gmail.com> wrote:
The meta-answer first: the problem is that it's still not completely clear to us which approach is the best. They all have benefits and drawbacks so far...
I was not actually aware that ctypes is considered that efficient. Does this apply to CPython as well?
No, that's the first messy part: ctypes code is very efficient on top of PyPy, at least after the JIT has kicked in. It is not fast on top of CPython.
I always assumed that going by the Python API would be the most direct, least overhead interface possible.
By this you probably mean the "CPython API"... The difference is important. The C-level API that you're talking about is really CPython's. PyPy can emulate it with the cpyext module, but this emulation is slow. Moreover, if you want to compare it with ctypes, the PyPy JIT gets ctypes *faster* than the CPython C API can ever be on top of CPython, because the latter needs to explicitly wrap and unwrap the Python objects.
The ctypes way to do things is to design the C library with a "normal" C API, usable from other C programs. From that point of view the correct thing is to return error codes, and to check them in pure Python, after the call to the ctypes function.
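For illustration only (the function name and error codes below are hypothetical, not an API from this thread), that convention maps onto Python exceptions in a thin wrapper like this:

    import ctypes

    _lib = ctypes.CDLL("./libfem.so")  # hypothetical C library with a plain C API

    # Hypothetical C function: int fem_solve(double* x, int n);
    # returns 0 on success, a nonzero error code on failure.
    _lib.fem_solve.restype = ctypes.c_int
    _lib.fem_solve.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int]

    class FemError(Exception):
        pass

    def solve(values):
        buf = (ctypes.c_double * len(values))(*values)
        err = _lib.fem_solve(buf, len(values))
        if err != 0:
            # The error code is checked in pure Python and turned into an exception here.
            raise FemError("fem_solve failed with error code %d" % err)
        return list(buf)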
Do I understand correctly that this means that, after making modifications to the component, the entire PyPy interpreter needs to be rebuilt?
Yes. You should only build RPython modules if you have a specific reason to. One example is the numpy module: we want to build it in a special way so that the JIT can look inside and perform delayed computations "in bulk".
Considering the time involved that sounds like a big drawback
This is certainly a drawback, but it's not as big as it first seems. The RPython module simply has to be well-tested as normal Python code first. Once it is complete and tested, we translate it. It usually takes a few attempts to fix the typing issues, but once that's done, it usually works as expected (provided the tests were good in the first place).
Are there plans to allow for independently translated modules? Or is this somehow fundamentally impossible?
This is a question that comes back regularly. We don't have any plans, but there have been attempts. They have been mostly unsuccessful, however. From our point of view we can live with the drawback, as it is actually not that big, and we don't generally recommend writing RPython modules for everything.
No, precisely, RPython is not dynamically typed. It is also valid Python, and as such, it can be run non-translated; but at the same time, if it's valid RPython, then it can be translated together with the rest of the interpreter, and we get a statically-typed version of this RPython code turned into C code. This translation process works by assuming (and to a large extent, checking) that the RPython code is statically typed, or at least "statically typeable"...
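A tiny illustration (my own example, not from the thread) of what "statically typeable" means in practice: both functions below are plain Python, but only the first is acceptable RPython, because in the second the variable x would need two different static types:

    def ok(n):
        x = 0              # x is always an integer
        for i in range(n):
            x += i
        return x

    def not_rpython(n):
        if n > 0:
            x = n          # here x is an integer...
        else:
            x = "negative" # ...but here it is a string; the translator rejects this mix
        return x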
Is there a good text that explains these PyPy fundamentals a little bit more entry-level than the RPython Toolchain [1] reference?
The architecture overview is oldish but still up-to-date: http://doc.pypy.org/en/latest/architecture.html
Lastly, you mention SWIG or equivalent (Boost?) as alternative options.
These are not really supported so far. It may be that some SWIG modules turn into C code that can be loaded by cpyext, but that doesn't work for Cython, for example. The case of Cython is instructive: Romain Guillebert is working right now on a way to take a Cython module and emit, not C code for the CPython API, but Python code using ctypes. This would give a way to "compile" any Cython module to plain Python that works on both PyPy and CPython (but which is only fast on PyPy). We haven't thought very deeply about SWIG so far. Reflex is another solution that is likely to work very nicely if you can rewrite your C module as a C++ module and use the Reflex-provided Python API extracted from the C++ module. Again, it's unclear if it's the "best" path, but it's definitely one path.
My feelings are that that approach is the most future-proof, which is my primary concern before efficiency.
I would say that in this case, keeping your module in C with a C-friendly API is the most future-proof solution I can think of. That means, so far --- with today's tools --- that you need to wrap it twice, as a CPython C extension module and as a pure Python ctypes module, in order to get good performance on both CPython and PyPy. We hope to be able to provide better answers in the future, like "wrap it with Cython and generate the two interfaces for CPython and PyPy automatically". A bientôt, Armin.
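One way to follow that "wrap it twice" advice without burdening users (the module names here are hypothetical) is to pick the appropriate wrapper at import time:

    # mylib/__init__.py -- hypothetical package layout
    import platform

    if platform.python_implementation() == 'PyPy':
        # Pure-Python ctypes wrapper around the C library; the PyPy JIT makes this fast.
        from mylib._ctypes_wrapper import solve, dot
    else:
        # Compiled extension module written against the CPython C API.
        from mylib._capi import solve, dot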

On 2011-09-01, at 1:03 PM, Armin Rigo wrote:
Will it be possible at some point to write modules for pypy in RPython without the need to rebuild the entire interpreter? This way, for instance, we could write an import hook to compile *.rpy files on demand to simplify distribution. -Yury

Hi Yury, On Thu, Sep 1, 2011 at 8:26 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Will it be possible at some point to write modules for pypy in RPython without the need to rebuild the entire interpreter?
I've added an answer to this Frequently Asked Question to https://bitbucket.org/pypy/pypy/raw/default/pypy/doc/faq.rst . Armin

Hi Armin
Thanks, that's a very helpful conclusion and actually a perfectly workable solution for now. I will keep a close eye on the blog for future developments in this area, and I certainly hope that I will be able to make the switch soon. Let me just tie this up by thanking you all for your extensive and very helpful replies. It's been a very enlightening discussion. Much obliged, Gertjan

Hi, On Thu, 1 Sep 2011, Armin Rigo wrote:
For most practical purposes, rewriting C -> C++ for wrapping with Reflex would be a simple matter of:

    $ cat mycppheader.h
    extern "C" {
    #include "mycheader.h"
    }

But using Reflex for C is overkill, given that no reflection information is strictly needed there. What can also be done is to generate the reflection info as part of the build process, and use it to generate annotations for ctypes. Then put those in a pickle file and ship that.

On Thu, 1 Sep 2011, Gertjan van Zwieten wrote:
By the way, you mention ctypes *or* libffi as if they are two distinct options, but I believe ctypes was built on top of libffi.
Yes, but what I meant in that sentence was the pair Python+ctypes and the pair RPython+libffi. Both are efficient, as Armin already explained, because once the JIT is warmed up, no wrapping/unwrapping is needed anymore.
Lastly, you mention SWIG or equivalent (Boost?) as alternative options.
I mentioned those on the CPython side as reasons why I've never chosen to make the Reflex-based (or CINT-based, rather) bindings available as a standalone package. They take the same amount of work if the reflection information has not been generated yet (in our applications, the reflection info is already there for the I/O, so the end-user does not need to deal with it, as they would have to if the choice had fallen on SWIG). I think a part of the discussion that is missing is who the target of the various tools is and who ends up using the product: if I'm an end-user, installing binary Python extension modules from the package manager that comes with my OS, then cpyext is probably my best friend. But if I'm the developer of an extension module, like you are, I would not rely on it, and would instead provide a solution that works well on both and runs on all Pythons, whether that means using ctypes or writing custom code.
There's a 2010 post in between, from when the work was started: http://morepypy.blogspot.com/2010/07/cern-sprint-report-wrapping-c-libraries... Work is progressing as time allows and there are some nice results, but it's not production quality yet. Getting there, though, as the list of available features shows. However, every time I throw it at a large class library (large meaning thousands of classes), there's always something new to tackle.
That depends on your C++ classes. E.g. for calculations of offsets between a derived class and virtual base classes, some form of reflection information is absolutely needed. Best regards, Wim
Ah. :) Actually, Oce was a detour. A fun one where I learned a lot to be sure, but I did start out in HEP and astrophysics originally. -- WLavrijsen@lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

Hi Gertjan,

Let me clarify what I got from your question: does it make sense to write performance-sensitive code in C, or would PyPy optimize loops well enough? If you want to use only PyPy, you can quite easily use numpy arrays to get C-like performance. Indeed, Hakan Ardo was able to run his video processing routines (using array.array instead of numpy.array, but that's not relevant) at almost C speed [1], and we'll get there at some point in the not-so-distant future. Also, numpy vector operations are already faster on PyPy than on CPython (by stacking multiple operations in one go), and we're planning to implement SSE in the not-so-distant future. All of this is, however, only if you plan to use PyPy; these kinds of solutions don't work on CPython at all.

[1] http://morepypy.blogspot.com/2011/07/realtime-image-processing-in-python.htm...

I hope that answers your questions. Cheers, fijal
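As a small illustration of the kind of loop in question (my own example, not from the thread): the explicit Python for loop below is exactly what the PyPy JIT aims to make fast, while the vectorized form expresses the same computation as a few bulk array operations that can be stacked together:

    import numpy as np

    def residual_loop(a, b):
        # Explicit element-wise loop: relies on the JIT (PyPy) to run at C-like speed.
        out = np.empty(len(a))
        for i in range(len(a)):
            out[i] = (a[i] - b[i]) ** 2
        return out

    def residual_vectorized(a, b):
        # The same computation expressed as bulk array operations.
        return (a - b) ** 2

    a = np.linspace(0.0, 1.0, 1000000)
    b = np.linspace(1.0, 0.0, 1000000)
    assert np.allclose(residual_loop(a, b), residual_vectorized(a, b))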
