[pypy-dev] your thoughts on low level optimizations

Thu Sep 1 13:59:14 CEST 2011

Hi Wim,

Thanks for the quick reply, this is very helpful information and in some
ways surprising. Let me just try to confirm that I got all this correctly so
that I am sure to draw the right conclusions.

First of all, to clarify, I understand that the overhead of calling into C
is not such a big deal if indeed the time spent in that call is orders of
magnitude longer. For instance, components like iterative linear solvers
would be of that kind, where the majority of work is done inside a single
call. But if I would need to implement a more numpy-array-like data type
then I suppose the overhead of the connected data manipulation calls is an
issue of concern.
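
To make the concern about many small calls concrete, here is a rough sketch that copies the same buffer once in bulk and once element by element through ctypes. It uses only `ctypes.memmove`, which wraps the C `memmove`; the array size and the helper names `copy_bulk` and `copy_elementwise` are my own choices for illustration, and the timings will differ per interpreter and machine.

```python
import ctypes
import time

N = 20000  # arbitrary size chosen for illustration
DoubleArray = ctypes.c_double * N
src = DoubleArray(*range(N))
dst_bulk = DoubleArray()
dst_small = DoubleArray()
ITEM = ctypes.sizeof(ctypes.c_double)


def copy_bulk():
    # One foreign call does all the work: call overhead is paid once.
    ctypes.memmove(dst_bulk, src, N * ITEM)


def copy_elementwise():
    # N foreign calls, each moving one double: call overhead is paid N times.
    base_dst = ctypes.addressof(dst_small)
    base_src = ctypes.addressof(src)
    for i in range(N):
        ctypes.memmove(base_dst + i * ITEM, base_src + i * ITEM, ITEM)


for fn in (copy_bulk, copy_elementwise):
    start = time.perf_counter()
    fn()
    print(fn.__name__, time.perf_counter() - start)
```

Both variants produce the same data; only the number of boundary crossings differs, which is the cost I am asking about for an array-like type.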

I was not actually aware that ctypes is considered that efficient. Does this
apply to CPython as well? I always assumed that going through the Python C
API would be the most direct, lowest-overhead interface possible. If ctypes
provides an equally efficient interface for both CPython and PyPy then that
is certainly something I would consider using. By the way, you mention
ctypes *or* libffi as if they are two distinct options, but I believe ctypes
was built on top of libffi. Is it then possible, and is there reason, to use
libffi directly?

Perhaps too generic, but just to fire away all my questions for anyone to
comment on: what would be the recommended way to raise exceptions going
through ctypes; special return values, or is there maybe a function call
that can be intercepted? It's one of those things where I see advantages in
using the Python API (even though that is also based on simply returning
NULL, but then with the additional option of setting an exception state; an
intercepted function call would be *much* nicer, actually). #perfectworld
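
For what it is worth, ctypes does document an `errcheck` attribute on foreign functions, which comes close to the "special return value turned into an exception" idea. The sketch below is only an illustration: `solver_step`, `SolverError` and the status convention (0 for success) are hypothetical, and the C routine is simulated with a Python callback wrapped by `CFUNCTYPE` so that it runs without a compiled library.

```python
import ctypes

# Hypothetical C signature: int solver_step(double x); returns 0 on success.
PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_double)


def _fake_solver_step(x):
    # Stand-in for a real C function; negative input is treated as an error.
    return 0 if x >= 0.0 else -1


solver_step = PROTO(_fake_solver_step)


class SolverError(Exception):
    """Raised when the (hypothetical) C routine reports a nonzero status."""


def check_status(result, func, arguments):
    # ctypes calls this after every call with the converted return value,
    # the function object, and the tuple of arguments that were passed.
    if result != 0:
        raise SolverError("solver_step failed with status %d" % result)
    return result


solver_step.errcheck = check_status

print(solver_step(1.0))
try:
    solver_step(-1.0)
except SolverError as exc:
    print("caught:", exc)
```

With a real library the same `errcheck` hook would be assigned to the function object obtained from `ctypes.CDLL`, so the C side only has to return a status code.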

Back on topic, it surprised me, too, that RPython components are not
modular. Do I understand correctly that this means that, after making
modifications to the component, the entire PyPy interpreter needs to be
rebuilt? Considering the time involved that sounds like a big drawback,
although of course during development the same module could be left
untranslated. Are there plans to allow for independently translated modules?
Or is this somehow fundamentally impossible?

I must also admit that it is still not entirely clear to me what the precise
differences are between translated and non-translated code, as in both
situations the JIT compiler appears to be active. (Right? After all RPython
is still dynamically typed). Is there a good text that explains these PyPy
fundamentals a little bit more entry-level than the RPython Toolchain [1]
reference?

Lastly, you mention SWIG or equivalent (Boost?) as alternative options. But
don't these tools generate Python API code, and thus (in PyPy) rely on
cpyext? This 2008 sprint discussion [2] loosely suggests that there will be
no direct PyPy-ish implementation of these tools, and instead argues for
reflex, leading to this week's post. So I think if anything I should
consider that. Again, if I demonstrate any misconceptions please do correct
me.

I am not necessarily bound to existing code so I could decide to make the
switch from C to C++, but I would do so only if it offers clear advantages.
If reflex offers a one-to-one translation of C++ classes to Python then that
certainly sounds useful, but unless it is something that I could not equally
achieve by manual ctypes annotations I think I would prefer to keep things
under manual control, and keep the C library entirely independent. My
feeling is that this approach is the most future-proof, which matters more
to me than efficiency.

Overall, not many direct questions, but I hope to be corrected if any of my
assertions are false, and of course I would still like to learn additional
arguments for or against possible approaches for low level optimization.

Thanks

Gertjan

[1] http://codespeak.net/pypy/dist/pypy/doc/translation.html
[2] http://morepypy.blogspot.com/2008/10/sprint-discussions-c-library-bindings.html

PS @Wim, that's interesting. People tend to be a bit confused when I tell
them I went from earthquake research to printer ink. Now I can explain that
printer ink is just one step away from high energy particle physics.