
Hi Armin,

Armin Rigo, 26.02.2012 11:09:
> On Sun, Feb 26, 2012 at 09:50, Stefan Behnel wrote:
>> Looking at Py_DecRef(), however, left me somewhat baffled. I would have expected this to be the most intensively tuned function in all of cpyext, but it even started with this comment: (...)
> Indeed, it's an obvious starting place if we want to optimize cpyext (which has not happened at all so far). You are welcome to try.
Looks like it's worth it.
> Note that the JIT has nothing to do here: we cannot JIT any code written in C
... obviously - and there's C compilers for that anyway.
> and it makes no sense to apply the JIT on a short RPython callback alone.
I can't see why that would be so. Just looking at Py_DecRef(), I can see lots of places where cross-function runtime optimisations would make sense, for example. I'm sure a C compiler will have a hard time finding the right fast path in there.
> But because most of the code in module/cpyext/ is RPython code, it means it gets turned into equivalent C code statically
Interesting. Given PyPy's reputation of taking tons of resources to build, I assume you apply WPA to the sources in order to map them to C? Then why wouldn't I get better traces from gdb and valgrind for the generated code? Is it just that the nightly builds lack debugging symbols?
> The first thing to try would be to rethink how the PyPy object and the PyObject are linked together. Right now it's done with two (possibly weak) dictionaries, one for each direction. We can at least improve the situation by having a normal field in the PyObject pointing back to the PyPy object. This needs to be done carefully but can be done.
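To make the difference concrete, here is a minimal sketch in plain Python (not RPython, and not cpyext's actual code). All class and function names are hypothetical, and ordinary dictionaries stand in for the "possibly weak" ones mentioned above. It only illustrates why a back-pointer field turns a hash lookup into a field read.

```python
# Hypothetical model; none of these names come from cpyext itself.

class W_Object:
    """Stand-in for an interpreter-level PyPy object."""
    def __init__(self, value):
        self.value = value


class PyObjectStruct:
    """Stand-in for the C-level PyObject structure."""
    def __init__(self):
        self.ob_refcnt = 1
        self.w_obj = None  # the proposed back-pointer field


# Scheme described as current: one dictionary per direction.
# (Plain dicts here; the real ones may be weak.)
py_to_w = {}  # id(PyObjectStruct) -> W_Object
w_to_py = {}  # id(W_Object) -> PyObjectStruct


def link_with_dicts(w_obj):
    py_obj = PyObjectStruct()
    py_to_w[id(py_obj)] = w_obj
    w_to_py[id(w_obj)] = py_obj  # also keeps py_obj alive, so its id stays valid
    return py_obj


def from_ref_dict(py_obj):
    # Every PyObject -> PyPy object translation costs a hash lookup.
    return py_to_w[id(py_obj)]


# Proposed scheme: a normal field pointing back to the PyPy object.
def link_with_field(w_obj):
    py_obj = PyObjectStruct()
    py_obj.w_obj = w_obj
    return py_obj


def from_ref_field(py_obj):
    # The translation becomes a single field read.
    return py_obj.w_obj
```

The sketch ignores the hard part, namely that the GC has to know about the new field.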
Based on my experience with lxml, such a change is absolutely worth any hassle.
> The issue is that the GC needs to know about this field. It would probably require something like: allocate some GcArrays of PyObject structures (not pointers, directly PyObjects --- which have all the same size here, so it works). Use something like 100 PyObject structures per GcArray, and collect all the GcArrays in a global list. Use a freelist for dead entries. If you allocate each GcArray as "non-movable", then you can take pointers to the PyObjects and pass them to C code. As they are inside the regular GcArrays, they are GC-tracked and can contain a field that points back to the PyPy object.
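The allocation scheme described above could be sketched roughly as follows. This is plain Python rather than RPython, and every name in it (PyObjectSlot, PyObjectPool, CHUNK_SIZE) is hypothetical. A Python list of slot objects stands in for a GcArray of inline PyObject structures, and the "non-movable" property is not modelled at all; only the chunking, the global chunk list, the freelist and the back-pointer field are shown.

```python
CHUNK_SIZE = 100  # "something like 100 PyObject structures per GcArray"


class PyObjectSlot:
    """Stand-in for one fixed-size PyObject structure inside a chunk."""
    def __init__(self):
        self.ob_refcnt = 0
        self.w_obj = None  # back-pointer to the PyPy object


class PyObjectPool:
    def __init__(self):
        self.chunks = []    # global list of chunks (models the list of GcArrays)
        self.freelist = []  # slots that are unused or dead

    def _grow(self):
        # Allocate a whole chunk at once and put every slot on the freelist.
        chunk = [PyObjectSlot() for _ in range(CHUNK_SIZE)]
        self.chunks.append(chunk)
        self.freelist.extend(reversed(chunk))  # hand out slots in chunk order

    def allocate(self, w_obj):
        if not self.freelist:
            self._grow()
        slot = self.freelist.pop()
        slot.ob_refcnt = 1
        slot.w_obj = w_obj
        return slot  # in the real design: a raw pointer handed to C code

    def decref(self, slot):
        slot.ob_refcnt -= 1
        if slot.ob_refcnt == 0:
            slot.w_obj = None           # drop the reference to the PyPy object
            self.freelist.append(slot)  # the dead entry becomes reusable
```

For example, after `pool = PyObjectPool()` and `a = pool.allocate(w)`, calling `pool.decref(a)` returns the slot to the freelist, and the next `allocate()` reuses that same slot instead of growing the pool.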
Sounds like a good idea to me.

Stefan