[pypy-dev] Py_DecRef() in cpyext

Stefan Behnel stefan_ml at behnel.de
Sun Feb 26 12:31:26 CET 2012


Hi Armin,

Armin Rigo, 26.02.2012 11:09:
> On Sun, Feb 26, 2012 at 09:50, Stefan Behnel wrote:
>> Looking at Py_DecRef(), however, left me somewhat baffled. I would have
>> expected this to be the most intensively tuned function in all of cpyext,
>> but it even started with this comment: (...)
> 
> Indeed, it's an obvious starting place if we want to optimize cpyext
> (which did not occur at all so far).  You are welcome to try.

Looks like it's worth it.


> Note
> that the JIT has nothing to do here: we cannot JIT any code written in
> C

... obviously - and there are C compilers for that anyway.


> and it makes no sense to apply the JIT on a short RPython callback
> alone.

I can't see why that would be so. Just looking at Py_DecRef(), I can see
lots of places where cross-function runtime optimisations would make sense,
for example. I'm sure a C compiler will have a hard time finding the right
fast path in there.


> But because most of the code in module/cpyext/ is RPython
> code, it means it gets turned into equivalent C code statically

Interesting. Given PyPy's reputation for taking tons of resources to build,
I assume you apply whole-program analysis (WPA) to the sources in order to
map them to C? Then why wouldn't I get better traces from gdb and valgrind
for the generated code? Is it just that the nightly builds lack debugging
symbols?


> The first thing to try would be to rethink how the PyPy object and the
> PyObject are linked together.  Right now it's done with two (possibly
> weak) dictionaries, one for each direction.  We can at least improve
> the situation by having a normal field in the PyObject pointing back
> to the PyPy object.  This needs to be done carefully but can be done.

Based on my experience with lxml, such a change is absolutely worth any hassle.
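
To illustrate the difference between the two linking schemes, here is a
rough plain-Python sketch. It is not cpyext code: PyObjectStub,
DictLinkedRegistry, the pypy_link field and the helper functions are all
made-up names, and the real thing would live in RPython/C.

```python
class PyObjectStub:
    """Stand-in for a C-level PyObject (hypothetical, illustration only)."""

    def __init__(self, refcnt=1):
        self.ob_refcnt = refcnt
        self.pypy_link = None  # assumed back-pointer field, name is made up


class DictLinkedRegistry:
    """Scheme as described in the quote: one dictionary per direction."""

    def __init__(self):
        self.py_to_w = {}  # id(PyObject stub) -> PyPy object
        self.w_to_py = {}  # id(PyPy object)   -> PyObject stub

    def link(self, pyobj, w_obj):
        self.py_to_w[id(pyobj)] = w_obj
        self.w_to_py[id(w_obj)] = pyobj

    def from_ref(self, pyobj):
        # A hash lookup every time we go from the PyObject to the PyPy object.
        return self.py_to_w[id(pyobj)]


def link_with_field(pyobj, w_obj):
    """Proposed scheme: store the PyPy object directly in the PyObject."""
    pyobj.pypy_link = w_obj


def from_ref_with_field(pyobj):
    # A plain field read, no dictionary involved.
    return pyobj.pypy_link
```

With the field, getting from the PyObject to the PyPy object is a single
load instead of a dictionary lookup, which is the kind of saving I'd expect
to matter on a path as hot as Py_DecRef().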


> The issue is that the GC needs to know about this field.  It would
> probably require something like: allocate some GcArrays of PyObject
> structures (not pointers, directly PyObjects --- which have all the
> same size here, so it works).  Use something like 100 PyObject
> structures per GcArray, and collect all the GcArrays in a global list.
> Use a freelist for dead entries.  If you allocate each GcArray as
> "non-movable", then you can take pointers to the PyObjects and pass
> them to C code.  As they are inside the regular GcArrays, they are
> GC-tracked and can contain a field that points back to the PyPy
> object.

Sounds like a good idea to me.
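
Just to check that I understand the scheme, here is how I would model it in
plain Python. This is only a sketch under my own assumptions: Slot,
PyObjectPool, allocate() and decref() are hypothetical names, a Python list
stands in for a non-movable GcArray, and nothing here reflects actual cpyext
or RPython APIs.

```python
CHUNK_SIZE = 100  # "something like 100 PyObject structures per GcArray"


class Slot:
    """Stand-in for one PyObject structure stored inline in a chunk."""
    __slots__ = ("ob_refcnt", "w_obj")

    def __init__(self):
        self.ob_refcnt = 0
        self.w_obj = None  # back-pointer to the PyPy object


class PyObjectPool:
    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = []    # global list of chunks (models the GcArrays)
        self.freelist = []  # dead entries available for reuse

    def _grow(self):
        # Allocate a whole chunk at once and put all of its slots on the
        # freelist, reversed so that slots are handed out in chunk order.
        chunk = [Slot() for _ in range(self.chunk_size)]
        self.chunks.append(chunk)
        self.freelist.extend(reversed(chunk))

    def allocate(self, w_obj):
        if not self.freelist:
            self._grow()
        slot = self.freelist.pop()
        slot.ob_refcnt = 1
        slot.w_obj = w_obj
        return slot

    def decref(self, slot):
        slot.ob_refcnt -= 1
        if slot.ob_refcnt == 0:
            # Drop the back-reference so the PyPy object can be collected,
            # then recycle the slot instead of freeing it.
            slot.w_obj = None
            self.freelist.append(slot)
```

In this model, a slot whose refcount drops to zero goes back on the freelist
and is the next one handed out, so the number of chunks only grows when all
existing slots are live.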

Stefan


