[pypy-dev] cpyext performance

Stefan Behnel stefan_ml at behnel.de
Thu Jul 5 14:35:29 CEST 2012


Maciej Fijalkowski, 05.07.2012 11:01:
> On Thu, Jul 5, 2012 at 10:26 AM, Amaury Forgeot d'Arc wrote:
>> 2012/7/5 Stefan Behnel
>>> Back to that question then:
>>>
>>>> Is there a way to get readable debugging symbols in a translated PyPy
>>>> that would tell me what is being executed?
>>
>> I fear that pypy standard distribution calls "strip" on the resulting
>> binary.
>> You could translate pypy yourself, I'm quite sure it contains debug info
>> already and it's quite easy to call "make debug" anyway.
>
> Default build (not the distribution or nightly, you have to translate
> yourself) contains debug info.

Ah, yes. Given a suitably large machine and enough time to do other stuff,
that did the trick for me. Here's the result:

http://cython.org/callgrind-pypy-nbody.png

As you can see, make_ref() and Py_DecRef() combined account for almost 80%
of the runtime. So the expected gain from optimising the PyObject handling
is *huge*.
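
To put a rough number on that, here is a back-of-the-envelope sketch using
Amdahl's law. The 0.8 is just the "almost 80%" read off the graph, the helper
name is made up for illustration, and the estimate assumes the remaining
runtime is unaffected by any change to the PyObject handling:

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup if `fraction` of the runtime becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Assumption: make_ref() + Py_DecRef() take ~80% of the runtime (from the graph).
REFCOUNT_SHARE = 0.8

for factor in (2, 5, 10):
    overall = amdahl_speedup(REFCOUNT_SHARE, factor)
    print("ref handling %2dx faster -> overall %.2fx" % (factor, overall))
```

Under these assumptions, even making the ref handling 10x faster gives roughly
3.6x overall, and the theoretical ceiling is 5x, since the other ~20% stays
as it is.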

The first thing that pops up from the graph is the different calls through
generic_cpy_call(). That looks way too generic for something as
performance-critical as Py_DecRef().

Also, what's that recursive "stackwalk()" thing doing?

Stefan


