[pypy-dev] cpyext performance

Stefan Behnel stefan_ml at behnel.de
Thu Aug 23 11:25:21 CEST 2012


Maciej Fijalkowski, 23.08.2012 11:17:
> On Thu, Aug 23, 2012 at 11:11 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> Maciej Fijalkowski, 05.07.2012 14:50:
>>> On Thu, Jul 5, 2012 at 2:35 PM, Stefan Behnel wrote:
>>>> Maciej Fijalkowski, 05.07.2012 11:01:
>>>>> On Thu, Jul 5, 2012 at 10:26 AM, Amaury Forgeot d'Arc wrote:
>>>>>> 2012/7/5 Stefan Behnel
>>>>>>> Back to that question then:
>>>>>>>
>>>>>>>> Is there a way to get readable debugging symbols in a translated PyPy
>>>>>>>> that would tell me what is being executed?
>>>>>>
>>>>>> I fear that pypy standard distribution calls "strip" on the resulting
>>>>>> binary.
>>>>>> You could translate pypy yourself, I'm quite sure it contains debug info
>>>>>> already and it's quite easy to call "make debug" anyway.
>>>>>
>>>>> Default build (not the distribution or nightly, you have to translate
>>>>> yourself) contains debug info.
>>>>
>>>> Ah, yes. Given a suitably large machine and enough time to do other stuff,
>>>> that did the trick for me. Here's the result:
>>>>
>>>> http://cython.org/callgrind-pypy-nbody.png
>>>>
>>>> As you can see, make_ref() and Py_DecRef() combined account for almost 80%
>>>> of the runtime. So the expected gain from optimising the PyObject handling
>>>> is *huge*.
>>>>
>>>> The first thing that pops up from the graph is the different calls through
>>>> generic_cpy_call(). That looks way too generic for something as performance
>>>> critical as Py_DecRef().
>>>>
>>>> Also, what's that recursive "stackwalk()" thing doing?
>>>
>>> Haha :)
>>>
>>> This is related to garbage collection - it scans the stack for GC pointers
>>> to save them I think. We might think that it's a bit too much to do it for
>>> every single call there.
>>
>> So, what's the plan for doing something about it?
> 
> I took a look at it and it seems it's not what I thought it is.
> 
> It's just an intermediate call that saves stack roots and then calls
> the actual cpyext. I don't think the call itself is harmful, it just
> happens to be on the callstack (always)

Ah, ok - good to know. Then I think our next best bet is to cache the
PyObject structs for PyPy objects using a weak-key dict. That will fix two
problems at the same time:

1) prune excessive create-decref-dealloc cycles

2) keep the PyObject pointer valid as long as the PyPy object is alive,
thus preventing crashes for code that expects an object reference in a list
(for example) to be enough to keep the C object representation alive.
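
To illustrate the caching idea, here is a minimal sketch in plain Python, not
the actual cpyext code. It assumes the C struct can be modelled by a small
stand-in object; the names PyObjectCache, CStructStandIn and Body are
hypothetical, and make_ref() only borrows its name from the function in the
profile. The struct must not hold a strong reference back to its owner,
otherwise the weak key would never be dropped.

```python
import weakref


class CStructStandIn:
    """Hypothetical stand-in for the C-level PyObject struct."""

    def __init__(self, owner_id):
        # Store only an integer id, never the owner itself: a strong
        # back-reference would keep the weak key alive forever.
        self.owner_id = owner_id


class PyObjectCache:
    """Maps interpreter-level objects to their cached struct stand-ins."""

    def __init__(self):
        # Entries disappear automatically once the key object is collected.
        self._cache = weakref.WeakKeyDictionary()
        self.allocations = 0

    def make_ref(self, obj):
        """Return the cached struct for obj, creating it on first use."""
        struct = self._cache.get(obj)
        if struct is None:
            struct = CStructStandIn(id(obj))
            self._cache[obj] = struct
            self.allocations += 1
        return struct

    def __len__(self):
        return len(self._cache)


class Body:
    """Example interpreter-level object (hashable and weak-referenceable)."""


cache = PyObjectCache()
body = Body()
first = cache.make_ref(body)
second = cache.make_ref(body)  # no second allocation, same struct
```

Repeated make_ref() calls for the same object return the same struct, which
covers point 1, and the struct stays in the cache for as long as the object
is alive, which covers point 2. Note that WeakKeyDictionary looks keys up by
hash and equality, so a real implementation would need identity-based lookup
for objects that override __eq__.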

Stefan



