[pypy-dev] Pypy garbage collection

Martin Koch mak at issuu.com
Tue Mar 18 11:55:36 CET 2014


Thanks, Carl

I think the part of the mail thread with the timing measurements is also
relevant for the bug. Those measurements show that the time goes into many
gc-collect-steps rather than one single major gc, and including them means
nobody has to rediscover this when someone gets the time to look at the bug :)

That is, the mail with lines like these:

*Totals*: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30207
*Max*: gc-minor:8 gc-collect-step:245
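
For whoever picks up the bug, here is a minimal sketch of how summary lines
in this shape could be turned into numbers for comparison between runs. The
function name parse_gc_summary is made up for illustration, and the sketch
only assumes the "label: name:value name:value" layout shown above; it says
nothing about what units the values are in.

```python
import re


def parse_gc_summary(line):
    """Split a line such as '*Totals*: gc-minor:380 slow:1' into
    ('Totals', {'gc-minor': 380, 'slow': 1}).

    Hypothetical helper: it assumes the label comes before the first
    colon and that every counter is written as name:integer.
    """
    label, _, rest = line.partition(":")
    # The asterisks around the label are just emphasis markers.
    label = label.strip().strip("*")
    counters = {}
    for name, value in re.findall(r"([\w-]+):(\d+)", rest):
        counters[name] = int(value)
    return label, counters


if __name__ == "__main__":
    for text in ("*Totals*: gc-minor:380 slow:1 gc-minor-walkroots:0 "
                 "gc-collect-step:30207",
                 "*Max*: gc-minor:8 gc-collect-step:245"):
        print(parse_gc_summary(text))
```

With the counters in a dict it is easy to check, for example, whether
gc-collect-step dominates gc-minor in a given run.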

It might also be relevant to include info on the command line to reproduce
the problem:

pypy mem.py 10000000
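
In case it helps with reproducing, below is a rough sketch of the kind of
workload and timing loop discussed in this thread. It is NOT mem.py; the
names measure_pauses, keep and batch are invented for illustration. The
assumption is that objects held in a sliding window live long enough to
leave the nursery and then become garbage, which is the pattern suspected
of causing the stalls. Whether they really survive a minor collection
depends on the nursery size.

```python
import time
from collections import deque


def measure_pauses(iterations, keep=1000, batch=100):
    """Return the wall-clock duration (seconds) of each iteration.

    Hypothetical workload, not the mem.py from the bug report: each
    iteration allocates a batch of small lists and keeps it in a sliding
    window of `keep` batches, so older batches become garbage only after
    having stayed alive for a while.
    """
    window = deque(maxlen=keep)  # the oldest batch is dropped automatically
    durations = []
    for i in range(iterations):
        start = time.time()
        window.append([[i, j] for j in range(batch)])
        durations.append(time.time() - start)
    return durations


if __name__ == "__main__":
    d = measure_pauses(100000)
    print("iterations: %d" % len(d))
    print("max iteration time: %.6f s" % max(d))
    print("mean iteration time: %.6f s" % (sum(d) / len(d)))
```

A large gap between the maximum and the mean iteration time would point to
occasional long stalls rather than uniformly slow allocation.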


Thanks,
/Martin


On Tue, Mar 18, 2014 at 11:23 AM, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:

> Agreed, somehow this should not happen.
>
> Anyway, I'm not the person to look into this, but I filed a bug, so at
> least your example code does not get lost:
>
> https://bugs.pypy.org/issue1710
>
> Cheers,
>
> Carl Friedrich
>
>
> Martin Koch <mak at issuu.com> wrote:
>>
>> Thanks, Carl.
>>
>> This bit of code certainly exhibits the surprising property that some
>> runs unpredictably stall for a very long time. Further, it seems that this
>> stall time can be made arbitrarily large by increasing the number of nodes
>> generated (== more data in the old generation == more stuff to traverse if
>> lots of garbage is generated and survives the young generation?). As a user
>> of an incremental garbage collector, I would expect that there are pauses
>> due to GC, but that these are predictable and small.
>>
>> I tried running
>>
>> PYPY_GC_NURSERY=2000M pypy ./mem.py 10000000
>>
>> but that seemed to have no effect.
>>
>> I'm looking forward to the results of the Software Transactional Memory,
>> btw :)
>>
>> /Martin
>>
>>
>> On Tue, Mar 18, 2014 at 9:47 AM, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:
>>
>>> On 17/03/14 20:04, Martin Koch wrote:
>>> > Well, it would appear that we have the problem because we're generating
>>> > a lot of garbage in the young generation, just like we're doing in the
>>> > example we've been studying here.
>>>
>>> No, I think it's because you're generating a lot of garbage in the *old*
>>> generation. Meaning objects which survive one minor collection but then
>>> die.
>>>
>>> > I'm unsure how we can avoid that in
>>> > our real implementation. Can we force gc of the young generation? Either
>>> > by gc.collect() or implicitly somehow (does the gc e.g. kick in across
>>> > function calls?).
>>>
>>> That would make matters worse, because increasing the frequency of
>>> minor collects means *more* objects get moved to the old generation
>>> (where they cause problems). So indeed, maybe in your case making the
>>> new generation bigger might help. This can be done using
>>> PYPY_GC_NURSERY, I think (nursery is the space reserved for young
>>> objects). The risk is that minor collections become unreasonably slow.
>>>
>>> Anyway, if the example code you gave us also shows the problem I think
>>> we should eventually look into it. It's not really fair to say "but
>>> you're allocating too much!" to explain why the GC takes a lot of time.
>>>
>>> Cheers,
>>>
>>> Carl Friedrich
>>> _______________________________________________
>>> pypy-dev mailing list
>>> pypy-dev at python.org
>>> https://mail.python.org/mailman/listinfo/pypy-dev
>>>
>>
>>
>