[pypy-dev] Pypy garbage collection

Martin Koch mak at issuu.com
Wed Mar 12 23:06:19 CET 2014


Hi List

I'm running a server (written in python, executed with pypy) that holds a
large graph (55GB, millions of nodes and edges) in memory and responds to
queries by traversing the graph. The graph is mutated a few times a second,
and there are hundreds of read-only requests a second.

My problem is that I have no control over garbage collection. Thus, a major GC
might kick in while serving a query, and with this amount of data, the GC
takes around 2 minutes. I have tried mitigating this by guessing when a GC
might be due, and proactively starting the garbage collector while not
serving a request (this is ok, as duplicate servers will respond to
requests while this one is collecting).
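To make the mitigation concrete, here is a minimal sketch of the idea, not
my actual server code. The class name IdleCollector, the time-based guess
for when a collection is "due", and the serving_request flag are all
illustrative assumptions; only gc.collect() is a real call.

```python
import gc
import time


class IdleCollector(object):
    """Runs a full collection between requests once one is guessed to be due.

    Hypothetical helper: the "due" heuristic here is just elapsed time,
    which is an assumption made for illustration.
    """

    def __init__(self, min_interval_s, clock=time.time, collect=gc.collect):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._collect = collect
        self._last_collect = clock()

    def maybe_collect(self, serving_request):
        """Call from the server loop; returns True if a collection was run."""
        if serving_request:
            # Never start a collection in the middle of a query.
            return False
        if self._clock() - self._last_collect < self.min_interval_s:
            # Too soon since the last collection; nothing to do yet.
            return False
        self._collect()
        # Measure the interval from the end of the collection, since the
        # collection itself can take minutes on a heap this large.
        self._last_collect = self._clock()
        return True
```

The server loop would call maybe_collect(False) whenever it is idle. This
only moves the pause to a more convenient moment; it does not shorten it,
and a major GC can still start on its own before the guess fires.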

What I would really like is to be able to disable garbage collection for
the old generation. This is because the graph is fairly static, and I can
live with leaking memory from the relatively few and small mutations that
occur. Any queries are only likely to generate objects in the new
generation, and it is fine to collect these. Also, by design, the process
is periodically restarted in order to re-synchronize it with an
authoritative source (thus rebuilding the graph from scratch), so slight
leakage is not an issue here.

I have tried experimenting with setting environment variables
<http://pypy.readthedocs.org/en/latest/gc_info.html> as well as the 'gc'
module, but nothing seems to give me what I want.
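For illustration, the kind of setting I mean looks roughly like the sketch
below. The variable names PYPY_GC_MAJOR_COLLECT and PYPY_GC_MIN are the
ones I understand the page above to describe; the values, the helper name
gc_tuned_env and the script name server.py are placeholders, not my real
configuration.

```python
import os


def gc_tuned_env(base_env=None, major_collect="4.0", gc_min="60GB"):
    """Return a copy of the environment with PyPy GC settings added.

    Hypothetical helper. The values are placeholders chosen for
    illustration, not recommended or tested settings.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Growth factor of used memory that triggers the next major collection.
    env["PYPY_GC_MAJOR_COLLECT"] = major_collect
    # No major collection while the heap is below this size.
    env["PYPY_GC_MIN"] = gc_min
    return env


# Usage (server.py is a placeholder name):
#   import subprocess
#   subprocess.call(["pypy", "server.py"], env=gc_tuned_env())
```

Settings like these can only postpone a major collection; as far as I can
tell, none of them disables collection of the old generation.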

If disabling gc for certain generations is not possible, it would be nice
to be able to get a hint when a major collection is about to occur, so I
can stop serving requests.

I'm using the following pypy version:
Python 2.7.3 (2.2.1+dfsg-1, Jan 24 2014, 10:12:37)
[PyPy 2.2.1 with GCC 4.6.3] on linux2

An additional question: pypy 2.2.1 should have incremental GC; shouldn't
that avoid long pauses due to garbage collection?

Thanks,
/Martin Koch - Senior Systems Architect - issuu.com