[pypy-dev] Help Understanding Memory Consumption

Robert Whitcher robert.whitcher at rubrik.com
Wed Aug 21 19:02:42 EDT 2019


Hi,
I am running a very simple test case (we are hitting OOMs on our larger
PyPy deployments) and I'd love some help understanding what is happening
here.
We have many processes that send messages to each other; these can be
large JSON serializations of objects. The memory they consume seems out
of proportion and is hard to manage across processes.

I have this loop running:

import time
import json

def main():
    # read the ~11MB JSON payload once
    with open("/tmp/test12334.1234", "r") as f:
        json_msg = f.read()

    # re-parse the same payload every 10 seconds
    while True:
        j = json.loads(json_msg)
        time.sleep(10)

if __name__ == "__main__":
    main()


I have tried three variants of this across both PyPy and CPython
(resident memory sampled as sketched below):
The first does nothing but sleep (it neither reads the file nor parses the JSON).
The second only reads json_msg from the file.
The third is the full loop above.
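
(For concreteness, resident memory can be sampled from inside the process
like this on Linux, where VmRSS in /proc/self/status is reported in kB;
rss_mb is just an illustrative helper, not necessarily how the numbers
below were produced:)

def rss_mb():
    # current resident set size of this process, in MB (Linux only)
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024.0  # kB -> MB
    return 0.0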

If I run this in CPython I get 80MB, 92MB and 136MB respectively.
That makes sense: the file is an 11MB JSON serialization of a dictionary,
and json.loads needs some working memory on top of it.

However, if I run this in PyPy I get 120MB, 153MB, and between 360MB and
405MB once it settles. I understand the JIT and startup memory being
higher, and holding the string costing a little more, but WOW does
json.loads chew up a lot on top of that.
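
One experiment worth a try (a minimal sketch that modifies the loop in
main() above and reuses the illustrative rss_mb helper): force the GC
after each parse and re-sample RSS, to see whether the extra couple
hundred MB is garbage the collector just hasn't reclaimed yet:

import gc

while True:
    j = json.loads(json_msg)
    peak = rss_mb()
    del j
    gc.collect()  # explicitly run the collector (works on PyPy and CPython)
    print("RSS before/after collect: %.1f / %.1f MB" % (peak, rss_mb()))
    time.sleep(10)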

Multiplied across all of our processes, that extra memory adds up fast.

What easy things am I missing?

Thanks,
Rob

