[pypy-dev] Help Understanding Memory Consumption

Robert Whitcher robert.whitcher at rubrik.com
Thu Aug 22 07:34:11 EDT 2019


I shared the test file with Carl directly (to avoid posting it to everyone's inbox).
The PyPy version is currently 6.0 (I am not able to effect a change here
unless I can prove something).

On Thu, Aug 22, 2019 at 1:00 AM Carl Friedrich Bolz-Tereick <cfbolz at gmx.de>
wrote:

> Hi Rob,
>
> Which version of PyPy are you running this with? I have a long running
> branch that I really should merge someday that is supposed to help with
> memory consumption of json deserialization. Is there a chance you could
> share an (anonymized) version of your test file?
>
> Alternatively, you could try a nightly build from this branch yourself:
>
> http://buildbot.pypy.org/nightly/json-decoder-maps-py3.6/
>
> Carl Friedrich
>
> On August 22, 2019 1:02:42 AM GMT+02:00, Robert Whitcher <
> robert.whitcher at rubrik.com> wrote:
>>
>> Hi,
>> I am running a very simple test case (as we are hitting OOM on our larger
>> PyPy deployments) and I'd love some help understanding what is happening
>> here.
>> We have a lot of processes that send messages to each other.
>> These can be large JSON serializations of objects.
>> But the memory being consumed seems out of proportion and is hard to manage
>> across processes.
>>
>> I have this loop running:
>>
>> import time
>> import json
>>
>> def main():
>>     with open("/tmp/test12334.1234", "r") as f:
>>         json_msg = f.read()
>>
>>     while True:
>>         j = json.loads(json_msg)
>>         time.sleep(10)
>>
>> if __name__ == "__main__":
>>     main()
>>
>>
>> I have tried three separate runs on both PyPy and CPython.
>> The first does nothing but sleep (it neither reads the file nor parses the
>> JSON).
>> The second only reads json_msg from the file.
>> The third is the full loop.
>>
>> If I run this in CPython I get 80MB, 92MB and 136MB respectively.
>> This makes sense, as the file is an 11MB JSON serialization of a dictionary
>> and json.loads takes up some memory.
>>
>> However, if I run this in PyPy I get 120MB, 153MB and between 360MB and
>> 405MB once it settles.
>> I understand the JIT and startup memory being higher, and spending a little
>> more on loading the string, but parsing the string with json.loads consumes
>> far more than I expected.
>>
>> Multiplied across processes, that memory adds up quickly.
>>
>> What easy things am I missing?
>>
>> Thanks,
>> Rob
>>
>>
>>
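
For anyone who wants to reproduce the three runs above, here is a minimal
sketch that reports memory from inside the process. It is an assumption about
how one might measure this, not how the numbers quoted above were obtained.
It uses the standard-library resource module (Unix-like systems only) and
reports peak RSS, which can be higher than the "settled" figure seen in top
or ps. The names peak_rss_mb and run and the mode strings are hypothetical.

```python
import json
import resource
import sys
import time


def peak_rss_mb():
    """Return the peak resident set size of this process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def run(mode, path, iterations=3, delay=0.0):
    """Run one of the three cases: 'sleep', 'read', or 'loads'."""
    json_msg = None
    if mode in ("read", "loads"):
        # Cases two and three read the file into a string.
        with open(path, "r") as f:
            json_msg = f.read()
    for _ in range(iterations):
        if mode == "loads":
            # Only case three parses the string.
            j = json.loads(json_msg)
        time.sleep(delay)
    return peak_rss_mb()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: pypy measure.py loads /tmp/test12334.1234
    mode, path = sys.argv[1], sys.argv[2]
    print("%s: peak RSS %.1f MB" % (mode, run(mode, path, delay=10)))
```

Running the same script once per mode under both interpreters should give
figures that can be compared directly, since each run starts in a fresh
process.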

-- 
Robert Whitcher
Member of Technical Staff at Rubrik