Tried deleting the pymongo._cmessage module? It’s a cpyext one, so seems like a good place to start. I think it works without it.
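[For anyone trying this, a minimal sketch of how to check whether the C extensions are actually in play. It assumes a standard pymongo install; has_c() is pymongo's own report of whether its compiled modules imported successfully:

import os
import bson
import pymongo

# pymongo falls back to pure-Python implementations when its compiled
# modules (_cmessage for pymongo, _cbson for bson) fail to import.
print(pymongo.has_c())  # True -> the cpyext _cmessage module is in use
print(bson.has_c())     # True -> the C _cbson module is in use

# Locate the install directory to find the _cmessage*.so to remove or
# rename when testing the pure-Python fallback:
print(os.path.dirname(pymongo.__file__))

]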
Thanks Antonio...
Relative to cpyext... I am not sure. We are not using it directly, but who knows what is being used by the modules we pull in (pymongo, etc.). I am trying to create a stripped-down test case. I filed a bug with some updates: it reproduces in PyPy, and in PyPy without the JIT, but it *does not* reproduce in CPython, so that is some indication. Once I have a stripped-down test case that is not tied to our entire codebase, I will publish it in the bug if I can, and try other PyPy versions as well.

Rob

On Thu, Mar 28, 2019 at 10:56 AM Antonio Cuni <anto.cuni@gmail.com> wrote:

Hi Robert,
are you using any package which relies on cpyext? I.e., modules written in C and/or with Cython (cffi is fine).
IIRC, at the moment PyPy doesn't detect GC cycles which involve cpyext objects. So if you have a cycle which goes, e.g.,

    Py_foo -> C_bar -> Py_foo

(where Py_foo is a pure-Python object and C_bar is a cpyext object), the objects will never be collected unless you break the cycle manually. (A minimal sketch of this cycle shape appears after the quoted thread below.)

Other than that: have you tried running it with PyPy 7.0 and/or 7.1?

On Thu, Mar 28, 2019 at 8:35 AM Robert Whitcher <robert.whitcher@rubrik.com> wrote:

So I have a process that uses PyPy and pymongo in a loop. It does basically the same thing every iteration: query a table via pymongo, do a few calculations that are not saved anywhere, then wait and loop again. The RSS of the process continually increases (PYPY_GC_MAX is set pretty high).

So I hooked in the GC stats output per http://doc.pypy.org/en/latest/gc_info.html, and I also ensure that gc.collect() is called at least every 3 minutes.

What I see is that the memory, while high, stays fairly constant for a long time:

2019-03-27 00:04:10.033-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144244736
...
2019-03-27 01:01:46.841-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144420864
2019-03-27 01:02:36.943-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144269312

Then it decides (and the exact per-loop behavior is the same each time) to chew up much more memory:

2019-03-27 01:04:17.184-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 145469440
2019-03-27 01:05:07.305-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 158175232
2019-03-27 01:05:57.401-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 173191168
2019-03-27 01:06:47.490-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 196943872
2019-03-27 01:07:37.575-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 205406208
2019-03-27 01:08:27.659-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 254562304
2019-03-27 01:09:17.770-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 256020480
2019-03-27 01:10:07.866-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 289779712

That's about 140 MB of growth in six minutes.
Where is all that memory going? What's more, the PyPy GC stats do not show anything different.

Here are the GC stats from GC-Complete when we were at 144MB:

2019-03-26 23:55:49.127-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 140632064
2019-03-26 23:55:49.133-0600 [-] main_thread(29621)log (async_worker_process.py:308): DBG0:
Total memory consumed:
    GC used:            56.8MB (peak: 69.6MB)
       in arenas:       39.3MB
       rawmalloced:     14.5MB
       nursery:         3.0MB
    raw assembler used: 521.6kB
    -----------------------------
    Total:              57.4MB

Total memory allocated:
    GC allocated:            63.0MB (peak: 71.2MB)
       in arenas:            43.9MB
       rawmalloced:          22.7MB
       nursery:              3.0MB
    raw assembler allocated: 1.0MB
    -----------------------------
    Total:                   64.0MB

And here are the GC stats from GC-Complete when we are at 285MB:

2019-03-27 01:42:41.751-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 285147136
2019-03-27 01:42:41.751-0600 [-] main_thread(29621)log (async_worker_process.py:308): DBG0:
Total memory consumed:
    GC used:            57.5MB (peak: 69.6MB)
       in arenas:       39.9MB
       rawmalloced:     14.6MB
       nursery:         3.0MB
    raw assembler used: 1.5MB
    -----------------------------
    Total:              58.9MB

Total memory allocated:
    GC allocated:            63.1MB (peak: 71.2MB)
       in arenas:            43.9MB
       rawmalloced:          22.7MB
       nursery:              3.0MB
    raw assembler allocated: 2.0MB
    -----------------------------
    Total:                   65.1MB

How is this possible?

I am measuring RSS with:

import os
import psutil

def get_rss_mem_usage():
    '''Get the RSS memory usage in bytes.

    @return: memory size in bytes; -1 if an error occurs
    '''
    try:
        process = psutil.Process(os.getpid())
        # get_memory_info() is the old psutil name; newer psutil
        # releases call this memory_info()
        return process.get_memory_info().rss
    except Exception:
        return -1

and cross-referencing with "ps -orss -p <pid>"; the RSS values reported are correct.

I cannot figure out where to go from here, as it appears that PyPy is leaking this memory somehow, and I have no idea how to proceed. I end up having memory problems and getting Memory Warnings for a process that just loops and queries via pymongo.
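[For reference, a sketch of the kind of driver loop that could produce the log excerpts above, reusing the get_rss_mem_usage() from the quoted code. The loop body, run_one_iteration(), and the 50-second interval are hypothetical; gc.get_stats() is the PyPy-only API from the gc_info page linked above, whose repr prints the same "GC used / in arenas / rawmalloced / nursery" breakdown:

import gc
import time

def log_memory():
    print('INFO_FLUSH: RSS: %d' % get_rss_mem_usage())
    if hasattr(gc, 'get_stats'):   # PyPy only
        print(gc.get_stats())      # prints the breakdown shown above

while True:
    run_one_iteration()   # hypothetical: the pymongo query + calculations
    gc.collect()          # called at least every 3 minutes, per the text
    log_memory()
    time.sleep(50)

]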
The pymongo version is 3.7.1.

This is:
Python 2.7.13 (ab0b9caf307db6592905a80b8faffd69b39005b8, Apr 30 2018, 08:21:35)
[PyPy 6.0.0 with GCC 7.2.0]
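[As referenced in Antonio's message above, a minimal sketch of the Py_foo -> C_bar -> Py_foo cycle shape. Here c_bar is a plain dict purely for illustration; in the problem case it would be a cpyext-backed object from a C extension:

import gc

class PyFoo(object):
    pass

py_foo = PyFoo()
c_bar = {'owner': py_foo}   # stands in for C_bar -> Py_foo
py_foo.bar = c_bar          # Py_foo -> C_bar

# CPython's cycle detector reclaims this cycle. If c_bar were a real
# cpyext object on PyPy, the cycle might never be detected, so it has
# to be broken by hand before dropping the references:
py_foo.bar = None           # break Py_foo -> C_bar
gc.collect()

]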
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev