Hello there,

I would like to discuss a proposal regarding one aspect which AFAIK is currently missing from CPython's test suite: the ability to detect memory leaks in functions implemented in C extension modules.

In psutil I use a test class/framework which calls a function many times, and fails if the process memory increased after doing so. I do this in order to quickly detect missing free() or Py_DECREF calls in the C code, but I suppose there may be other use cases. Here's the class:

https://github.com/giampaolo/psutil/blob/913d4b1d6dcce88dea6ef9382b93883a04a66cd7/psutil/tests/__init__.py#L901

Detecting a memory leak is no easy task, because process memory fluctuates. Sometimes it may increase (or even decrease!) even if there's no leak, I suppose because of how the OS handles memory, Python's garbage collector, the fact that RSS is an approximation, and who knows what else. In order to compensate for fluctuations I did the following: in case of failure (mem > 0 after calling fun() N times) I retry the test up to 5 times, increasing N (repetitions) each time, and I consider the test a failure only if the memory keeps increasing across all runs. So for instance, here's a legitimate failure:

psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_disk_partitions ...
Run #1: extra-mem=696.0K, per-call=3.5K, calls=200
Run #2: extra-mem=1.4M, per-call=3.5K, calls=400
Run #3: extra-mem=2.1M, per-call=3.5K, calls=600
Run #4: extra-mem=2.7M, per-call=3.5K, calls=800
Run #5: extra-mem=3.4M, per-call=3.5K, calls=1000
FAIL

If, on the other hand, the memory increased on one run (say 200 calls) but grew by less on the next run (say 400 calls), then it's clearly a false positive: the extra memory may still be > 0 on the second run, but if it's lower than on the previous run, which had fewer repetitions, it cannot possibly represent a leak (just a fluctuation):

psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_net_connections ...

Run #1: extra-mem=568.0K, per-call=2.8K, calls=200
Run #2: extra-mem=24.0K, per-call=61.4B, calls=400
OK
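
To make the retry logic concrete, here is a minimal sketch of it. This is not the actual psutil test class: the names leaked_bytes(), check_leak() and the get_mem parameter are made up for illustration, and treating "same growth as the previous run" as a fluctuation is my own assumption.

```python
import gc


def leaked_bytes(fun, get_mem, times):
    """Call fun() `times` times and return the memory growth in bytes.

    get_mem is a hypothetical callable returning the current process
    memory in bytes; it is injected so the logic can be tested.
    """
    gc.collect()
    before = get_mem()
    for _ in range(times):
        fun()
    gc.collect()
    return get_mem() - before


def check_leak(fun, get_mem, times=200, retries=5):
    """Return True only if memory keeps growing across all runs."""
    prev = 0
    for run in range(1, retries + 1):
        calls = times * run  # 200, 400, 600, ... as in the output above
        diff = leaked_bytes(fun, get_mem, calls)
        print(f"Run #{run}: extra-mem={diff}B, calls={calls}")
        if diff <= prev:
            # No growth, or less growth than the previous (shorter) run:
            # assumed to be a fluctuation, not a leak.
            return False
        prev = diff
    return True
```

In real use get_mem would read the process memory, for example something like lambda: psutil.Process().memory_full_info().uss, and the test would fail when check_leak() returns True.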

This is the best I could come up with as a simple leak detection mechanism to integrate with CI services, and keep more advanced tools like Valgrind out of the picture (I just wanted to know if there's a leak, not to debug the leak itself). In addition, since psutil is able to get the number of fds (UNIX) and handles (Windows) opened by a process, I also run a separate set of tests to make sure I didn't forget to call close(2) or CloseHandle() in C.
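
The fd/handle check is simpler, since those counters don't fluctuate the way memory does. A rough sketch, assuming a hypothetical helper unclosed_fds() and an injected count_fds callable (neither is the real psutil test code):

```python
def unclosed_fds(fun, count_fds, times=10):
    """Call fun() `times` times and return how many fds were left open.

    count_fds is a hypothetical callable returning the number of file
    descriptors (or handles) currently opened by the process.
    """
    before = count_fds()
    for _ in range(times):
        fun()
    return count_fds() - before
```

With psutil, count_fds could be psutil.Process().num_fds on UNIX or psutil.Process().num_handles on Windows, and the test would fail on any result greater than 0.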

Would something like this make sense to have in CPython? Here's a quick PoC I put together just to show what this would look like in practice:

https://github.com/giampaolo/cpython/pull/2/files

Proper API coverage would be a huge amount of work (testing all C modules), and ideally it should also include cases where functions raise an exception when fed improper input. The biggest blocker here is, of course, psutil, since it's a third-party dependency, but before getting to that I wanted to see how this idea is perceived in general.

Cheers,

Giampaolo - http://grodola.blogspot.com