tl;dr Do you use "perf.py --track_memory"? If yes, for which purpose?
Are you using it on Windows or Linux?
I'm working on the CPython benchmark suite. It has a --track_memory
command-line option to measure peak memory usage. A main process
runs worker processes and tracks their memory usage.
On Linux, the main process reads the "private data" from
/proc/pid/smaps of a worker process. It uses a busy-loop: it reads
/proc/pid/smaps as fast as possible (with no sleep)!
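To make the Linux side concrete, here is a minimal sketch of how "private data" could be computed from an smaps dump. It assumes that "private data" means the sum of the Private_Clean and Private_Dirty fields of every mapping; the function names are hypothetical and this is not necessarily what perf.py does.

```python
PRIVATE_FIELDS = {"Private_Clean", "Private_Dirty"}


def private_kib(smaps_text):
    """Sum the Private_Clean and Private_Dirty fields (in kB) of an smaps dump."""
    total = 0
    for line in smaps_text.splitlines():
        # Field lines look like "Private_Dirty:        12 kB".
        key, sep, rest = line.partition(":")
        if sep and key in PRIVATE_FIELDS:
            total += int(rest.split()[0])
    return total


def read_private_kib(pid):
    # Hypothetical helper: read the live smaps file of a process (Linux only).
    # Pass "self" to inspect the current process.
    with open("/proc/%s/smaps" % pid) as f:
        return private_kib(f.read())
```

Each call re-reads and re-parses the whole file, so the cost of one sample grows with the number of mappings in the worker.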
On Windows, it uses the PeakPagefileUsage field returned by
GetProcessMemoryInfo(process_handle), polled in a loop with a 1 ms sleep.
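For readers who don't know the Windows API, here is a rough ctypes sketch of reading PeakPagefileUsage. The structure layout follows the Win32 PROCESS_MEMORY_COUNTERS definition; the function name peak_pagefile_usage is hypothetical, and I have not checked this against what perf.py really does.

```python
import ctypes
import sys


class PROCESS_MEMORY_COUNTERS(ctypes.Structure):
    # Field layout of the Win32 PROCESS_MEMORY_COUNTERS structure.
    _fields_ = [
        ("cb", ctypes.c_uint32),                  # DWORD: size of the struct
        ("PageFaultCount", ctypes.c_uint32),      # DWORD
        ("PeakWorkingSetSize", ctypes.c_size_t),  # SIZE_T from here on
        ("WorkingSetSize", ctypes.c_size_t),
        ("QuotaPeakPagedPoolUsage", ctypes.c_size_t),
        ("QuotaPagedPoolUsage", ctypes.c_size_t),
        ("QuotaPeakNonPagedPoolUsage", ctypes.c_size_t),
        ("QuotaNonPagedPoolUsage", ctypes.c_size_t),
        ("PagefileUsage", ctypes.c_size_t),
        ("PeakPagefileUsage", ctypes.c_size_t),
    ]


def peak_pagefile_usage(process_handle):
    """Return PeakPagefileUsage in bytes for a process handle (Windows only)."""
    if sys.platform != "win32":
        raise OSError("GetProcessMemoryInfo is only available on Windows")
    psapi = ctypes.WinDLL("psapi")
    psapi.GetProcessMemoryInfo.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(PROCESS_MEMORY_COUNTERS),
        ctypes.c_uint32,
    ]
    psapi.GetProcessMemoryInfo.restype = ctypes.c_int
    counters = PROCESS_MEMORY_COUNTERS()
    counters.cb = ctypes.sizeof(counters)
    ok = psapi.GetProcessMemoryInfo(
        process_handle, ctypes.byref(counters), counters.cb)
    if not ok:
        raise ctypes.WinError()
    return counters.PeakPagefileUsage
```

Because the kernel maintains the peak itself, a single call at the end of a run should be enough in principle, whatever the polling interval.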
Do you think that the Linux implementation is reliable? What happens
if the worker process only reaches its peak for 1 ms but the main
process (the watcher) reads the memory usage every 10 ms?
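A toy simulation of that concern, with made-up numbers: a hypothetical worker sits at 10 MiB and spikes to 500 MiB for a single millisecond. The watcher only reports the spike if one of its samples happens to land on it.

```python
def observed_peak(usage_at, duration_ms, interval_ms):
    """Peak seen by a watcher that samples usage_at(t) every interval_ms."""
    return max(usage_at(t) for t in range(0, duration_ms, interval_ms))


def usage_at(t_ms):
    # Hypothetical worker: 10 MiB baseline, 500 MiB during millisecond 15 only.
    return 500 if t_ms == 15 else 10


print(observed_peak(usage_at, 100, 1))   # 500: a 1 ms interval catches the spike
print(observed_peak(usage_at, 100, 10))  # 10: a 10 ms interval misses it
```

Real sampling is not aligned to a fixed grid, so in practice the result would vary from run to run rather than being missed every time.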
The exact value probably also depends a lot on how the operating
system computes the memory usage. RSS is very different from PSS
(proportional set size) for example. Linux also has "USS" (unshared
I would prefer to track memory directly in the worker process. On
Windows, reading the peak after each run looks reliable. On Linux, it
is less clear. Should I use a thread reading /proc/self/smaps in a
busy loop?
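A sketch of what such an in-process sampler thread could look like. The class name PeakSampler and its read_usage callback are hypothetical; read_usage would be any function returning the current usage, for example one that parses /proc/self/smaps.

```python
import threading


class PeakSampler(threading.Thread):
    """Background thread that polls read_usage() and remembers the maximum."""

    def __init__(self, read_usage, interval=0.0):
        super().__init__(daemon=True)
        self._read_usage = read_usage
        self._interval = interval  # seconds between samples; 0 means busy loop
        self._stop_event = threading.Event()
        self.peak = 0

    def run(self):
        while True:
            self.peak = max(self.peak, self._read_usage())
            # wait() returns True as soon as stop() has been called.
            if self._stop_event.wait(self._interval):
                break

    def stop(self):
        """Stop sampling, take one last sample, and return the peak."""
        self._stop_event.set()
        self.join()
        self.peak = max(self.peak, self._read_usage())
        return self.peak
```

The worker would call start() before the benchmark and stop() after it. One caveat: in CPython a busy-looping thread competes with the benchmark code for the GIL, so this approach is likely to perturb the timings it runs alongside.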
For me, the most reliable option is to use tracemalloc to get the peak
of the *Python* memory usage. But this module is only available on
Python 3.4 and newer. Another issue is that it slows the code down a
lot (something like 2x slower!).
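For illustration, a minimal sketch of the tracemalloc approach. The wrapper name python_peak_bytes is hypothetical; tracemalloc.start(), get_traced_memory() and stop() are the standard module functions.

```python
import tracemalloc


def python_peak_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) since start().
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

This only counts memory allocated through Python's allocators, so memory allocated directly by C libraries does not show up in the peak.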
I guess that there are two use cases:
- read coarse memory usage but don't hit performance
- read precise memory usage, ignore performance