RFC: PEP 454: Add a new tracemalloc module

Hi,

Antoine Pitrou suggested that I write a PEP to discuss the API of the new tracemalloc module that I proposed to add to Python 3.4. Here it is.

If you prefer to read the HTML version:
http://www.python.org/dev/peps/pep-0454/

See also the documentation of the current implementation of the module:
http://hg.python.org/features/tracemalloc/file/tip/Doc/library/tracemalloc.r...

The documentation contains examples and a short "tutorial".


PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

Add a new ``tracemalloc`` module to trace Python memory allocations.


Rationale
=========

Common debug tools tracing memory allocations read the C filename and line number. Using such a tool to analyze Python memory allocations does not help because most memory allocations are done in the same C function, ``PyMem_Malloc()`` for example.

There are debug tools dedicated to the Python language like ``Heapy`` and ``PySizer``. These projects analyze object types and/or contents. These tools are useful when most of the leaked memory consists of instances of the same type and this type is allocated in only a few functions. The problem arises when the object type is very common, like ``str`` or ``tuple``, and it is hard to identify where these objects are allocated.

Finding reference cycles is also a difficult task. There are different tools to draw a diagram of all references. These tools cannot be used on large applications with thousands of objects because the diagram is too huge to be analyzed manually.


Proposal
========

With PEP 445, it becomes easy to set up a hook on Python memory allocators. The hook can inspect the current Python frame to get the Python filename and line number.

This PEP proposes to add a new ``tracemalloc`` module.
It is a debug tool to trace memory allocations made by Python. The module provides the following information:

* Statistics on Python memory allocations per Python filename and line number: size, number, and average size of allocations
* Differences between two snapshots of Python memory allocations
* Location of a Python memory allocation: size in bytes, Python filename and line number


Command line options
====================

The ``python -m tracemalloc`` command can be used to analyze and compare snapshots. The command takes a list of snapshot filenames and has the following options.

``-g``, ``--group-per-file``

    Group allocations per filename, instead of grouping per line number.

``-n NTRACES``, ``--number NTRACES``

    Number of traces displayed per top (default: 10).

``--first``

    Compare with the first snapshot, instead of comparing with the previous snapshot.

``--include PATTERN``

    Only include filenames matching pattern *PATTERN*. The option can be specified multiple times.

    See ``fnmatch.fnmatch()`` for the syntax of patterns.

``--exclude PATTERN``

    Exclude filenames matching pattern *PATTERN*. The option can be specified multiple times.

    See ``fnmatch.fnmatch()`` for the syntax of patterns.

``-S``, ``--hide-size``

    Hide the size of allocations.

``-C``, ``--hide-count``

    Hide the number of allocations.

``-A``, ``--hide-average``

    Hide the average size of allocations.

``-P PARTS``, ``--filename-parts=PARTS``

    Number of displayed filename parts (default: 3).

``--color``

    Force usage of colors even if ``sys.stdout`` is not a TTY device.

``--no-color``

    Disable colors if ``sys.stdout`` is a TTY device.


API
===

To trace as many Python memory allocations as possible, the module should be enabled as early as possible in your application by calling the ``tracemalloc.enable()`` function, by setting the ``PYTHONTRACEMALLOC`` environment variable to ``1``, or by using the ``-X tracemalloc`` command line option.


Functions
---------

``enable()`` function:

    Start tracing Python memory allocations.
``disable()`` function:

    Stop tracing Python memory allocations and stop the timer started by ``start_timer()``.

``is_enabled()`` function:

    Get the status of the module: ``True`` if it is enabled, ``False`` otherwise.

``get_object_address(obj)`` function:

    Get the address of the memory block of the specified Python object.

``get_object_trace(obj)`` function:

    Get the trace of a Python object *obj* as a ``trace`` instance.

    Return ``None`` if the tracemalloc module did not save the location when the object was allocated, for example if the module was disabled.

``get_process_memory()`` function:

    Get the memory usage of the current process as a meminfo namedtuple with two attributes:

    * ``rss``: Resident Set Size in bytes
    * ``vms``: size of the virtual memory in bytes

    Return ``None`` if the platform is not supported.

    Use the ``psutil`` module if available.

``get_stats()`` function:

    Get statistics on Python memory allocations per Python filename and per Python line number.

    Return a dictionary ``{filename: str -> {line_number: int -> stats: line_stat}}`` where *stats* is a ``line_stat`` instance. *filename* and *line_number* can be ``None``.

    Return an empty dictionary if the tracemalloc module is disabled.

``get_traces(obj)`` function:

    Get all traces of Python memory allocations. Return a dictionary ``{pointer: int -> trace}`` where *trace* is a ``trace`` instance.

    Return an empty dictionary if the ``tracemalloc`` module is disabled.

``start_timer(delay: int, func: callable, args: tuple=(), kwargs: dict={})`` function:

    Start a timer calling ``func(*args, **kwargs)`` every *delay* seconds.

    The timer is based on the Python memory allocator; it is not real time. *func* is called after at least *delay* seconds; it is not called exactly after *delay* seconds if no Python memory allocation occurred.

    If ``start_timer()`` is called twice, previous parameters are replaced. The timer has a resolution of 1 second.
    ``start_timer()`` is used by ``DisplayTop`` and ``TakeSnapshot`` to run a task regularly.

``stop_timer()`` function:

    Stop the timer started by ``start_timer()``.


trace class
-----------

``trace`` class:

    This class represents debug information of an allocated memory block.

``size`` attribute:

    Size in bytes of the memory block.

``filename`` attribute:

    Name of the Python script where the memory block was allocated, ``None`` if unknown.

``lineno`` attribute:

    Line number where the memory block was allocated, ``None`` if unknown.


line_stat class
----------------

``line_stat`` class:

    Statistics on Python memory allocations of a specific line number.

``size`` attribute:

    Total size in bytes of all memory blocks allocated on the line.

``count`` attribute:

    Number of memory blocks allocated on the line.


DisplayTop class
----------------

``DisplayTop(count: int=10, file=sys.stdout)`` class:

    Display the list of the *count* biggest memory allocations into *file*.

``display()`` method:

    Display the top once.

``start(delay: int)`` method:

    Start a task using the ``tracemalloc`` timer to display the top every *delay* seconds.

``stop()`` method:

    Stop the task started by the ``DisplayTop.start()`` method.

``color`` attribute:

    If ``True``, ``display()`` uses color. The default value is ``file.isatty()``.

``compare_with_previous`` attribute:

    If ``True`` (default value), ``display()`` compares with the previous snapshot. If ``False``, compare with the first snapshot.

``filename_parts`` attribute:

    Number of displayed filename parts (int, default: ``3``). Extra parts are replaced with ``"..."``.

``group_per_file`` attribute:

    If ``True``, group memory allocations per Python filename. If ``False`` (default value), group allocations per Python line number.

``show_average`` attribute:

    If ``True`` (default value), ``display()`` shows the average size of allocations.

``show_count`` attribute:

    If ``True`` (default value), ``display()`` shows the number of allocations.
``show_size`` attribute:

    If ``True`` (default value), ``display()`` shows the size of allocations.

``user_data_callback`` attribute:

    Optional callback collecting user data (callable, default: ``None``). See ``Snapshot.create()``.


Snapshot class
--------------

``Snapshot()`` class:

    Snapshot of Python memory allocations.

    Use ``TakeSnapshot`` to take snapshots regularly.

``create(user_data_callback=None)`` method:

    Take a snapshot. If *user_data_callback* is specified, it must be a callable object returning a list of ``(title: str, format: str, value: int)``. *format* must be ``'size'``. The list must always have the same length and the same order to be able to compute differences between values.

    Example: ``[('Video memory', 'size', 234902)]``.

``filter_filenames(patterns: list, include: bool)`` method:

    Remove filenames not matching any pattern of *patterns* if *include* is ``True``, or remove filenames matching a pattern of *patterns* if *include* is ``False`` (exclude). See ``fnmatch.fnmatch()`` for the syntax of a pattern.

``load(filename)`` classmethod:

    Load a snapshot from a file.

``write(filename)`` method:

    Write the snapshot into a file.

``pid`` attribute:

    Identifier of the process which created the snapshot (int).

``process_memory`` attribute:

    Result of the ``get_process_memory()`` function, can be ``None``.

``stats`` attribute:

    Result of the ``get_stats()`` function (dict).

``timestamp`` attribute:

    Creation date and time of the snapshot, ``datetime.datetime`` instance.

``user_data`` attribute:

    Optional list of user data, result of *user_data_callback* in ``Snapshot.create()`` (default: None).


TakeSnapshot class
------------------

``TakeSnapshot`` class:

    Task taking snapshots of Python memory allocations: write them into files. By default, snapshots are written in the current directory.

``start(delay: int)`` method:

    Start a task taking a snapshot every *delay* seconds.

``stop()`` method:

    Stop the task started by the ``TakeSnapshot.start()`` method.
``take_snapshot()`` method:

    Take a snapshot.

``filename_template`` attribute:

    Template (``str``) used to create a filename. The following variables can be used in the template:

    * ``$pid``: identifier of the current process
    * ``$timestamp``: current date and time
    * ``$counter``: counter starting at 1 and incremented at each snapshot

    The default pattern is ``'tracemalloc-$counter.pickle'``.

``user_data_callback`` attribute:

    Optional callback collecting user data (callable, default: ``None``). See ``Snapshot.create()``.


Links
=====

Python issues:

* `#18874: Add a new tracemalloc module to trace Python memory allocations <http://bugs.python.org/issue18874>`_

Similar projects:

* `Meliae: Python Memory Usage Analyzer <https://pypi.python.org/pypi/meliae>`_
* `Guppy-PE: umbrella package combining Heapy and GSL <http://guppy-pe.sourceforge.net/>`_
* `PySizer <http://pysizer.8325.org/>`_: developed for Python 2.4
* `memory_profiler <https://pypi.python.org/pypi/memory_profiler>`_
* `pympler <http://code.google.com/p/pympler/>`_
* `Dozer <https://pypi.python.org/pypi/Dozer>`_: WSGI Middleware version of the CherryPy memory leak debugger
* `objgraph <http://mg.pov.lt/objgraph/>`_
* `caulk <https://github.com/smartfile/caulk/>`_


Copyright
=========

This document has been placed into the public domain.
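
As an illustration of the ``filename_template`` attribute of ``TakeSnapshot`` described above, here is a minimal sketch of how the ``$pid``, ``$timestamp`` and ``$counter`` variables could be expanded. It assumes the standard ``string.Template`` syntax; the helper name ``render_snapshot_filename()`` and the timestamp format are hypothetical, since the PEP specifies neither.

```python
import datetime
import os
import string


def render_snapshot_filename(template, counter, now=None, pid=None):
    """Expand $pid, $timestamp and $counter in a filename template."""
    now = now or datetime.datetime.now()
    pid = os.getpid() if pid is None else pid
    return string.Template(template).substitute(
        pid=pid,
        # Assumption: the PEP does not say how the timestamp is formatted.
        timestamp=now.strftime("%Y%m%d-%H%M%S"),
        counter=counter,
    )


# Default pattern from the PEP, first snapshot.
default_name = render_snapshot_filename("tracemalloc-$counter.pickle", 1)

# A custom pattern using all three variables, with fixed values for clarity.
custom_name = render_snapshot_filename(
    "trace-$pid-$timestamp-$counter.pickle", 2,
    now=datetime.datetime(2013, 9, 3, 19, 27, 0), pid=1234)
```

Including ``$pid`` or ``$timestamp`` in the template is what keeps several processes, or several runs, from overwriting each other's snapshot files.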

``get_object_trace(obj)`` function:
Get the trace of a Python object *obj* as a ``trace`` instance.
Return ``None`` if the tracemalloc module did not save the location when the object was allocated, for example if the module was disabled.
This function and get_traces() can be reused by other debug tools like Heapy and objgraph to add where objects were allocated.
``get_stats()`` function:
Get statistics on Python memory allocations per Python filename and per Python line number.
Return a dictionary ``{filename: str -> {line_number: int -> stats: line_stat}}`` where *stats* is a ``line_stat`` instance. *filename* and *line_number* can be ``None``.
Return an empty dictionary if the tracemalloc module is disabled.
``get_traces(obj)`` function:
Get all traces of Python memory allocations. Return a dictionary ``{pointer: int -> trace}`` where *trace* is a ``trace`` instance.
Return an empty dictionary if the ``tracemalloc`` module is disabled.
get_stats() can be computed from get_traces(), for example:

-----
import pprint, tracemalloc

traces = tracemalloc.get_traces()
stats = {}
for trace in traces.values():
    if trace.filename not in stats:
        stats[trace.filename] = line_stats = {}
    else:
        line_stats = stats[trace.filename]
    if trace.lineno not in line_stats:
        line_stats[trace.lineno] = line_stat = tracemalloc.line_stat((0, 0))
        size = trace.size
        count = 1
    else:
        line_stat = line_stats[trace.lineno]
        size = line_stat.size + trace.size
        count = line_stat.count + 1
    line_stats[trace.lineno] = tracemalloc.line_stat((size, count))

pprint.pprint(stats)
-----

The problem is efficiency. At startup, Python has already allocated more than 20,000 memory blocks:

$ ./python -X tracemalloc -c 'import tracemalloc; print(len(tracemalloc.get_traces()))'
21704

At the end of the Python test suite, Python has allocated more than 500,000 memory blocks. Storing all these traces in a snapshot uses a lot of memory and disk space, and takes CPU time to build the statistics.
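
For readers who want to try the aggregation without the draft module installed, the same grouping can be sketched with stand-in types. ``Trace`` and ``LineStat`` below are hypothetical namedtuples that only mimic the attributes described in the PEP, and the traces are made-up data, not real measurements.

```python
from collections import namedtuple

# Stand-ins for the draft ``trace`` and ``line_stat`` types (assumed shapes).
Trace = namedtuple("Trace", "size filename lineno")
LineStat = namedtuple("LineStat", "size count")


def stats_from_traces(traces):
    """Group {pointer: Trace} into {filename: {lineno: LineStat}}."""
    stats = {}
    for trace in traces.values():
        line_stats = stats.setdefault(trace.filename, {})
        previous = line_stats.get(trace.lineno, LineStat(0, 0))
        # Accumulate the total size and the number of blocks per line.
        line_stats[trace.lineno] = LineStat(previous.size + trace.size,
                                            previous.count + 1)
    return stats


# Hypothetical traces keyed by made-up block addresses.
example_traces = {
    0x1000: Trace(64, "app.py", 10),
    0x2000: Trace(32, "app.py", 10),
    0x3000: Trace(128, "lib.py", None),
}
example_stats = stats_from_traces(example_traces)
```

The loop visits every trace once, so the cost grows with the number of live memory blocks, which is the efficiency concern raised above.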
``start_timer(delay: int, func: callable, args: tuple=(), kwargs: dict={})`` function:
Start a timer calling ``func(*args, **kwargs)`` every *delay* seconds. (...)
If ``start_timer()`` is called twice, previous parameters are replaced. The timer has a resolution of 1 second.
``start_timer()`` is used by ``DisplayTop`` and ``TakeSnapshot`` to run a task regularly.
So DisplayTop and TakeSnapshot cannot be used at the same time. It would be convenient to be able to register more than one function. What do you think?
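
One way such a registry could look is sketched below. This is only an illustration of the idea, not the module's implementation: the ``TaskTimer`` class and its methods are hypothetical, and the clock is passed in explicitly to imitate a timer that is polled from the allocator hook instead of running in real time.

```python
class TaskTimer:
    """Hypothetical registry of periodic tasks, polled rather than real time."""

    def __init__(self):
        # Each entry: [delay, next_deadline, func, args, kwargs]
        self._tasks = []

    def add_task(self, now, delay, func, args=(), kwargs=None):
        self._tasks.append([delay, now + delay, func, args, kwargs or {}])

    def poll(self, now):
        """Run every task whose deadline has passed; return how many ran."""
        ran = 0
        for task in self._tasks:
            delay, deadline, func, args, kwargs = task
            if now >= deadline:
                func(*args, **kwargs)
                task[1] = now + delay  # schedule the next run
                ran += 1
        return ran


calls = []
timer = TaskTimer()
timer.add_task(0, 2, calls.append, ("display top",))
timer.add_task(0, 5, calls.append, ("take snapshot",))
for second in range(1, 7):  # pretend one allocation happens per second
    timer.poll(second)
```

As with the single timer, a task runs after at least *delay* seconds, and only when something polls the registry.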
``trace`` class: This class represents debug information of an allocated memory block.
``size`` attribute: Size in bytes of the memory block.
``filename`` attribute: Name of the Python script where the memory block was allocated, ``None`` if unknown.
``lineno`` attribute: Line number where the memory block was allocated, ``None`` if unknown.

I thought twice about it, and it would be possible to store more than one frame per trace instance, to be able to rebuild a (partial) Python traceback. The hook on the memory allocator has access to the chain of Python frames. The API should be changed to support such an enhancement.
``DisplayTop(count: int=10, file=sys.stdout)`` class: Display the list of the *count* biggest memory allocations into *file*.
(...)
``group_per_file`` attribute:
If ``True``, group memory allocations per Python filename. If ``False`` (default value), group allocation per Python line number.
This attribute is very important. We may add it to the constructor. By the way, the self.stream attribute is not documented.
Snapshot class
--------------
``Snapshot()`` class:
Snapshot of Python memory allocations.
Use ``TakeSnapshot`` to take snapshots regularly.
``create(user_data_callback=None)`` method:
Take a snapshot. If *user_data_callback* is specified, it must be a callable object returning a list of ``(title: str, format: str, value: int)``. *format* must be ``'size'``. The list must always have the same length and the same order to be able to compute differences between values.
Example: ``[('Video memory', 'size', 234902)]``.
(Oops, create() is a class method, not a method.)

Having to call a class method to build an instance of a class is surprising. But I didn't find a way to implement the load() class method otherwise.

The user_data_callback API can be improved. The "format must be size" restriction is not very convenient.

Victor
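
To make the create()/load() question concrete, here is one possible arrangement, offered as a sketch and not as what the PEP implements. ``MiniSnapshot`` is a hypothetical, reduced stand-in for ``Snapshot``; using ``pickle`` for storage is an assumption suggested only by the ``.pickle`` suffix of the default filename template.

```python
import datetime
import os
import pickle
import tempfile


class MiniSnapshot:
    """Hypothetical, reduced stand-in for the draft ``Snapshot`` class."""

    def __init__(self, stats, timestamp, pid):
        # Plain constructor: stores already-collected data, takes no snapshot.
        self.stats = stats
        self.timestamp = timestamp
        self.pid = pid

    @classmethod
    def create(cls, stats):
        # Alternative constructor: record "now" and the current process.
        return cls(dict(stats), datetime.datetime.now(), os.getpid())

    @classmethod
    def load(cls, filename):
        # Alternative constructor: rebuild an instance from a pickle file.
        with open(filename, "rb") as fp:
            data = pickle.load(fp)
        return cls(data["stats"], data["timestamp"], data["pid"])

    def write(self, filename):
        data = {"stats": self.stats, "timestamp": self.timestamp,
                "pid": self.pid}
        with open(filename, "wb") as fp:
            pickle.dump(data, fp)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tracemalloc-1.pickle")
    original = MiniSnapshot.create({"app.py": {10: (96, 2)}})
    original.write(path)
    restored = MiniSnapshot.load(path)
```

In this layout the constructor only stores data, so both class methods can share it: create() collects fresh values and load() reads saved ones.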

``trace`` class: This class represents debug information of an allocated memory block.
``size`` attribute: Size in bytes of the memory block.
``filename`` attribute: Name of the Python script where the memory block was allocated, ``None`` if unknown.
``lineno`` attribute: Line number where the memory block was allocated, ``None`` if unknown.

I thought twice about it, and it would be possible to store more than one frame per trace instance, to be able to rebuild a (partial) Python traceback. The hook on the memory allocator has access to the chain of Python frames. The API should be changed to support such an enhancement.
Oh, it was much easier than expected to retrieve the traceback (maximum 10 frames) instead of only the current frame. I modified the trace class to replace filename and lineno with a new frames attribute, which is a list of frames.

Script example:

---
import tracemalloc, linecache

def g():
    return object()

def f():
    return g()

tracemalloc.enable()
obj = f()
trace = tracemalloc.get_object_trace(obj)
print("Traceback (most recent first):")
for frame in trace.frames:
    print('  File "%s", line %s' % (frame.filename, frame.lineno))
    line = linecache.getline(frame.filename, frame.lineno)
    if line:
        print("    " + line.strip())
---

Output of the script:

---
Traceback (most recent first):
  File "x.py", line 4
    return object()
  File "x.py", line 7
    return g()
  File "x.py", line 10
    obj = f()
---

I updated PEP 454 (added a new frame class, updated the trace class):

frame class
-----------

``frame`` class:

    Trace of a Python frame.

``filename`` attribute (``str``):

    Python filename, ``None`` if unknown.

``lineno`` attribute (``int``):

    Python line number, ``None`` if unknown.


trace class
-----------

``trace`` class:

    This class represents debug information of an allocated memory block.

``size`` attribute (``int``):

    Size in bytes of the memory block.

``frames`` attribute (``list``):

    Traceback where the memory block was allocated as a list of ``frame`` instances (most recent first).

    The list can be empty or incomplete if the tracemalloc module was unable to retrieve the full traceback.

    For efficiency, the traceback is truncated to 10 frames.

Victor
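
The truncation to a fixed number of frames can be imitated in pure Python. The sketch below is not the C hook: it walks the frame chain with ``sys._getframe()`` (a CPython-specific function) and ``f_back``, and ``Frame``, ``capture_frames()`` and ``recurse()`` are hypothetical names used only for illustration.

```python
import sys
from collections import namedtuple

# Stand-in for the draft ``frame`` type (assumed shape).
Frame = namedtuple("Frame", "filename lineno")


def capture_frames(max_frames=10):
    """Return up to max_frames Frame records for the caller, most recent first."""
    frames = []
    current = sys._getframe(1)  # skip capture_frames() itself
    while current is not None and len(frames) < max_frames:
        frames.append(Frame(current.f_code.co_filename, current.f_lineno))
        current = current.f_back  # move to the calling frame
    return frames


def recurse(depth, max_frames):
    """Build a call chain of the given depth, then capture it."""
    if depth == 0:
        return capture_frames(max_frames)
    return recurse(depth - 1, max_frames)


truncated = recurse(20, 10)  # call chain deeper than the limit
shallow = recurse(20, 3)     # a smaller limit keeps fewer frames
```

Capping the walk bounds both the time spent per allocation and the memory stored per trace, at the price of losing the outermost callers.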

On Tue, Sep 3, 2013 at 7:27 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
API
===
To trace as many Python memory allocations as possible, the module should be enabled as early as possible in your application by calling the ``tracemalloc.enable()`` function, by setting the ``PYTHONTRACEMALLOC`` environment variable to ``1``, or by using the ``-X tracemalloc`` command line option.
Functions
---------
``enable()`` function:
Start tracing Python memory allocations.
``disable()`` function:
Stop tracing Python memory allocations and stop the timer started by ``start_timer()``.
``is_enabled()`` function:
Get the status of the module: ``True`` if it is enabled, ``False`` otherwise.
Please mention that this API is similar to that of faulthandler and add a link to faulthandler docs.

2013/9/4 Victor Stinner <victor.stinner@gmail.com>:
http://www.python.org/dev/peps/pep-0454/
PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4
I added a function get_tracemalloc_size() to see how much memory is used by the tracemalloc module itself. Result on the Python test suite:

* 1 frame: +52% (+68%) Python=34 MiB, _tracemalloc=18 MiB, tracemalloc.py=5 MiB
* 10 frames: +155% (+170%) Python=34 MiB, _tracemalloc=53 MiB, tracemalloc.py=5 MiB
* 100 frames: +1273% (+1283%) Python=30 MiB, _tracemalloc=382 MiB, tracemalloc.py=6 MiB

On a small application and a computer with gigabytes of memory, it may not matter. In a big application on an embedded device, it can be a blocker for using tracemalloc. So I added filters (on the filename and line number) directly in the C module:

``add_filter(include: bool, filename: str, lineno: int=None)`` function:

    Add a filter. If *include* is ``True``, only trace memory blocks allocated in a file with a name matching *filename*. If *include* is ``False``, don't trace memory blocks allocated in a file with a name matching *filename*.

    The match is done using *filename* as a prefix. For example, ``'/usr/bin/'`` only matches files in the ``/usr/bin`` directory. The ``.pyc`` and ``.pyo`` suffixes are automatically replaced with ``.py`` when matching the filename.

    *lineno* is a line number. If *lineno* is ``None`` or less than ``1``, it matches any line number.

``clear_filters()`` function:

    Reset the filter list.

``get_filters()`` function:

    Get the filters as a list of ``(include: bool, filename: str, lineno: int)`` tuples.

    If *lineno* is ``None``, a filter matches any line number.

By default, the filename of the Python tracemalloc module (``tracemalloc.py``) is excluded.

Right now, the match is done using PyUnicode_Tailmatch(). It is not convenient. I will see if it is possible to implement the wildcard character "*" matching any string, so the API would be closer to Snapshot.filter_filenames() (which uses fnmatch.fnmatch).

Victor
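
The matching rules described above can be sketched in pure Python. The real filters live in the C module, so this is only an illustration. The function names are hypothetical, and the way several filters combine (pass at least one include filter if any exist, and hit no exclude filter) is an assumption, since the message does not spell it out.

```python
def normalize(filename):
    """Map .pyc/.pyo names to .py before matching."""
    if filename.endswith((".pyc", ".pyo")):
        return filename[:-1]
    return filename


def filter_matches(flt, filename, lineno):
    """flt is an (include, filename_prefix, lineno) tuple."""
    _, prefix, wanted_line = flt
    # The filter's filename is used as a prefix of the traced filename.
    if not normalize(filename).startswith(normalize(prefix)):
        return False
    # None or a value below 1 matches any line number.
    if wanted_line is None or wanted_line < 1:
        return True
    return lineno == wanted_line


def is_traced(filters, filename, lineno):
    """Assumed rule: match an include filter (if any) and no exclude filter."""
    includes = [f for f in filters if f[0]]
    excludes = [f for f in filters if not f[0]]
    if any(filter_matches(f, filename, lineno) for f in excludes):
        return False
    if includes:
        return any(filter_matches(f, filename, lineno) for f in includes)
    return True


# Hypothetical filter list: trace the standard library, except tracemalloc.py.
filters = [
    (True, "/usr/lib/python3.4/", None),
    (False, "/usr/lib/python3.4/tracemalloc.py", None),
]
```

A prefix match cannot express patterns such as "any ``test_*.py`` file", which is the limitation that wildcard support would address.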

It seems like most of this could live on PyPI for a while so the API can get hashed out in use? If that's not the case, is it because the PEP 445 API isn't rich enough?

Janzert

2013/9/8 Janzert <janzert@janzert.com>:
It seems like most of this could live on PyPI for a while so the API can get hashed out in use?
pytracemalloc has been available on PyPI for six months. The only feedback I got was from someone trying to compile it on Windows (which is complex because of the dependency on glib; I don't think they succeeded in installing it on Windows). I guess that I didn't get more feedback because it requires patching and recompiling Python, which is not trivial. I expect more feedback on python-dev with a working implementation (on hg.python.org) and a PEP.

The version available on PyPI works and should be enough for most use cases to identify a memory leak.

Gregory P. Smith asked me if it would be possible to get more frames (filename and line number) of the Python traceback, instead of just one frame (the last frame). I implemented it, but now I have new issues (memory usage of the tracemalloc module itself), so I'm working on filters implemented directly in the C module (_tracemalloc). It was already possible to filter traces from a snapshot read from disk.

I still have some tasks in my TODO list to finish the API and the implementation. When I am done, I will post a new version of the PEP on python-dev.
If that's not the case is it because the PEP 445 API isn't rich enough?
The PEP 445 API is only designed to allow the development of new tools like failmalloc or tracemalloc, without adding overhead if such a debug tool is not used. The tracemalloc module reads the current Python traceback (filename and line number), which is "not directly" accessible from PyMem_Malloc(). I hope that existing tools like Heapy and Meliae will benefit directly from tracemalloc instead of having to develop their own memory allocator hooks to get the same information (Python traceback).

Victor
participants (3):
- Alexander Belopolsky
- Janzert
- Victor Stinner