[Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

Kristján Valur Jónsson kristjan at ccpgames.com
Wed Oct 23 22:06:42 CEST 2013


This might be a good place to make some comments.
I have discussed some of this in private with Victor, but wanted to make them here, for the record.

Mainly, I agree with removing code.  I'd like to go further, since in my experience, the less code in C, the better.

1) really, all that is required in terms of data is the tracemalloc.get_traces() function.  Further, it _need_ not return addresses since they are not required for analysis.  It is sufficient for it to return a list of (traceback, size, count) tuples.   I understand that the get_stats function is useful for quick information so it can be kept, although it provides no added information, only convenience.
2) get_object_address() and get_trace(address) functions seem redundant.  All that is required is get_object_traceback(), I think.
3) set_traceback_limit().  Truncating tracebacks is bad.  Particularly if it is truncated at the top end of the callstack, because then the information loses cohesion, namely, the common connection point, the root.  If traceback limits are required, I suggest being able to specify that we truncate the leaf-end of the tracebacks.
4) add_filter().  This is unnecessary. Information can be filtered on the python side.  Defining Filter as a C type is not necessary.  Similarly, module level filter functions can be dropped.
5) Filter, Snapshot, GroupedStats, Statistics:  These classes, if required, can be implemented in a .py module.
6) Snapshot dump/load():  It is unusual to see load and save functions taking filenames in a python module, and a module implementing its own file IO.  I have suggested simply to add Pickle support.  Alternatively, support file-like objects or bytes (loads/dumps)
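
To illustrate points 1, 4 and 5, here is a minimal sketch of computing per-line statistics on the Python side. It assumes get_traces() returned a plain list of (traceback, size, count) tuples as suggested above, with size being the total for the tuple. The sample data and the helper name stats_by_line are made up for illustration and are not part of the PEP.

```python
from collections import defaultdict

# Hypothetical sample data in the shape suggested in point 1:
# a list of (traceback, size, count) tuples, where a traceback is a
# tuple of (filename, lineno) frames, most recent frame first.
traces = [
    ((("x.py", 7), ("main.py", 11)), 32, 1),
    ((("x.py", 7), ("main.py", 20)), 64, 2),
    ((("y.py", 3), ("main.py", 11)), 100, 1),
]


def stats_by_line(traces):
    """Aggregate (size, count) per (filename, lineno) of the most recent frame."""
    stats = defaultdict(lambda: [0, 0])
    for traceback, size, count in traces:
        # Fall back to (None, None) when no frame was recorded.
        key = traceback[0] if traceback else (None, None)
        stats[key][0] += size
        stats[key][1] += count
    return {key: tuple(value) for key, value in stats.items()}


result = stats_by_line(traces)
for (filename, lineno), (size, count) in sorted(result.items()):
    print(f"{filename}:{lineno}: {size} bytes in {count} blocks")
```

Filtering works the same way: a list comprehension over the tuples before aggregating would stand in for a C-level filter.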

My experience is that performance and memory use hardly ever matter when you are doing diagnostic analysis of a program.  By definition, you are examining your program in a lab and you can afford 2 times, or 10 times, the memory use, and a slowdown of the program by a factor of 2 to 10.  I think it might be premature to move all of the statistics and analysis into the PEP and into C, because a) it assumes the need to optimize and b) it sets the specification in stone, before the module gets the chance to be honed by actual real-world use cases.

I'd also like to point out (just to say "I told you so" :) ) that this module is precisely the reason I suggested we include "const char *file, int lineno" in the API for PEP 445, because that would allow us, in debug builds, to get one extra stack level, namely the position of the actual C allocation in the python source.

If the above sounds negative, then that's not the intent.  I'm really happy Victor is putting in this effort here and I know this will be an essential tool for the future Python developer.  Those that brave the jump to version 3, that is :)

Cheers,

Kristján

________________________________________
From: Python-Dev [python-dev-bounces+kristjan=ccpgames.com at python.org] on behalf of Victor Stinner [victor.stinner at gmail.com]
Sent: 23 October 2013 18:25
To: Python Dev
Subject: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

Hi,

I was at the restaurant with Charles-François and Antoine yesterday to
discuss PEP 454 (tracemalloc). They gave me a lot of advice to
improve the PEP. Most remarks were requests to remove code :-) I also
improved surprising/strange APIs (like the infamous
GroupedStats.compare_to(None)).

HTML version:
http://www.python.org/dev/peps/pep-0454/

See also the documentation of the implementation, especially examples:
http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html#examples


Major changes:

* GroupedStats.compare_to()/statistics() now returns a list of
Statistic instances instead of a tuple with 5 items
* StatsDiff class has been removed
* Metrics have been removed
* Remove Filter.match*() methods
* Replace get_object_trace() function with get_object_traceback()
* More complete list of prior work. There are 11 Python projects to
debug memory leaks! I mentioned that PySizer implemented something
similar to tracemalloc 8 years ago. I also rewrote the Rationale
section
* Rename some classes, attributes and functions

Mercurial log of the PEP:
http://hg.python.org/peps/log/f851d4a1622a/pep-0454.txt



PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

This PEP proposes to add a new ``tracemalloc`` module to trace memory
blocks allocated by Python.


Rationale
=========

Classic generic tools like Valgrind can get the C traceback where a
memory block was allocated. Using such tools to analyze Python memory
allocations does not help because most memory blocks are allocated in
the same C function, in ``PyMem_Malloc()`` for example. Moreover, Python
has an allocator for small objects called "pymalloc" which keeps free
blocks for efficiency. This is not well handled by these tools.

There are debug tools dedicated to the Python language like ``Heapy``,
``Pympler`` and ``Meliae`` which list all live objects using the
``gc`` module (functions like ``gc.get_objects()``,
``gc.get_referrers()`` and ``gc.get_referents()``), compute their size
(e.g. using ``sys.getsizeof()``) and group objects by type. These tools
provide a better estimation of the memory usage of an application.  They
are useful when most memory leaks are instances of the same type and
this type is only instantiated in a few functions. Problems arise when
the object type is very common like ``str`` or ``tuple``, and it is hard
to identify where these objects are instantiated.

Finding reference cycles is also a difficult problem.  There are
different tools to draw a diagram of all references.  These tools
cannot be used on large applications with thousands of objects because
the diagram is too large to be analyzed manually.


Proposal
========

Using the customized allocation API from PEP 445, it becomes easy to
set up a hook on Python memory allocators. A hook can inspect Python
internals to retrieve Python tracebacks. The idea of getting the current
traceback comes from the faulthandler module. The faulthandler module
dumps the tracebacks of all Python threads on a crash; here the idea is
to get the traceback of the current Python thread when a memory block is
allocated by Python.

This PEP proposes to add a new ``tracemalloc`` module, as a debug tool
to trace memory blocks allocated by Python. The module provides the
following information:

* Statistics on allocated memory blocks per filename and per line
  number: total size, number and average size of allocated memory blocks
* Computed differences between two snapshots to detect memory leaks
* Traceback where a memory block was allocated

The API of the tracemalloc module is similar to the API of the
faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()``
functions, an environment variable (``PYTHONFAULTHANDLER`` and
``PYTHONTRACEMALLOC``), and a ``-X`` command line option (``-X
faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module
<http://docs.python.org/3/library/faulthandler.html>`_.

The idea of tracing memory allocations is not new. It was first
implemented in the PySizer project in 2005. PySizer was implemented
differently: the traceback was stored in frame objects and some Python
types linked the trace with the name of the object type. The PySizer
patch on CPython adds overhead on performance and memory footprint, even
if PySizer is not used. tracemalloc attaches a traceback to the
underlying layer, to memory blocks, and has no overhead when the module
is disabled.

The tracemalloc module has been written for CPython. Other
implementations of Python may not be able to provide it.


API
===

To trace most memory blocks allocated by Python, the module should be
enabled as early as possible by setting the ``PYTHONTRACEMALLOC``
environment variable to ``1``, or by using ``-X tracemalloc`` command
line option. The ``tracemalloc.enable()`` function can be called at
runtime to start tracing Python memory allocations.

By default, a trace of an allocated memory block only stores the most
recent frame (1 frame). To store 25 frames at startup: set the
``PYTHONTRACEMALLOC`` environment variable to ``25``, or use the ``-X
tracemalloc=25`` command line option. The ``set_traceback_limit()``
function can be used at runtime to set the limit.

By default, Python memory blocks allocated in the ``tracemalloc`` module
are ignored using a filter. Use ``clear_filters()`` to also trace these
memory allocations.
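
A minimal sketch of this workflow follows. It is written against the ``tracemalloc`` module as shipped in Python 3.4 and later, not against the draft API described in this document: the shipped module uses ``start(nframe)``, ``stop()`` and ``take_snapshot()`` where this draft describes ``enable()``, ``set_traceback_limit()``, ``disable()`` and ``Snapshot.create()``. The allocation in the middle is only there to give the tracer something to record.

```python
import tracemalloc

# Store up to 25 frames per trace. This is the released API; the draft
# in this document uses enable() and set_traceback_limit() instead.
tracemalloc.start(25)
limit = tracemalloc.get_traceback_limit()

# Allocate something so that there is memory to trace.
data = [bytes(1000) for _ in range(100)]

snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Group the traced blocks by source line and show the biggest ones.
top = snapshot.statistics("lineno")
for stat in top[:3]:
    print(stat)
print("traced: %d bytes, peak: %d bytes" % (current, peak))
```

Starting the tracer from the ``PYTHONTRACEMALLOC`` environment variable or the ``-X tracemalloc`` option instead would also capture allocations made during interpreter startup.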


Main Functions
--------------

``reset()`` function:

    Clear traces and statistics on Python memory allocations.

    See also ``disable()``.


``disable()`` function:

    Stop tracing Python memory allocations and clear traces and
    statistics.

    See also ``enable()`` and ``is_enabled()`` functions.


``enable()`` function:

    Start tracing Python memory allocations.

    See also ``disable()`` and ``is_enabled()`` functions.


``get_stats()`` function:

    Get statistics on traced Python memory blocks as a dictionary
    ``{filename (str): {line_number (int): stats}}`` where *stats* is a
    ``(size: int, count: int)`` tuple; *filename* and *line_number* can
    be ``None``.

    *size* is the total size in bytes of all memory blocks allocated on
    the line, and *count* is the number of memory blocks allocated on the
    line.

    Return an empty dictionary if the ``tracemalloc`` module is
    disabled.

    See also the ``get_traces()`` function.


``get_traced_memory()`` function:

    Get the current size and maximum size of memory blocks traced by the
    ``tracemalloc`` module as a tuple: ``(size: int, max_size: int)``.


``get_tracemalloc_memory()`` function:

    Get the memory usage in bytes of the ``tracemalloc`` module used
    internally to trace memory allocations. Return an ``int``.


``is_enabled()`` function:

    ``True`` if the ``tracemalloc`` module is tracing Python memory
    allocations, ``False`` otherwise.

    See also ``disable()`` and ``enable()`` functions.


Trace Functions
---------------

When Python allocates a memory block, ``tracemalloc`` attaches a "trace" to
it to store information on it: its size in bytes and the traceback where the
allocation occurred.

The following functions give access to these traces. A trace is a ``(size: int,
traceback)`` tuple. *size* is the size of the memory block in bytes.
*traceback* is a tuple of frames sorted from the most recent to the oldest
frame, limited to ``get_traceback_limit()`` frames. A frame is
a ``(filename: str, lineno: int)`` tuple where *filename* and *lineno* can be
``None``.

Example of trace: ``(32, (('x.py', 7), ('x.py', 11)))``.  The memory block has
a size of 32 bytes and was allocated at ``x.py:7``, a line called from line
``x.py:11``.
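
As a small sketch of how such a trace can be read, the following unpacks the example tuple and prints its frames from the most recent to the oldest. The helper name ``format_trace`` and the ``<unknown>`` and ``?`` placeholders for ``None`` values are illustrative choices, not part of this PEP.

```python
# Trace from the example above: (size, traceback), where the traceback
# is a tuple of (filename, lineno) frames, most recent frame first.
trace = (32, (("x.py", 7), ("x.py", 11)))


def format_trace(trace):
    """Return a human-readable description of a (size, traceback) trace."""
    size, traceback = trace
    lines = ["%d bytes allocated at:" % size]
    for filename, lineno in traceback:
        # Both fields may be None when the information is unavailable.
        location = "%s:%s" % (filename or "<unknown>", lineno or "?")
        lines.append("  " + location)
    return "\n".join(lines)


text = format_trace(trace)
print(text)
```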


``get_object_address(obj)`` function:

    Get the address of the main memory block of the specified Python
    object.

    A Python object can be composed of multiple memory blocks, the
    function only returns the address of the main memory block. For
    example, items of ``dict`` and ``set`` containers are stored in a
    second memory block.

    See also ``get_object_traceback()`` and ``gc.get_referrers()``
    functions.

    .. note::

       The builtin function ``id()`` returns a different address for
       objects tracked by the garbage collector, because ``id()``
       returns the address after the garbage collector header.


``get_object_traceback(obj)`` function:

    Get the traceback where the Python object *obj* was allocated.
    Return a tuple of ``(filename: str, lineno: int)`` tuples,
    *filename* and *lineno* can be ``None``.

    Return ``None`` if the ``tracemalloc`` module did not trace the
    allocation of the object.

    See also ``get_object_address()``, ``gc.get_referrers()`` and
    ``sys.getsizeof()`` functions.


``get_trace(address)`` function:

    Get the trace of a memory block allocated by Python. Return a tuple:
    ``(size: int, traceback)``, *traceback* is a tuple of ``(filename:
    str, lineno: int)`` tuples, *filename* and *lineno* can be ``None``.

    Return ``None`` if the ``tracemalloc`` module did not trace the
    allocation of the memory block.

    See also ``get_object_traceback()``, ``get_stats()`` and
    ``get_traces()`` functions.


``get_traceback_limit()`` function:

    Get the maximum number of frames stored in the traceback of a trace.

    By default, a trace of an allocated memory block only stores the
    most recent frame: the limit is ``1``. This limit is enough to get
    statistics using ``get_stats()``.

    Use the ``set_traceback_limit()`` function to change the limit.


``get_traces()`` function:

    Get traces of all memory blocks allocated by Python. Return a
    dictionary: ``{address (int): trace}``, *trace* is a ``(size: int,
    traceback)`` tuple, *traceback* is a tuple of ``(filename: str,
    lineno: int)`` tuples, *filename* and *lineno* can be ``None``.

    Return an empty dictionary if the ``tracemalloc`` module is
    disabled.

    See also ``get_object_traceback()``, ``get_stats()`` and
    ``get_trace()`` functions.


``set_traceback_limit(nframe: int)`` function:

    Set the maximum number of frames stored in the traceback of a trace.

    Storing the traceback of each memory allocation has a significant
    overhead on the memory usage. Use the ``get_tracemalloc_memory()``
    function to measure the overhead and the ``add_filter()`` function
    to select which memory allocations are traced.

    Use the ``get_traceback_limit()`` function to get the current limit.

    The ``PYTHONTRACEMALLOC`` environment variable and the ``-X``
    ``tracemalloc=NFRAME`` command line option can be used to set a
    limit at startup.


Filter Functions
----------------

``add_filter(filter)`` function:

    Add a new filter on Python memory allocations, *filter* is a
    ``Filter`` instance.

    All inclusive filters are applied at once, a memory allocation is
    only ignored if no inclusive filters match its trace. A memory
    allocation is ignored if at least one exclusive filter matches its
    trace.

    The new filter is not applied on already collected traces. Use the
    ``reset()`` function to ensure that all traces match the new filter.
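
The combination rule above can be sketched in pure Python as follows. This is an illustration under several assumptions, not the module's implementation: a filter is modelled as a hypothetical ``(include, filename_pattern, lineno)`` tuple, only the most recent frame is checked, and ``fnmatch.fnmatchcase()`` stands in for the pattern matching (it also gives ``?`` and ``[...]`` a special meaning, which the ``Filter`` class does not mention). The ``.pyc`` and Windows path normalisation is not reproduced.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for Filter: (include, filename_pattern, lineno).
# Only the most recent frame is checked (traceback=False in the draft).


def frame_matches(flt, frame):
    """Return True if the frame matches the filter's pattern and line."""
    include, pattern, lineno = flt
    filename, frame_lineno = frame
    if filename is None or not fnmatchcase(filename, pattern):
        return False
    # A lineno of None or less than 1 matches any line number.
    if lineno is None or lineno < 1:
        return True
    return frame_lineno == lineno


def is_traced(filters, frame):
    """Any exclusive match ignores the allocation; if inclusive filters
    exist, at least one of them must match."""
    inclusive = [f for f in filters if f[0]]
    exclusive = [f for f in filters if not f[0]]
    if any(frame_matches(f, frame) for f in exclusive):
        return False
    if inclusive and not any(frame_matches(f, frame) for f in inclusive):
        return False
    return True


filters = [
    (True, "/app/*", None),           # trace only files under /app/
    (False, "/app/vendor/*", None),   # ... except vendored code
]
print(is_traced(filters, ("/app/main.py", 10)))
print(is_traced(filters, ("/app/vendor/lib.py", 3)))
print(is_traced(filters, ("/usr/lib/os.py", 42)))
```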

``add_inclusive_filter(filename_pattern: str, lineno: int=None,
traceback: bool=False)`` function:

    Add an inclusive filter: helper for the ``add_filter()`` function
    creating a ``Filter`` instance with the ``Filter.include`` attribute
    set to ``True``.

    The ``*`` wildcard character can be used in *filename_pattern* to match
    any substring, including the empty string.

    Example: ``tracemalloc.add_inclusive_filter(subprocess.__file__)``
    only includes memory blocks allocated by the ``subprocess`` module.


``add_exclusive_filter(filename_pattern: str, lineno: int=None,
traceback: bool=False)`` function:

    Add an exclusive filter: helper for the ``add_filter()`` function
    creating a ``Filter`` instance with the ``Filter.include`` attribute
    set to ``False``.

    The ``*`` wildcard character can be used in *filename_pattern* to match
    any substring, including the empty string.

    Example: ``tracemalloc.add_exclusive_filter(tracemalloc.__file__)``
    ignores memory blocks allocated by the ``tracemalloc`` module.


``clear_filters()`` function:

    Clear the filter list.

    See also the ``get_filters()`` function.


``get_filters()`` function:

    Get the filters on Python memory allocations. Return a list of
    ``Filter`` instances.

    By default, there is one exclusive filter to ignore Python memory
    blocks allocated by the ``tracemalloc`` module.

    See also the ``clear_filters()`` function.


Filter
------

``Filter(include: bool, filename_pattern: str, lineno: int=None,
traceback: bool=False)`` class:

    Filter to select which memory allocations are traced. Filters can be
    used to reduce the memory usage of the ``tracemalloc`` module, which
    can be read using the ``get_tracemalloc_memory()`` function.

    The ``*`` wildcard character can be used in *filename_pattern* to match
    any substring, including the empty string. The ``.pyc`` and ``.pyo``
    file extensions are replaced with ``.py``. On Windows, the
    comparison is case insensitive and the alternative separator ``/``
    is replaced with the standard separator ``\``.

``include`` attribute:

    If *include* is ``True``, only trace memory blocks allocated in a
    file with a name matching ``filename_pattern`` at line number
    ``lineno``.

    If *include* is ``False``, ignore memory blocks allocated in a file
    with a name matching ``filename_pattern`` at line number ``lineno``.

``lineno`` attribute:

    Line number (``int``) of the filter. If *lineno* is ``None`` or
    less than ``1``, the filter matches any line number.

``filename_pattern`` attribute:

    Filename pattern (``str``) of the filter.

``traceback`` attribute:

    If *traceback* is ``True``, all frames of the traceback are checked.
    If *traceback* is ``False``, only the most recent frame is checked.

    This attribute is ignored if the traceback limit is less than ``2``.
    See the ``get_traceback_limit()`` function.


GroupedStats
------------

``GroupedStats(timestamp: datetime.datetime, traceback_limit: int,
stats: dict, key_type: str, cumulative: bool)`` class:

    Top of allocated memory blocks grouped by *key_type* as a
    dictionary.

    The ``Snapshot.group_by()`` method creates a ``GroupedStats``
    instance.

``compare_to(old_stats: GroupedStats, sort=True)`` method:

    Compare statistics to an older ``GroupedStats`` instance. Return a
    list of ``Statistic`` instances.

    The result is sorted from the biggest to the smallest by
    ``abs(size_diff)``, *size*, ``abs(count_diff)``, *count* and then by
    *key*. Set the *sort* parameter to ``False`` to get the list
    unsorted.

    ``None`` values in keys are replaced with an empty string for
    filenames or zero for line numbers, because ``str`` and ``int``
    cannot be compared to ``None``.

    See also the ``statistics()`` method.
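
The comparison and sort order can be sketched in pure Python as below. The ``Statistic`` fields match the class described in this PEP, but ``compare_stats`` is a hypothetical helper working on plain ``{key: (size, count)}`` dictionaries. Two details are assumptions: keys present only in the older statistics are reported with a size and count of zero, and ties are broken by ascending key.

```python
from collections import namedtuple

# Same fields as the Statistic class described in this PEP.
Statistic = namedtuple("Statistic", "key size size_diff count count_diff")


def compare_stats(new, old):
    """Diff two {key: (size, count)} dicts and sort the result."""
    result = []
    for key in set(new) | set(old):
        size, count = new.get(key, (0, 0))
        old_size, old_count = old.get(key, (0, 0))
        result.append(
            Statistic(key, size, size - old_size, count, count - old_count))
    # Biggest first on the numeric fields, then ascending by key.
    result.sort(key=lambda s: (-abs(s.size_diff), -s.size,
                               -abs(s.count_diff), -s.count, s.key))
    return result


old = {"a.py": (100, 2), "b.py": (50, 1)}
new = {"a.py": (300, 5), "c.py": (10, 1)}
diff = compare_stats(new, old)
for stat in diff:
    print(stat)
```

Sorting on ``abs(size_diff)`` first puts large frees next to large allocations, so a file that released a lot of memory is as visible as one that leaked.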

``statistics(sort=True)`` method:

    Get statistics as a list of ``Statistic`` instances.
    ``Statistic.size_diff`` and ``Statistic.count_diff`` attributes are
    set to zero.

    The result is sorted from the biggest to the smallest by
    ``abs(size_diff)``, *size*, ``abs(count_diff)``, *count* and then by
    *key*. Set the *sort* parameter to ``False`` to get the list
    unsorted.

    ``None`` values in keys are replaced with an empty string for
    filenames or zero for line numbers, because ``str`` and ``int``
    cannot be compared to ``None``.

    See also the ``compare_to()`` method.

``cumulative`` attribute:

    If ``True``, size and count of memory blocks of all frames of the
    traceback of a trace were cumulated, not only the most recent frame.

``key_type`` attribute:

    Determine how memory allocations were grouped: see
    ``Snapshot.group_by()`` for the available values.

``stats`` attribute:

    Dictionary ``{key: (size: int, count: int)}`` where the type of
    *key* depends on the ``key_type`` attribute.

    See the ``Snapshot.group_by()`` method.

``traceback_limit`` attribute:

    Maximum number of frames stored in the traceback of ``traces``,
    result of the ``get_traceback_limit()`` function.

``timestamp`` attribute:

    Creation date and time of the snapshot, ``datetime.datetime``
    instance.


Snapshot
--------

``Snapshot(timestamp: datetime.datetime, traceback_limit: int, stats:
dict=None, traces: dict=None)`` class:

    Snapshot of statistics and traces of memory blocks allocated by
    Python.

``apply_filters(filters)`` method:

    Apply filters on the ``traces`` and ``stats`` dictionaries,
    *filters* is a list of ``Filter`` instances.


``create(traces=False)`` classmethod:

    Take a snapshot of statistics and traces of memory blocks allocated
    by Python.

    If *traces* is ``True``, ``get_traces()`` is called and its result
    is stored in the ``Snapshot.traces`` attribute. This attribute
    contains more information than ``Snapshot.stats`` and uses more
    memory and more disk space. If *traces* is ``False``,
    ``Snapshot.traces`` is set to ``None``.

    Tracebacks of traces are limited to ``traceback_limit`` frames. Call
    ``set_traceback_limit()`` before calling ``Snapshot.create()`` to
    store more frames.

    The ``tracemalloc`` module must be enabled to take a snapshot, see
    the ``enable()`` function.

``dump(filename)`` method:

    Write the snapshot into a file.

    Use ``load()`` to reload the snapshot.


``load(filename)`` classmethod:

    Load a snapshot from a file.

    See also ``dump()``.
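
The file format is not specified here. As one possible approach (an assumption, not something this PEP defines), snapshot data made of plain dictionaries and tuples can be serialized with the standard ``pickle`` module to any binary file-like object. The dictionary layout and the helper names ``dump_snapshot`` and ``load_snapshot`` below are hypothetical.

```python
import datetime
import io
import pickle

# Hypothetical snapshot content using the shapes described in this PEP:
# stats is {filename: {lineno: (size, count)}} and
# traces is {address: (size, traceback)}.
snapshot_data = {
    "timestamp": datetime.datetime(2013, 10, 23, 18, 25),
    "traceback_limit": 2,
    "stats": {"x.py": {7: (32, 1)}},
    "traces": {0x7F00: (32, (("x.py", 7), ("x.py", 11)))},
}


def dump_snapshot(data, fileobj):
    """Write snapshot data to a binary file-like object."""
    pickle.dump(data, fileobj, protocol=pickle.HIGHEST_PROTOCOL)


def load_snapshot(fileobj):
    """Read snapshot data back from a binary file-like object."""
    return pickle.load(fileobj)


buffer = io.BytesIO()
dump_snapshot(snapshot_data, buffer)
buffer.seek(0)
restored = load_snapshot(buffer)
print(restored == snapshot_data)
```

Pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code.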


``group_by(key_type: str, cumulative: bool=False)`` method:

    Group statistics by *key_type* as a ``GroupedStats`` instance:

    =====================  ===================================  ================================
    key_type               description                          type
    =====================  ===================================  ================================
    ``'filename'``         filename                             ``str``
    ``'line'``             filename and line number             ``(filename: str, lineno: int)``
    ``'address'``          memory block address                 ``int``
    ``'traceback'``        memory block address with traceback  ``(address: int, traceback)``
    =====================  ===================================  ================================

    The ``traceback`` type is a tuple of ``(filename: str, lineno:
    int)`` tuples, *filename* and *lineno* can be ``None``.

    If *cumulative* is ``True``, cumulate size and count of memory
    blocks of all frames of the traceback of a trace, not only the most
    recent frame. The *cumulative* parameter is set to ``False`` if
    *key_type* is ``'address'``, or if the traceback limit is less than
    ``2``.
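
A rough pure-Python sketch of grouping with and without *cumulative* follows. It covers only the ``'filename'`` and ``'line'`` key types, works on a hypothetical ``traces`` dictionary in the ``{address: (size, traceback)}`` shape, and assumes that a block is counted at most once per key even when the same file or line appears several times in its traceback; this PEP does not state how repeated frames are counted.

```python
from collections import defaultdict

# Hypothetical traces in the {address: (size, traceback)} shape
# returned by get_traces(); frames are most recent first.
traces = {
    0x1000: (32, (("x.py", 7), ("main.py", 11))),
    0x2000: (64, (("x.py", 9), ("main.py", 11))),
    0x3000: (16, (("y.py", 3), ("x.py", 7))),
}


def group_by(traces, key_type, cumulative=False):
    """Return {key: (size, count)} grouped by 'filename' or 'line'."""
    if key_type not in ("filename", "line"):
        raise ValueError("unsupported key_type: %r" % (key_type,))
    stats = defaultdict(lambda: [0, 0])
    for size, traceback in traces.values():
        # Without cumulative, only the most recent frame is considered.
        frames = traceback if cumulative else traceback[:1]
        if key_type == "filename":
            keys = {filename for filename, lineno in frames}
        else:
            keys = set(frames)
        # Using a set counts each block at most once per key.
        for key in keys:
            stats[key][0] += size
            stats[key][1] += 1
    return {key: tuple(value) for key, value in stats.items()}


by_file = group_by(traces, "filename")
by_file_cumulative = group_by(traces, "filename", cumulative=True)
print(by_file)
print(by_file_cumulative)
```

With *cumulative* set, ``main.py`` appears in the result even though it allocated nothing directly, because its calls led to allocations in ``x.py``.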


``stats`` attribute:

    Statistics on traced Python memory, result of the ``get_stats()``
    function.

``traceback_limit`` attribute:

    Maximum number of frames stored in the traceback of ``traces``,
    result of the ``get_traceback_limit()`` function.

``traces`` attribute:

    Traces of Python memory allocations, result of the ``get_traces()``
    function, can be ``None``.

``timestamp`` attribute:

    Creation date and time of the snapshot, ``datetime.datetime``
    instance.


Statistic
---------

``Statistic(key, size, size_diff, count, count_diff)`` class:

    Statistic on memory allocations.

    ``GroupedStats.compare_to()``  and ``GroupedStats.statistics()``
    return a list of ``Statistic`` instances.

``key`` attribute:

    Key identifying the statistic. The key type depends on
    ``GroupedStats.key_type``, see the ``Snapshot.group_by()`` method.


``count`` attribute:

    Number of memory blocks (``int``).

``count_diff`` attribute:

    Difference of number of memory blocks (``int``).

``size`` attribute:

    Total size of memory blocks in bytes (``int``).

``size_diff`` attribute:

    Difference of total size of memory blocks in bytes (``int``).


Prior Work
==========

* `Python Memory Validator
  <http://www.softwareverify.com/python/memory/index.html>`_ (2005-2013):
  commercial Python memory validator developed by Software Verification.
  It uses the Python Reflection API.
* `PySizer <http://pysizer.8325.org/>`_: Google Summer of Code 2005 project by
  Nick Smallbone.
* `Heapy
  <http://guppy-pe.sourceforge.net/>`_ (2006-2013):
  part of the Guppy-PE project written by Sverker Nilsson.
* Draft PEP: `Support Tracking Low-Level Memory Usage in CPython
  <http://svn.python.org/projects/python/branches/bcannon-sandboxing/PEP.txt>`_
  (Brett Cannon, 2006)
* Muppy: project developed in 2008 by Robert Schuppenies.
* `asizeof <http://code.activestate.com/recipes/546530/>`_:
  a pure Python module to estimate the size of objects by Jean
  Brouwers (2008).
* `Heapmonitor <http://www.scons.org/wiki/LudwigHaehne/HeapMonitor>`_:
  It provides facilities to size individual objects and can track all objects
  of certain classes. It was developed in 2008 by Ludwig Haehne.
* `Pympler <http://code.google.com/p/pympler/>`_ (2008-2011):
  project based on asizeof, muppy and HeapMonitor
* `objgraph <http://mg.pov.lt/objgraph/>`_ (2008-2012)
* `Dozer <https://pypi.python.org/pypi/Dozer>`_: WSGI Middleware version
  of the CherryPy memory leak debugger, written by Marius Gedminas (2008-2013)
* `Meliae
  <https://pypi.python.org/pypi/meliae>`_:
  Python Memory Usage Analyzer developed by John A Meinel since 2009
* `caulk <https://github.com/smartfile/caulk/>`_: written by Ben Timby in 2012
* `memory_profiler <https://pypi.python.org/pypi/memory_profiler>`_:
  written by Fabian Pedregosa (2011-2013)

See also `Pympler Related Work
<http://pythonhosted.org/Pympler/related.html>`_.


Links
=====

tracemalloc:

* `#18874: Add a new tracemalloc module to trace Python
  memory allocations <http://bugs.python.org/issue18874>`_
* `pytracemalloc on PyPI
  <https://pypi.python.org/pypi/pytracemalloc>`_

Copyright
=========

This document has been placed in the public domain.