[Python-Dev] RFC: PEP 454: Add a new tracemalloc module

Wed Sep 4 01:56:21 CEST 2013

> ``get_object_trace(obj)`` function:
>
>     Get the trace of a Python object *obj* as a ``trace`` instance.
>
>     Return ``None`` if the tracemalloc module did not save the location
>     when the object was allocated, for example if the module was
>     disabled.

This function and get_traces() can be reused by other debug tools like
Heapy and objgraph to report where objects were allocated.

> ``get_stats()`` function:
>
>     Get statistics on Python memory allocations per Python filename and
>     per Python line number.
>
>     Return a dictionary
>     ``{filename: str -> {line_number: int -> stats: line_stat}}``
>     where *stats* is a ``line_stat`` instance. *filename* and
>     *line_number* can be ``None``.
>
>     Return an empty dictionary if the tracemalloc module is disabled.
>
> ``get_traces(obj)`` function:
>
>    Get all traces of Python memory allocations.
>    Return a dictionary ``{pointer: int -> trace}`` where *trace*
>    is a ``trace`` instance.
>
>    Return an empty dictionary if the ``tracemalloc`` module is disabled.

get_stats() can be computed from get_traces(), for example:
-----
import pprint, tracemalloc

traces = tracemalloc.get_traces()
stats = {}
for trace in traces.values():
    # Per-file dictionary: {line_number -> line_stat}
    line_stats = stats.setdefault(trace.filename, {})
    if trace.lineno in line_stats:
        # Add this block to the totals already recorded for the line
        line_stat = line_stats[trace.lineno]
        size = line_stat.size + trace.size
        count = line_stat.count + 1
    else:
        # First block seen for this line
        size = trace.size
        count = 1
    line_stats[trace.lineno] = tracemalloc.line_stat((size, count))

pprint.pprint(stats)
-----

The problem is efficiency. At startup, Python has already allocated
more than 20,000 memory blocks:

$ ./python -X tracemalloc -c 'import tracemalloc; print(len(tracemalloc.get_traces()))'
21704

At the end of the Python test suite, Python allocated more than
500,000 memory blocks.

Storing all these traces in a snapshot uses a lot of memory and disk
space, and building the statistics from them costs CPU time.
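
To illustrate the size difference, here is a small sketch that does not
use tracemalloc at all. The allocation events are made up, and the
names (events, traces, stats) are only for illustration. It compares
keeping one trace per memory block with updating per-line totals as
each block is allocated:

```python
from collections import defaultdict

# Hypothetical allocation events: (pointer, filename, lineno, size).
events = [(0x1000 + i * 16, "app.py", 10 + i % 3, 32) for i in range(1000)]

# Approach 1: keep one trace per memory block.
traces = {ptr: (filename, lineno, size)
          for ptr, filename, lineno, size in events}

# Approach 2: update per-line totals as each block is allocated.
# Maps (filename, lineno) -> [total size, block count].
stats = defaultdict(lambda: [0, 0])
for ptr, filename, lineno, size in events:
    entry = stats[(filename, lineno)]
    entry[0] += size
    entry[1] += 1

print(len(traces), "traces versus", len(stats), "line statistics")
```

The number of traces grows with the number of live blocks, while the
number of statistics entries is bounded by the number of distinct
source lines that allocate memory.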

> ``start_timer(delay: int, func: callable, args: tuple=(), kwargs:
> dict={})`` function:
>
>     Start a timer calling ``func(*args, **kwargs)`` every *delay*
>     seconds. (...)
>
>     If ``start_timer()`` is called twice, previous parameters are
>     replaced.  The timer has a resolution of 1 second.
>
>     ``start_timer()`` is used by ``DisplayTop`` and ``TakeSnapshot`` to
>     run a task regularly.

So DisplayTop and TakeSnapshot cannot be used at the same time. It
would be convenient to be able to register more than one function.
What do you think?
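
A minimal sketch of what registering several functions could look
like. TaskTimer, register() and tick() are hypothetical names, not part
of the proposed API, and the sketch is driven by explicit tick() calls
instead of the memory allocator hook, so it only shows the bookkeeping:

```python
class TaskTimer:
    """Hypothetical registry: several tasks, each with its own delay."""

    def __init__(self):
        # Each entry is [delay, next_deadline, func, args, kwargs].
        self._tasks = []

    def register(self, delay, func, args=(), kwargs=None, now=0):
        """Add a task without replacing the ones already registered."""
        self._tasks.append([delay, now + delay, func, args, kwargs or {}])

    def tick(self, now):
        """Run every task whose deadline has passed (times in seconds)."""
        for task in self._tasks:
            delay, deadline, func, args, kwargs = task
            if now >= deadline:
                func(*args, **kwargs)
                task[1] = now + delay


calls = []
timer = TaskTimer()
timer.register(1, calls.append, ("display_top",))
timer.register(3, calls.append, ("take_snapshot",))
for second in range(1, 4):
    timer.tick(second)
print(calls)
```

With such a registry, a display task and a snapshot task could run
side by side, each with its own delay.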

> ``trace`` class:
>     This class represents debug information of an allocated memory block.
>
> ``size`` attribute:
>     Size in bytes of the memory block.
> ``filename`` attribute:
>     Name of the Python script where the memory block was allocated,
>     ``None`` if unknown.
> ``lineno`` attribute:
>     Line number where the memory block was allocated, ``None`` if
>     unknown.

On second thought, it would be possible to store more than one frame
per trace instance, to be able to rebuild a (partial) Python traceback.
The hook on the memory allocator has access to the chain of Python
frames. The API should be changed to support such an enhancement.
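
As a rough illustration of walking the chain of frames, here is a
pure-Python sketch. The real hook would live in C; capture_frames() and
its max_frames parameter are hypothetical names, and sys._getframe() is
a CPython implementation detail:

```python
import sys


def capture_frames(max_frames=3):
    """Return up to max_frames (filename, lineno) pairs, most recent first."""
    frames = []
    frame = sys._getframe(1)  # skip capture_frames() itself
    while frame is not None and len(frames) < max_frames:
        frames.append((frame.f_code.co_filename, frame.f_lineno))
        frame = frame.f_back
    return frames


def inner():
    return capture_frames(max_frames=2)


def outer():
    return inner()


# Two entries: the call site in inner(), then the call site in outer().
partial_traceback = outer()
print(partial_traceback)
```

Limiting the number of stored frames keeps the cost per memory block
bounded, at the price of a traceback that may be truncated.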

> ``DisplayTop(count: int=10, file=sys.stdout)`` class:
>     Display the list of the *count* biggest memory allocations into
>     *file*.
> (...)
> ``group_per_file`` attribute:
>
>     If ``True``, group memory allocations per Python filename. If
>     ``False`` (default value), group allocation per Python line number.

This attribute is very important. We may add it to the constructor.

By the way, the self.stream attribute is not documented.

> Snapshot class
> --------------
>
> ``Snapshot()`` class:
>
>     Snapshot of Python memory allocations.
>
>     Use ``TakeSnapshot`` to take snapshots regularly.
>
> ``create(user_data_callback=None)`` method:
>
>     Take a snapshot. If *user_data_callback* is specified, it must be a
>     callable object returning a list of
>     ``(title: str, format: str, value: int)``.
>     *format* must be ``'size'``. The list must always have the same
>     length and the same order to be able to compute differences between
>     values.
>
>     Example: ``[('Video memory', 'size', 234902)]``.

(Oops, create() is a class method, not a method.)

Having to call a class method to build an instance of a class is
surprising. But I didn't find a way to implement the load() class
method otherwise.
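
For reference, the pattern looks roughly like this. It is only a
sketch: the constructor argument, the write() method, the collect_stats
parameter and the use of pickle are assumptions made for the example,
not the actual implementation:

```python
import io
import pickle


class Snapshot:
    def __init__(self, stats):
        # The plain constructor only stores data that is already built.
        self.stats = stats

    @classmethod
    def create(cls, collect_stats):
        """Build a snapshot from live data (collect_stats is hypothetical)."""
        return cls(collect_stats())

    @classmethod
    def load(cls, file):
        """Build a snapshot from data previously saved with write()."""
        return cls(pickle.load(file))

    def write(self, file):
        pickle.dump(self.stats, file)


def fake_stats():
    # Stand-in for the real statistics: {filename: {lineno: (size, count)}}.
    return {"example.py": {10: (4096, 2)}}


snapshot = Snapshot.create(fake_stats)
buffer = io.BytesIO()
snapshot.write(buffer)
buffer.seek(0)
restored = Snapshot.load(buffer)
print(restored.stats)
```

Both create() and load() are alternate constructors: each gathers the
data from a different source and then hands it to the same __init__().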

The user_data_callback API can be improved. The "format must be
'size'" restriction is not very convenient.

Victor