[Python-Dev] PEP 454: add a new tracemalloc module (second round)

Tue Sep 17 12:36:03 CEST 2013

2013/9/17 Victor Stinner <victor.stinner at gmail.com>:
> Issue tracking the implementation:
> http://bugs.python.org/issue18874

If you want to test the implementation, you can try the following repository:
http://hg.python.org/features/tracemalloc

Or apply the patch attached to issue #18874 to the Python default
branch. Compile Python and use the "-X tracemalloc" command line option
to enable the module at startup. Then you can play with
tracemalloc.DisplayTopTask.display() and
tracemalloc.TakeSnapshot.take_snapshot().

To create Python snapshots easily, modify Lib/test/regrtest.py to use
the following block:

import tracemalloc

take = tracemalloc.TakeSnapshot()
# Optional: also store the traces (slower, see below).
# tracemalloc.set_traceback_limit(15); take.with_traces = True
take.filename_template = "/tmp/tracemalloc-$pid-$counter.pickle"
take.start(10)

And then start:

./python -X tracemalloc -m test

"take.with_traces = True" is slower but required if you want to use
cumulative views, show the traceback, or group traces by address. Use
a lower traceback limit if the test suite is too slow.

When you get a snapshot file, you can analyze it using:

./python -m tracemalloc /tmp/tracemalloc-1564-0001.pickle

Add --help to see all options. I like the --traceback option.
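
The snapshot file name above looks like an expansion of the $pid and
$counter placeholders of filename_template. A minimal sketch of such an
expansion with string.Template follows. expand_template() is a
hypothetical helper, and the zero-padded four-digit counter is only
inferred from the example file name, not taken from the patch:

```python
import os
from string import Template


def expand_template(template, counter, pid=None):
    """Expand $pid and $counter placeholders in a file name template.

    Hypothetical helper: the zero-padded counter is an assumption based
    on the example file name, not on the tracemalloc patch.
    """
    if pid is None:
        pid = os.getpid()
    return Template(template).substitute(pid=pid, counter="%04d" % counter)


name = expand_template("/tmp/tracemalloc-$pid-$counter.pickle", 1, pid=1564)
print(name)
```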

> ``get_filters()`` function:
>
>     Get the filters on Python memory allocations as list of ``Filter``
>     instances.

I am not sure whether to add a Filters class which would contain a
list of filters. The logic to check whether a list of filters matches
is non-trivial: you have to split inclusive and exclusive filters and
take care of empty lists of inclusive/exclusive filters. See the code
of Snapshot.apply_filters() for an example.
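
To give an idea of the kind of logic involved, here is a minimal
sketch. The Filter tuple and match_filters() are hypothetical
stand-ins, and the semantics are an assumption: an exclusive match
always rejects, and an empty list of inclusive filters accepts
everything. The real code is in Snapshot.apply_filters() and may
differ:

```python
import fnmatch
from collections import namedtuple

# Hypothetical stand-in for the Filter class: `include` selects the kind
# of filter, `pattern` is a file name pattern matched with fnmatch.
Filter = namedtuple("Filter", ["include", "pattern"])


def match_filters(filters, filename):
    """Return True if *filename* passes the list of filters."""
    inclusive = [f for f in filters if f.include]
    exclusive = [f for f in filters if not f.include]

    # Any matching exclusive filter rejects the file name.
    if any(fnmatch.fnmatchcase(filename, f.pattern) for f in exclusive):
        return False
    # An empty list of inclusive filters accepts everything that is left.
    if not inclusive:
        return True
    # Otherwise at least one inclusive filter has to match.
    return any(fnmatch.fnmatchcase(filename, f.pattern) for f in inclusive)
```

A Filters class would mostly be a place to hide this split from the
caller.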

> ``get_object_trace(obj)`` function:
>
>     Get the trace of a Python object *obj* as a ``Trace`` instance.
>
>     The function only returns the trace of the memory block directly
>     holding the object. The ``size`` attribute of the trace is smaller
>     than the total size of the object if the object is composed of more
>     than one memory block.
>
>     Return ``None`` if the ``tracemalloc`` module did not trace the
>     allocation of the object.
>
>     See also ``gc.get_referrers()`` and ``sys.getsizeof()`` functions.

The function can be seen as a lie because it does not count all bytes
of an object (as explained in the doc above). Maybe it should be
renamed to "get_trace(address)" to avoid the confusion.

> DisplayTop class
> ----------------

Oh, I forgot to document the new "previous_top_stats" attribute. It is
used to compare two snapshots.

> DisplayTopTask class
> --------------------
>
> ``start(delay: int)`` method:
>
>     Start a task using the ``start_timer()`` function calling the
>     ``display()`` method every *delay* seconds.

I should probably repeat here that only one timer can be used at a
time, so only one DisplayTopTask or one TakeSnapshot instance can be
active at once.

It's a design choice to keep start_timer() simple: there is no need
for a complex scheduler in such a simple debug tool.

You *can* run the two tasks at the same time by writing your own function:

def mytask(top_task, snapshot_task):
    top_task.display()
    snapshot_task.take_snapshot()

tracemalloc.start_timer(10, mytask, top_task, snapshot_task)
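
The same idea can be generalized to any number of tasks. A small
sketch, where combine_tasks() is a hypothetical helper (not part of
the module) and the lambdas stand in for top_task.display and
snapshot_task.take_snapshot:

```python
def combine_tasks(*tasks):
    """Return one callable that runs each of *tasks* in order."""
    def run_all():
        for task in tasks:
            task()
    return run_all


# Stand-ins for top_task.display and snapshot_task.take_snapshot.
calls = []
task = combine_tasks(lambda: calls.append("display"),
                     lambda: calls.append("take_snapshot"))
task()
print(calls)
```

The combined callable would then be the single function handed to
start_timer(), so the one-timer restriction still holds.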

> Snapshot class
> --------------
>
> ``create(\*, with_traces=False, with_stats=True,
> user_data_callback=None)`` classmethod:

It's the only function using keyword-only parameters. I don't know
whether it's good practice that should be used on other methods, or
whether it should be avoided.
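
For readers not used to the syntax, here is what the bare "*" does.
The create() function below is only a stand-in with the same parameter
list as Snapshot.create(); it is a plain function, not the classmethod
from the patch:

```python
def create(*, with_traces=False, with_stats=True, user_data_callback=None):
    """Stand-in with the same keyword-only parameters as Snapshot.create()."""
    return (with_traces, with_stats, user_data_callback)


# Keyword arguments are accepted.
result = create(with_traces=True)

# Positional arguments are rejected with a TypeError.
try:
    create(True)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```

The benefit is that a call like create(True, False) cannot silently
mix up the two boolean flags; the cost is a more verbose call.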

>     *user_data_callback* is an optional callable object. Its result
>     should be serializable by the ``pickle`` module, or
>     ``Snapshot.write()`` would fail.  If *user_data_callback* is set, it
>     is called and the result is stored in the ``Snapshot.user_data``
>     attribute. Otherwise, ``Snapshot.user_data`` is set to ``None``.

The idea is to attach arbitrary data to a snapshot. Examples:

* size of Python caches: cache of linecache and re modules
* size of the internal Unicode intern dict
* gc.get_stats()
* gc.get_count()
* len(gc.get_objects())
* ("tracemalloc_size" should maybe be moved to the user_data)
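
A sketch of what such a callback could look like, using only the gc
functions listed above. collect_user_data() is a hypothetical name and
the dict layout is just an illustration; the only requirement stated
in the PEP text is that the result can be pickled:

```python
import gc
import pickle


def collect_user_data():
    """Hypothetical user_data_callback gathering a few GC numbers."""
    return {
        "gc_count": gc.get_count(),           # tuple of three counters
        "gc_stats": gc.get_stats(),           # list of per-generation dicts
        "gc_objects": len(gc.get_objects()),  # number of tracked objects
    }


user_data = collect_user_data()
# The result has to survive a pickle round trip, otherwise writing the
# snapshot to disk would fail.
restored = pickle.loads(pickle.dumps(user_data))
```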

I am unsure whether to use a dictionary for user_data. The problem is to
decide how to display such data in DisplayTop. For example, gc.get_count()
is a number whereas tracemalloc_size is a size in bytes (it should be
formatted using kB, MB, etc. suffixes).
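
One possible answer, as a rough sketch only: store a kind next to each
value so the display code knows how to format it. The (name, value,
kind) layout, format_size() and format_user_data() are all
hypothetical, and 1 kB = 1024 bytes is an assumption:

```python
def format_size(size):
    """Format a size in bytes with a B, kB, MB or GB suffix."""
    if abs(size) < 1024:
        return "%d B" % size
    for unit in ("kB", "MB"):
        size /= 1024.0
        if abs(size) < 1024:
            return "%.1f %s" % (size, unit)
    return "%.1f GB" % (size / 1024.0)


def format_user_data(user_data):
    """Format (name, value, kind) entries; kind is 'size' or 'count'."""
    lines = []
    for name, value, kind in user_data:
        # Sizes get a unit suffix, plain counts are shown as-is.
        text = format_size(value) if kind == "size" else str(value)
        lines.append("%s: %s" % (name, text))
    return lines
```

With a plain dictionary of values, this kind information would be
missing, which is the display problem described above.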

What do you think?

Victor