[Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!
Victor Stinner
victor.stinner at gmail.com
Thu Oct 31 02:35:25 CET 2013
New update of the PEP taking various remarks into account; a quick
sketch of the renamed API follows the list of changes:
* Remove the GroupedStats class and Snapshot.group_by(): replaced with a
  new Snapshot.statistics() method which combines the features of both
* Rename reset() to clear_traces() and explain how to get traces
  before clearing them
* Snapshot.apply_filters() now returns a new Snapshot instance
* Rename Filter.include to Filter.inclusive
* Rename Filter.traceback to Filter.all_frames
* Add a section "Log calls to the memory allocator"
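Here is how the renamed API fits together (a rough sketch based on the
current draft; exact signatures may still change):

    import tracemalloc

    tracemalloc.start()
    # ... run the code to profile ...
    snapshot = tracemalloc.take_snapshot()

    # Filter.inclusive and Filter.all_frames are the renamed attributes;
    # apply_filters() now returns a new Snapshot instance
    filters = [tracemalloc.Filter(inclusive=True, filename_pattern="myapp/*")]
    snapshot = snapshot.apply_filters(filters)

    # statistics() replaces GroupedStats and Snapshot.group_by()
    for stat in snapshot.statistics('lineno')[:10]:
        print(stat)

    # clear_traces() (formerly reset()) drops the traces collected so far
    tracemalloc.clear_traces()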
Thanks Jim, Charles-François and Kristjan for your feedback!
Here is the new section; tell me if it sounds good. I didn't implement
logging just to compare performance: "slower than" is based on previous
experience with very basic logging code. Tell me if you disagree.
@Kristjan: I understood that you implemented a tool to log calls on a
PlayStation 3 and send them over the network. How do you process so
much data (I computed 29 GB/hour)? Do you log all calls, or only a few
of them?
Rejected Alternatives
=====================
Log calls to the memory allocator
---------------------------------
A different approach is to log calls to the ``malloc()``, ``realloc()``
and ``free()`` functions. Calls can be logged into a file or sent to
another computer through the network. Example of a log entry: name of
the function, size of the memory block, address of the memory block,
Python traceback where the allocation occurred, timestamp.
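As a rough illustration (the field layout below is only an assumption,
chosen to match the 32-byte entry size used in the estimate further
down), such a fixed-size record could be written like this::

    import struct, time

    # Hypothetical 32-byte record: operation (1=malloc, 2=realloc, 3=free),
    # requested size, block address, timestamp and the code address of the
    # top frame of the Python traceback
    RECORD = struct.Struct("<B3xIQdQ")   # 1+3+4+8+8+8 = 32 bytes

    def log_malloc(log_file, size, address, top_frame):
        log_file.write(RECORD.pack(1, size, address, time.time(), top_frame))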
Logs cannot be used directly: getting the current status of the memory
requires parsing all previous log entries. For example, it is not
possible to directly get the traceback of a Python object, as
``get_object_traceback(obj)`` does with traces.
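As an illustration, rebuilding the set of blocks still allocated at a
given point means replaying every record; a naive sketch (names and
record format are hypothetical)::

    def live_blocks(records):
        """Replay (op, address, size, traceback) records to rebuild the
        mapping of blocks still allocated at the end of the log.  A real
        tool would also have to handle realloc() moving a block."""
        blocks = {}
        for op, address, size, traceback in records:
            if op == "malloc":
                blocks[address] = (size, traceback)
            elif op == "free":
                blocks.pop(address, None)
        return blocks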
Python uses many objects with a very short lifetime, and so calls the
memory allocators very frequently. It has an allocator optimized for
small objects (less than 512 bytes) with a short lifetime. For example,
the Python test suite calls ``malloc()``, ``realloc()`` or ``free()``
270,000 times per second on average. If the size of a log entry is 32
bytes, logging produces 8.2 MB per second or 29.0 GB per hour.
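The estimate is simple arithmetic (using MiB/GiB units)::

    calls_per_second = 270000
    entry_size = 32                               # bytes per log entry
    bytes_per_second = calls_per_second * entry_size
    print(bytes_per_second / 1024.0 ** 2)         # ~8.2 MB per second
    print(bytes_per_second * 3600 / 1024.0 ** 3)  # ~29.0 GB per hour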
The alternative was rejected because it is less efficient and has fewer
features. Parsing logs in a different process or on a different computer
is slower than maintaining traces of allocated memory blocks in the same
process.
Victor