It's similar, but not quite the same -- I was just trying to see if I could build a neatly Pythonic library to do the conversion. The CPU profilers are basically building a dict from (filename, lineno, funcname) to a tuple (from a comment in profile.py):

    [0] = The number of times this function was called, not counting direct
          or indirect recursion,
    [1] = Number of times this function appears on the stack, minus one
    [2] = Total time spent internal to this function
    [3] = Cumulative time that this function was present on the stack.  In
          non-recursive functions, this is the total execution time from start
          to finish of each invocation of a function, including time spent in
          all subfunctions.
    [4] = A dictionary indicating for each function name, the number of times
          it was called by us.

pstats serializes this dict in a particular format which various other tools, like gprof2dot, can read. The challenge in translating is that building [4], or handling recursion for [0] and [3], really requires instrumentation at the CPU trace points as well, which is probably a good answer to my original question of why not. :)
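
For reference, here is a minimal sketch of what that dict looks like from the pstats side. The assumptions: fib is a made-up recursive function, and Stats.stats is the in-memory attribute that holds the dict (widely used by tools, but not prominently documented). The five-tuple pstats exposes is closely related to the one in the profile.py comment above, though its second element is the total call count rather than the stack-depth counter.

```python
import cProfile
import pstats


def fib(n):
    """Deliberately recursive so primitive and total call counts differ."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


profiler = cProfile.Profile()
profiler.runcall(fib, 10)

# Stats.stats is the dict that Stats.dump_stats() marshals to disk.
stats = pstats.Stats(profiler).stats

# Keys are (filename, lineno, funcname); keep only our hypothetical fib.
fib_entries = {key: value for key, value in stats.items() if key[2] == "fib"}

for (filename, lineno, funcname), (cc, nc, tt, ct, callers) in fib_entries.items():
    print(f"{funcname} ({filename}:{lineno})")
    print(f"  primitive calls={cc}, total calls={nc}")
    print(f"  internal time={tt:.6f}s, cumulative time={ct:.6f}s")
    print(f"  callers={list(callers)}")
```

The callers entry is the part that has no obvious counterpart in a tracemalloc trace, since it records call edges rather than whole stacks.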

However, there are other profiling formats which are used outside the Python community, have good tooling support, and could be much easier to deal with; for example, there's the pprof format, which is almost ludicrously versatile; it's meant for profiling both compiled and interpreted languages, so it's very flexible as to what constitutes a "line." 

So if I have the time, and knowing that there's no intrinsic thing to fear in all of this, I'll see if I can implement a pprof translator for tracemalloc snapshots. 
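
A rough sketch of what the first step of such a translator might look like, using only the documented tracemalloc API: flatten a snapshot into (stack, bytes, allocation count) samples. The helper name snapshot_to_samples and the sample layout are hypothetical, and nothing here touches the actual pprof encoding.

```python
import tracemalloc


def snapshot_to_samples(snapshot):
    """Flatten a snapshot into (stack, size_in_bytes, block_count) tuples.

    Hypothetical helper: each stack is a tuple of (filename, lineno) pairs,
    which is roughly the shape a stack-based profile format wants.
    """
    samples = []
    for stat in snapshot.statistics("traceback"):
        stack = tuple((frame.filename, frame.lineno) for frame in stat.traceback)
        samples.append((stack, stat.size, stat.count))
    return samples


tracemalloc.start(25)  # keep up to 25 frames per allocation
payload = [bytes(1000) for _ in range(100)]  # something to show up in the snapshot
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

samples = snapshot_to_samples(snapshot)
for stack, size, count in samples[:3]:
    print(f"{size} bytes in {count} blocks, {len(stack)} frames deep")
```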


Although while I have you here, I do have a further question about how tracemalloc works: If I'm reading the code correctly, traces get removed by tracemalloc when objects are freed, which means that at equilibrium (e.g. at the end of a function) the trace would show just the data which leaked. That's very useful in most cases, but I'm trying to hunt down a situation where memory usage is transiently spiking -- which might be due to something being actively used, or to something building up and overwhelming the GC, or to evil elves in the CPU for all I can tell so far. Would it be completely insane for tracemalloc to have a mode where it either records frees separately (e.g. as a malloc of negative space, at the trace where the free is happening), or where it simply ignores frees altogether?
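
To illustrate the behavior in question, here is a small sketch under my reading of the current API: a 10 MB bytearray stands in for the transient spike, and the variable names are made up. The snapshot taken after the free no longer carries the allocation, while get_traced_memory() still reports the peak as a bare number with no traceback attached.

```python
import tracemalloc


def traced_total(snapshot):
    """Sum the sizes of all traces still present in a snapshot."""
    return sum(stat.size for stat in snapshot.statistics("lineno"))


tracemalloc.start()

buffer = bytearray(10_000_000)  # stand-in for the transient spike
snapshot_during = tracemalloc.take_snapshot()

del buffer  # the trace for the bytearray is dropped here
snapshot_after = tracemalloc.take_snapshot()

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

total_during = traced_total(snapshot_during)
total_after = traced_total(snapshot_after)

print(f"during spike: {total_during} bytes traced")
print(f"after free:   {total_after} bytes traced")
print(f"peak seen by tracemalloc: {peak} bytes (no traceback attached)")
```

So unless a snapshot happens to be taken during the spike, the only evidence left is the peak figure.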

On Thu, Jun 27, 2019 at 3:08 PM Victor Stinner <vstinner@redhat.com> wrote:
Hi,

I designed tracemalloc with Charles-François Natali in PEP 454. The
API is a lightweight abstraction on top of the internal C structures
used by the C _tracemalloc module which is designed to minimize the
memory footprint.

I'm not aware of the pstats format. Adding a new
tracemalloc.dump_pstats() function looks like a good idea. Does pstats
allow attaching arbitrary data to a traceback? The root structure of
tracemalloc is basically the tuple (size: int, traceback) (trace_t
structure in C).

Victor

Le jeu. 27 juin 2019 à 21:03, Yonatan Zunger <zunger@humu.com> a écrit :
>
>
> Hi everyone,
>
> Something occurred to me while trying to analyze code today: profile and cProfile emit their data in pstats format, which various tools and libraries consume. tracemalloc, on the other hand, uses a completely separate format which nonetheless contains similar data. In fact, in many non-Python applications I've worked in, heap and CPU profiles were always emitted in identical formats, which allowed things like visual representations of stack traces where memory is allocated, and these have proven quite useful in practice and allowed lots of sharing of tools across many applications.
>
> Is there a particular design reason why these formats are different in Python right now? Would it make sense to consider allowing them to match, e.g. having a tracemalloc.dump_pstats() method?
>
> Yonatan
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3JFFWGJ57LQRZI3CVJXF5P7NYRCEWCJB/



--
Night gathers, and now my watch begins. It shall not end until my death.