Hi folks, Every time I do some Python profiling (with cProfile) the API feels kinda baroque, and I have to write ~10 line helper functions to do what I want. For example, if I want the return value of running some function and the profiling output as a string (e.g., to send as part of a web response), I have to do something like this: import cProfile import io import pstats def profile(func, *args, **kwargs): profiler = cProfile.Profile() result = profiler.runcall(func, *args, **kwargs) stream = io.StringIO() stats = pstats.Stats(profiler, stream=stream) stats.sort_stats('cumulative') stats.print_stats(strip_dirs=True) return (result, stream.getvalue()) Something like that might be a useful function in its own right, but I took a look at the current API, and also an open issue that addresses some of this (https://bugs.python.org/issue9285 last touched almost 4 years ago), and came up with the following: 1) Add a format_stats() function to Profile to give the profiling results as a string (kind of parallel to format_* vs print_* functions in the "traceback" module). Signature would format_stats(self, *restrictions, sort='stdname', strip_dirs=False). 2) Add __enter__ and __exit__ to Profile so you can use it in a "with" statement. 3) Add a top-level runcall() function to the cProfile (and profile) modules. This isn't particularly useful for me, but it'd make the module easier to use from the command line, and it's one of the API improvements over at issue 9285. Other things in issue 9285 that I don't think are a good idea: the separate runblock() context manager (there should be only one way to do it, and I think "with Profile()" is great), and the @profile decorator (I really don't see the use of this -- you don't always want to profile a function when calling it, only it certain contexts). My code implementing the above three things as a separate module (subclassing Profile) is copied below. I think the above additions are non-controversial improvements -- do you think I should make a patch to get this into the standard library? New issue or add it to 9285? Another feature that I wanted and would be useful for a lot of folks, I think, would be some way to fetch the results as proper Python objects, rather than as a file/string. Maybe a Profile.get_results() function that returns a ProfileResult namedtuple which has total time and number of calls etc, as well as a list of ProfileEntry namedtuples that have the data for each function. Thoughts on that (before any bike-shedding on the exact API)? Currently folks who need this data have to use undocumented parts of Profile like .stats and .fcn_list, for example the Pyramid debug toolbar extracts this data so it can render it in an HTML web page: https://github.com/Pylons/pyramid_debugtoolbar/blob/ed406d7f3c8581458c2e7bdf... Thanks! Ben # BELOW is my profileutils.py module """Code profiling utilities.
def foo(n): ... return sum(sum(range(i)) for i in range(n))
with Profile() as profiler: ... result = foo(5) print(profiler.format_stats(3, sort='nfl')) # doctest: +ELLIPSIS 15 function calls (10 primitive calls) in 0.000 seconds <BLANKLINE> Ordered by: name/file/line List reduced from 5 to 3 due to restriction <3> <BLANKLINE> ncalls tottime percall cumtime percall filename:lineno(function) 6/1 0.000 0.000 0.000 0.000 {built-in method builtins.sum} 6 0.000 0.000 0.000 0.000 <...>:2(<genexpr>) 1 0.000 0.000 0.000 0.000 {method 'disable' ...} <BLANKLINE> <BLANKLINE> <BLANKLINE> result 10
result = runcall(foo, 10) # doctest: +ELLIPSIS 24 function calls (14 primitive calls) in 0.000 seconds <BLANKLINE> Ordered by: standard name <BLANKLINE> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.000 0.000 <...>:1(foo) 11 0.000 0.000 0.000 0.000 <...>:2(<genexpr>) 11/1 0.000 0.000 0.000 0.000 {built-in method builtins.sum} 1 0.000 0.000 0.000 0.000 {method 'disable' ...} <BLANKLINE> <BLANKLINE> result 120 """
import cProfile import io import pstats VALID_SORTS = [ 'calls', 'cumtime', 'cumulative', 'file', 'filename', 'line', 'module', 'name', 'ncalls', 'nfl', 'pcalls', 'stdname', 'time', 'tottime', ] class Profile(cProfile.Profile): def format_stats(self, *restrictions, sort='stdname', strip_dirs=False): """Format stats report but return as string instead of printing to stdout. "restrictions" are as per Stats.print_stats(). Entries are sorted as per Stats.sort_stats(sort), and directories are stripped from filenames if strip_dirs is True. """ sort_keys = (sort,) if isinstance(sort, (int, str)) else sort stream = io.StringIO() stats = pstats.Stats(self, stream=stream) stats.sort_stats(*sort_keys) if strip_dirs: stats.strip_dirs() stats.print_stats(*restrictions) return stream.getvalue() # Define __enter__ and __exit__ to enable "with statement" support def __enter__(self): self.enable() return self def __exit__(self, exc_type, exc_value, exc_traceback): self.disable() def runcall(func, *args, **kwargs): """Profile running func(*args, **kwargs) and print stats to stdout.""" profiler = cProfile.Profile() result = profiler.runcall(func, *args, **kwargs) profiler.print_stats() return result if __name__ == '__main__': import doctest doctest.testmod()