[Python-Dev] s/hotshot/lsprof

Mon Nov 21 12:14:30 CET 2005

Hi Tim,

On Sun, Nov 20, 2005 at 08:55:49PM -0500, Tim Peters wrote:
> We should note that hotshot didn't intend to reduce total time
> overhead.  What it's aiming at here is to be less disruptive (than
> profile.py) to the code being profiled _while_ that code is running. 

> hotshot tries to stick with tiny little C functions that pack away a
> tiny amount of data each time, and avoid memory alloc/dealloc, to try
> to minimize this disruption.  It looked like it was making real
> progress on this at one time ;-)

I see the point.  Whether hotshot is really nicer on the D cache is
open to discussion, however: it produces a constant stream of data,
whereas classical profilers like lsprof would, in the common case, only
update a few counters in existing data structures.  I can tweak lsprof a
bit more, though -- there is a malloc on each call, but it could be
avoided.
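
To make the contrast concrete, here is a minimal sketch of the two
recording styles, built on sys.setprofile.  It is not the code of
hotshot or lsprof; fib, run_with_hook and both hooks are names invented
for the illustration.

```python
import sys
from collections import Counter

def fib(n):
    """Small recursive workload that makes many Python-level calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def run_with_hook(hook, func, *args):
    """Run func(*args) with hook installed as the profile function."""
    sys.setprofile(hook)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)

# Style 1 (stream of data): append one record per call event.
# The log keeps growing for as long as the profiled code runs.
event_log = []

def logging_hook(frame, event, arg):
    if event == "call":
        event_log.append(frame.f_code.co_name)

# Style 2 (counters): update an entry in an existing table.
# The table only grows when a new function is seen.
call_counts = Counter()

def counting_hook(frame, event, arg):
    if event == "call":
        call_counts[frame.f_code.co_name] += 1

run_with_hook(logging_hook, fib, 10)
run_with_hook(counting_hook, fib, 10)

print("log entries:", len(event_log))
print("table entries:", len(call_counts))
```

Both hooks see the same calls; the difference is only in how much
memory they touch per event, which is the point under discussion.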

Still, people generally agree that profile.py, while taking a longer
time overall, gives more meaningful results than hotshot.  Now Brett's
student, Floris, extended hotshot to allow custom timers.  This is
essential, because it enables testing.  The timing parts of hotshot were
not tested previously.
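
As an illustration of why a custom timer enables testing, the sketch
below uses profile.py (which already accepts a timer argument) rather
than hotshot.  FakeClock, inner and outer are hypothetical names; the
clock only advances when the profiled code says so, so the reported
numbers can be checked exactly.

```python
import profile

class FakeClock:
    """Hypothetical deterministic clock: time moves only when told to."""
    def __init__(self):
        self.now = 0

clock = FakeClock()

def inner():
    clock.now += 3          # pretend inner() takes 3 ticks

def outer():
    clock.now += 5          # 5 ticks before the call
    inner()
    clock.now += 2          # 2 ticks after the call

# profile.Profile takes a custom timer; here it just reads the fake clock.
profiler = profile.Profile(timer=lambda: clock.now)
profiler.runcall(outer)
profiler.create_stats()

# Stats are keyed by (filename, lineno, name); each value is
# (primitive calls, total calls, self time, cumulative time, callers).
by_name = {key[2]: value for key, value in profiler.stats.items()}
outer_self, outer_cum = by_name["outer"][2:4]
inner_self, inner_cum = by_name["inner"][2:4]
print(outer_self, outer_cum, inner_self, inner_cum)
```

With such a clock, a test can compare the profiler's output with the
tick counts written in the source (7 own ticks and 10 cumulative ticks
for outer, 3 for inner) instead of with wall-clock noise.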

Given the high correlation between untestedness and brokenness, you bet
that Floris' adapted test_profile for hotshot gives wrong numbers.  (My
guess is that Floris overlooked that test_profile was an output test, so
he didn't compare the resulting numbers with the expected ones.)
Looking at the errors in the numbers pointed us immediately to the bug
in the C code.  Some time intervals are lost: the ones before an
exception is raised or a C function is called or returns.  That's a lot
of them.  The current hotshot is hence not so much a profiler as "a
reflection on the meaning of time" (quoting Samuele).
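
A toy model of this kind of loss, under stated assumptions: the trace
below is invented, self_times is a hypothetical helper, and none of it
is hotshot's actual C code.  Each interval is charged to the function
on top of the stack, unless the event that ends the interval is of a
kind the profiler fails to account for.

```python
from collections import defaultdict

ALL_EVENTS = {"call", "return", "c_call", "c_return", "exception"}

def self_times(events, charged):
    """Charge the interval before each event to the running function.

    Intervals that end at an event kind not in `charged` are dropped,
    which models the lost intervals described above.
    """
    totals = defaultdict(int)
    stack = []
    last = None
    for timestamp, kind, name in events:
        if stack and kind in charged:
            totals[stack[-1]] += timestamp - last
        last = timestamp                 # the clock restarts either way
        if kind in ("call", "c_call"):
            stack.append(name)
        elif kind in ("return", "c_return"):
            stack.pop()
    return dict(totals)

# Invented trace: f runs 4 ticks, calls the C function len (1 tick),
# then runs 4 more ticks before returning.
trace = [
    (0, "call", "f"),
    (4, "c_call", "len"),
    (5, "c_return", "len"),
    (9, "return", "f"),
]

complete = self_times(trace, ALL_EVENTS)
lossy = self_times(trace, {"call", "return"})
print("complete:", complete)
print("lossy:   ", lossy)
```

In this model the lossy variant accounts for only 4 of the 9 ticks: the
interval before the C call and the time spent inside it both vanish.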

> Ya, hotshot isn't finished.  It had corporate support for its initial
> development, but lost that, and became an orphan then.

I will check in the bug fix for hotshot, but the question is what the
point is.  I would argue that lsprof, even with children call stats, is
much simpler than hotshot.  Lines of code also reflect that (a factor of
2).  Obviously hotshot can do many more (undocumented, unmaintained)
things besides profiling if you get the correct tools.  This argues for
lsprof as the maintained, stdlib-integrated profiler useful to ordinary
users, and for hotshot being distributed together with the tools that
can use its full potential.

See you soon,

Armin.