[Numpy-discussion] Profiling (was GSoC : Performance parity between numpy arrays and Python scalars)

Sun May 5 19:26:25 EDT 2013

On Sun, May 5, 2013 at 5:57 PM, David Cournapeau <cournape at gmail.com> wrote:
>> perf is a fabulous framework, but it doesn't have any way to get full
>> callgraph information out, so IME it's been useless. They have
>> reporting modes that claim to (like some "fractal" thing?) but as far
>> as I've been able to tell from docs/googling/mailing lists, there is
>> nobody who understands how to interpret this output except the people
>> who wrote it. Really a shame that it falls down in the last mile like
>> that; hopefully they will fix this soon.
>
> The perf docs are written for Vulcans, but it does what I think you want. Say:
>
> void work(int n) {
>   volatile int i=0; //don't optimize away
>   while(i++ < n);
> }
> void easy() { work(1000 * 1000 * 50); }
> void hard() { work(1000*1000*1000); }
> int main() { easy(); hard(); }
>
> compile with gcc -g -O0, and then:
>
> perf record -g -a -- ./a.out
> perf report -g -a --stdio
>
> gives me
>
>   95.22%            a.out  a.out  [.] work
>                       |
>                       --- work
>                          |
>                          |--89.84%-- hard
>                          |          main
>                          |          __libc_start_main
>                          |
>                           --5.38%-- easy
>                                     main
>                                     __libc_start_main
>
>
> or maybe even better with the -G option
>
>  95.22%            a.out  a.out  [.] work
>                       |
>                       --- __libc_start_main
>                           main
>                          |
>                          |--94.35%-- hard
>                          |          work
>                          |
>                           --5.65%-- easy
>                                     work
>

Yeah, I've seen these displays before. I can see that the information is
there, and (knowing the code you ran) that the first number somehow
has to do with the time spent under 'hard' and the second with the
time spent under 'easy'. But I have no idea how to generalize this to
arbitrary samples of these output formats. That's what I meant.
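
One reading that is at least consistent with the numbers quoted above (this
is an assumption, not something taken from the perf docs): in the first
display the per-caller figures are fractions of all samples
(89.84 + 5.38 = 95.22), while in the -G display they are fractions of the
samples that landed in 'work' (94.35 + 5.65 = 100). A minimal sketch of
that arithmetic, where the stack-tuple format, the sample counts and the
caller_breakdown() helper are all hypothetical and have nothing to do with
perf's real data format:

```python
from collections import Counter


def caller_breakdown(stacks, symbol):
    """Split the samples whose leaf frame is `symbol` by immediate caller.

    `stacks` is a list of tuples of function names, root first and leaf
    last (a made-up format for illustration only).
    Returns (self_pct, absolute, relative):
      self_pct -- percentage of all samples whose leaf is `symbol`
      absolute -- per-caller percentage of all samples
      relative -- per-caller percentage of `symbol`'s own samples
    """
    total = len(stacks)
    if total == 0:
        return 0.0, {}, {}
    hits = [s for s in stacks if s and s[-1] == symbol]
    # The immediate caller is the frame just above the leaf.
    by_caller = Counter(s[-2] if len(s) > 1 else "<root>" for s in hits)
    self_pct = 100.0 * len(hits) / total
    absolute = {c: 100.0 * n / total for c, n in by_caller.items()}
    relative = {c: 100.0 * n / len(hits) for c, n in by_caller.items()}
    return self_pct, absolute, relative


# Invented sample counts, chosen only so the ratios match the quoted output.
samples = (
    [("__libc_start_main", "main", "hard", "work")] * 8984
    + [("__libc_start_main", "main", "easy", "work")] * 538
    + [("__libc_start_main", "main")] * 478
)

self_pct, absolute, relative = caller_breakdown(samples, "work")
print("self: %.2f%%" % self_pct)
for caller in ("hard", "easy"):
    print("%s: %.2f%% of all samples, %.2f%% of work's samples"
          % (caller, absolute[caller], relative[caller]))
```

Whether perf really computes its columns this way for deeper or branching
call chains is exactly the part that remains unclear; the sketch only
covers the one-level caller split shown in the example.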

-n