Andrew Friedley wrote:
David Cournapeau wrote:
Francesc Alted wrote:
No, that seems good enough. But maybe you can present results in cycles/item. This is a relatively common unit and has the advantage that it does not depend on the frequency of your cores.
Sure, cycles is fine, but I'll argue that in this case the number still depends on the frequency of the cores, particularly as it relates to the frequency of the memory bus/controllers. A processor with a higher clock rate and higher multiplier may show lower performance when measured in cycles, because the memory bandwidth has not necessarily increased, only the CPU clock rate. Plus, between say a Xeon and an Opteron you will have different SSE performance characteristics. So really, no single number/unit is sufficient without also describing the system it was obtained on :)
Yes, that's why people usually give the CPU type along with the cycles/operation count :) It makes comparison easier. Sure, the comparison is not exact, because differences between CPUs matter. But with cycles/computation, we could see right away that something was strange with the numpy timing, so I think it is a better representation for discussion/comparison.
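For what it's worth, the conversion itself is just arithmetic. A minimal sketch in Python, where the 3.0 GHz clock, the array size, and the measured time are made-up illustrative values:

    # Assumed values for illustration only.
    CPU_HZ = 3.0e9             # clock frequency of the machine measured
    N_ITEMS = 1000000          # elements processed per ufunc call
    seconds_per_call = 2.1e-3  # hypothetical measured wall time per call

    cycles_per_item = seconds_per_call * CPU_HZ / N_ITEMS
    print("%.2f cycles/item" % cycles_per_item)  # -> 6.30 cycles/item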
I can do minimum. My motivation for the average was to show the common-case performance an application might see. If that application executes the ufunc many times, its performance will tend toward the average.
The rationale for the minimum is to remove external factors like other tasks taking CPU time, etc.
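A minimal sketch of collecting both statistics with Python's timeit; the array size, the repeat counts, and the use of np.cos stand in for whatever ufunc is actually being benchmarked:

    import timeit
    import numpy as np

    x = np.random.rand(100000).astype(np.float32)

    # 10 trials of 100 calls each; divide to get per-call times.
    trials = timeit.repeat(lambda: np.cos(x), number=100, repeat=10)
    per_call = [t / 100 for t in trials]

    # min filters out interference (scheduler, other processes);
    # mean reflects the common case a long-running application sees.
    print("min:  %.3g s/call" % min(per_call))
    print("mean: %.3g s/call" % (sum(per_call) / len(per_call)))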
I was waiting for someone to bring this up :) I used an implementation that I'm now thinking is not accurate enough for scientific use. But the question is, what is a concrete measure for determining whether some cosine (or other function) implementation is accurate enough?
NaN/inf/zero handling should be tested for every function (the exact behavior for the standard functions is part of the C standard), and then the particular values depend on the function and implementation. If your implementation has several code paths, each code path should be tested. But really, most implementations just test a few more or less random known values. I know the GNU libc has some tests for its math library, for example. For single precision, brute-force testing against a reference implementation for every possible input is actually feasible, too :)

David
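To sketch what that brute-force check might look like in NumPy: the helper names, the chunk size, and using the double-precision function rounded to float32 as the reference are all assumptions for illustration, not an established test suite. NaN/inf inputs are skipped here; per the above, they would need their own dedicated tests.

    import numpy as np

    def _ulp_key(x):
        # Map float32 bit patterns to integers such that the integer
        # difference between two values equals their distance in ULPs
        # (monotonic across the sign bit; +0 and -0 get the same key).
        b = x.view(np.uint32).astype(np.int64)
        return np.where(b < 0x80000000, b, 0x80000000 - b)

    def max_ulp_error(func32, ref64, chunk=1 << 20):
        # Enumerate all 2**32 float32 bit patterns in chunks, skip
        # non-finite inputs, and compare func32 against the float64
        # reference rounded to float32.
        worst = 0
        for start in range(0, 1 << 32, chunk):
            bits = np.arange(start, start + chunk, dtype=np.int64)
            x = bits.astype(np.uint32).view(np.float32)
            x = x[np.isfinite(x)]
            if x.size == 0:
                continue
            got = func32(x)
            want = ref64(x.astype(np.float64)).astype(np.float32)
            ulps = np.abs(_ulp_key(got) - _ulp_key(want))
            worst = max(worst, int(ulps.max()))
        return worst

    # e.g. check numpy's single-precision cos against its double one:
    # print(max_ulp_error(np.cos, np.cos))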