On 03/10/2010 12:14 PM Miquel Torres wrote:
Hi!
I wanted to explain a couple of things about the speed website:
- New feature: the Timeline view now defaults to a plot grid, showing all benchmarks at the same time. It was a feature request made more than once, so depending on personal tastes, you can bookmark either /overview/ or /timeline/. Thanks go to nsf for helping with the implementation.
- The code has now moved to github as Codespeed, a benchmark visualization framework (http://github.com/tobami/codespeed)
- I have updated speed.pypy.org with version 0.3. Much of the work has been under the hood to make it feasible for other projects to use codespeed as a framework.
For those interested in further development you can go to the releases wiki (still a work in progress): http://wiki.github.com/tobami/codespeed/releases
Next in line are some DB changes to be able to save standard deviation data and the like. Long-term goals, besides world domination, are integration with buildbot and similarly unrealistic things. Feedback is always welcome.
Nice looking stuff. But a couple comments:
1. IMO standard deviation is too often worse than useless, since it hides the true nature of the distribution. I think the assumption of normalcy is highly suspect for benchmark timings, and pruning may hide interesting clusters. I prefer to look at scattergrams, where things like clustering and correlations are easily apparent to the eye, as well as the amount of data (assuming a good mapping of density to visuals).
2. IMO benchmark timings are like travel times, comparing different vehicles. (pypy with jit being a vehicle capable of dynamic self-modification ;-) E.g., which part of travel from Stockholm to Paris would you concentrate on improving to improve the overall result? How about travel from Brussels to Paris? Or Paris to Sydney? ;-P Different things come into play in different benchmarks/trips. A Porsche Turbo and a 2CV will both have to wait for a ferry, if that's part of the trip. IOW, it would be nice to see total time broken down somehow, to see what's really happening. Don't get me wrong, the total times are certainly useful indicators of progress (which has been amazing).
3. Speed is ds/dt and you are showing the integral of dt/ds over the trip distance to get time. A 25% improvement in total time is not a 25% improvement in speed. I.e., (if you define improvement as a percentage change in a desired direction), for e.g. 25%: distance/(0.75*time) != 1.25*(distance/time). IMO 'speed' (the implication to me in the name speed.pypy.org) would be benchmarks/time more appropriately than time/benchmark. Both measures are useful, but time percentages are easy to mis{use,construe} ;-)
4. Is there any memory footprint data?
Regards,
Bengt Richter
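
To make the arithmetic in point 3 concrete, here is a minimal Python sketch. The distance and time values are made up for illustration, and the helper names are hypothetical; nothing here is taken from Codespeed or from speed.pypy.org data.

```python
def speed_gain_from_time_cut(time_cut):
    """Fractional gain in speed (work/time) for a fractional cut in total time.

    time_cut=0.25 means the run now takes 75% of the original time.
    """
    if not 0.0 <= time_cut < 1.0:
        raise ValueError("time_cut must be in [0, 1)")
    return 1.0 / (1.0 - time_cut) - 1.0


def time_cut_from_speed_gain(speed_gain):
    """Fractional cut in total time needed for a fractional gain in speed."""
    if speed_gain < 0.0:
        raise ValueError("speed_gain must be non-negative")
    return 1.0 - 1.0 / (1.0 + speed_gain)


# Hypothetical trip: the units do not matter, only the ratios do.
distance = 100.0
old_time = 8.0
new_time = 0.75 * old_time          # a 25% improvement in total time

old_speed = distance / old_time
new_speed = distance / new_time

# distance/(0.75*time) is 1/0.75 times the old speed, i.e. roughly a
# 33% gain in speed, not 25%.
print(f"speed gain for a 25% time cut: {new_speed / old_speed - 1.0:.1%}")

# Conversely, a 25% gain in speed only needs a 20% cut in time.
print(f"time cut for a 25% speed gain: {time_cut_from_speed_gain(0.25):.1%}")
```

The two helpers are inverses of each other, which is one way to see why percentages quoted for time and for speed cannot be swapped freely.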