[pypy-dev] speed.pypy.org launched
Carl Friedrich Bolz
cfbolz at gmx.de
Fri Feb 26 13:13:15 CET 2010
On 02/26/2010 12:30 PM, Miquel Torres wrote:
> The paper is right, and the unladen swallow runner does the right thing.
> What I meant was: use the statistically right method (like we are
> doing now!), but don't show deviation bars if the deviation is
> acceptable. Check after the run whether the deviation is
> "acceptable". If it isn't, rerun later, check that nothing in the
> background is affecting performance, reevaluate the reproducibility
> of the given benchmark, etc.
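The check proposed above could be sketched roughly like this. The 5% cutoff and the `summarize` helper are purely illustrative assumptions, not anything speed.pypy.org actually implements:

```python
import statistics

# Hypothetical threshold: treat a run as "acceptable" when the sample
# standard deviation is at most 5% of the mean timing.
ACCEPTABLE_REL_STDDEV = 0.05

def summarize(timings):
    """Return (mean, stddev, acceptable) for a list of benchmark timings."""
    mean = statistics.mean(timings)
    stddev = statistics.stdev(timings)  # sample standard deviation
    acceptable = stddev / mean <= ACCEPTABLE_REL_STDDEV
    return mean, stddev, acceptable

# Tight timings: deviation well under the threshold, so the
# error bars could be hidden.
print(summarize([1.00, 1.01, 0.99, 1.00]))

# Noisy timings: deviation over the threshold, so a rerun (and a check
# for background load) would be warranted.
print(summarize([1.0, 1.4, 0.7, 1.2]))
```

A real runner would of course store the raw per-run timings as well, so that the deviation data stays available for later analysis, as suggested further down in the thread.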
I think an important point is that the deviations don't need to come
from anything in the background. We don't use threads in the benchmarks
(yet), which would obviously introduce non-determinism, but even now
there is enough randomness in the interpreter itself: the GC can start
at different points, the JIT can decide (late in the process) that
something else should be compiled, there are cache effects, etc. This
randomness is not a bad thing, but I think we should at least try to
quantify it by showing the error bars. We should do that even if the
errors are small, because that is a good result worth mentioning.
I guess around 20 or even 10 years ago you could attribute a "correct"
running time to a program, but nowadays there is noise on all levels of
the system and it is not really possible to ignore it. There are also
a lot more levels now :-).
> But it doesn't change the fact that speed could save the deviation
> data for later use.