
Anders Hammarquist wrote:
> Hi all,

Hi Iko, hi all

[cut]

> So benchmarks will obviously have to specify what interpreter(s) they
> should be run by somehow.
I think this is a vital requirement. I can imagine various scenarios where you want to specify which interpreters to benchmark, e.g.:

1) benchmarking pypy-cli vs IronPython, or pypy-jvm vs Jython
2) benchmarking pypy-cs at different svn revisions
3) benchmarking pypy-c-trunk vs pypy-c-some-branch (maybe with the possibility of specifying pypy-c-trunk-at-the-revision-where-the-branch-was-created, to avoid noise)
4) benchmarking pypy-cs built with different build options
5) benchmarking with profiling enabled (I'd say that profiling should be off by default)
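To make these scenarios concrete, here is a minimal sketch of what a specification of the interpreters to benchmark could look like. All the field names, options and paths below are invented for illustration; this is not a proposal for the actual format:

    # Purely illustrative: one entry per interpreter we might want to
    # benchmark, recording the kind of information the framework needs.
    INTERPRETERS = [
        {"name": "pypy-c-trunk",       "revision": "HEAD", "build_opts": []},
        {"name": "pypy-c-some-branch", "revision": "HEAD", "build_opts": []},
        {"name": "pypy-cli",           "revision": "HEAD", "build_opts": ["--backend=cli"]},
        {"name": "pypy-c-profiled",    "revision": "HEAD", "build_opts": [], "profiling": True},
        {"name": "python2.5",          "executable": "/usr/bin/python2.5"},
    ]

    def select(names):
        """Pick the subset of interpreters a particular run should use."""
        return [spec for spec in INTERPRETERS if spec["name"] in names]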
> The bigger question is how to get those interpreters. Should running
> the benchmarks also trigger building one (or more) pypy interpreters
> according to specs in the benchmarking framework? (but then if you only
> want it to run one benchmark, you may have to wait for all the
> interpreters to build) Perhaps each benchmark should build its own
> interpreter (though this seems slow, given that most benchmarks can
> probably run on an identically built interpreter).
>
> Or maybe the installed infrastructure should only care about history,
> and if you want to run a single benchmark, you do that on your own.
Conceptually, I would say that you need to rebuild the required pypys every time you run the benchmarks. Concretely, we can put them into a cache, so that if you need a pypy-c that has already been built for some reason, you just reuse it. Moreover, it could be nice to be able to select the pypy to benchmark from a list of already built pypys, if you want to save time.

Also, we may need to think about how to deal with excessive load: if every one of us tries to run his own set of benchmarks, the benchmarking machine could become too overloaded to be useful in any sense.

ciao,
Anto
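P.S. Just to make the caching idea concrete, here is a very rough sketch of how already-built pypys could be reused. Everything here is invented for illustration (the cache directory, the build_pypy callback, the key scheme); it is not a description of any existing tooling:

    import os
    import hashlib

    CACHE_DIR = os.path.expanduser("~/pypy-build-cache")

    def cache_key(revision, build_opts):
        # identify a build by its svn revision plus its build options
        raw = "%s:%s" % (revision, " ".join(sorted(build_opts)))
        return hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def get_pypy(revision, build_opts, build_pypy):
        # return the path to a pypy-c, rebuilding only on a cache miss
        target = os.path.join(CACHE_DIR, cache_key(revision, build_opts))
        if not os.path.isdir(target):
            # cache miss: translate a fresh pypy-c into the cache
            build_pypy(revision, build_opts, target)
        return os.path.join(target, "pypy-c")

The same cache would also help a bit with the load problem, since at least the (long) translations would not be repeated for every run.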