Anders Hammarquist wrote:
> Hi all,

Hi Iko, hi all

> So benchmarks
> will obviously have to specify what interpreter(s) they should be run
> by somehow.

I think this is a vital requirement. I can imagine various scenarios where you 
want to specify which interpreters to benchmark, e.g.:

1) benchmarking pypy-cli vs IronPython or pypy-jvm vs Jython

2) benchmarking pypy-cs at different svn revisions

3) benchmarking pypy-c-trunk vs pypy-c-some-branch (maybe with the possibility 
of specifying pypy-c-trunk-at-the-revision-where-the-branch-was-created, to 
avoid noise)

4) benchmarking pypy-cs with different build options

5) benchmarking with profiling enabled (I'd say that profiling should be off 
by default)
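
As a rough sketch of how a benchmark could declare the interpreters it wants 
(every name below is invented for illustration, nothing like this exists in 
the tree, and the fields simply mirror the scenarios above):

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InterpreterSpec:
    """Hypothetical description of one interpreter to benchmark."""
    name: str                              # e.g. "pypy-c", "pypy-cli", "IronPython"
    branch: str = "trunk"                  # scenario 3: trunk vs. some branch
    revision: Optional[int] = None         # scenario 2: None means "latest"
    build_options: Tuple[str, ...] = ()    # scenario 4: different build options
    profiling: bool = False                # scenario 5: off by default


@dataclass(frozen=True)
class Benchmark:
    """Hypothetical benchmark record listing the interpreters to run it on."""
    name: str
    interpreters: Tuple[InterpreterSpec, ...]


# Scenario 3: the same benchmark on trunk and on a branch.
# "example" and "some-branch" are placeholders, not real names.
trunk_vs_branch = Benchmark(
    name="example",
    interpreters=(
        InterpreterSpec("pypy-c"),
        InterpreterSpec("pypy-c", branch="some-branch"),
    ),
)
```

Making the spec immutable and hashable means it could also serve as a key 
when deciding whether two benchmarks can share the same build.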

> The bigger question is how to get those interpreters. Should running
> the benchmarks also trigger building one (or more) pypy interpreters
> according to specs in the benchmarking framework? (but then if you
> only want it to run one benchmark, you may have to wait for all the
> interpreters to build) Perhaps each benchmark should build its own
> interpreter (though this seems slow, given that most benchmarks
> can probably run on an identically built interpreter).
> Or maybe the installed infrastructure should only care about history,
> and if you want to run a single benchmark, you do that on your own.

Conceptually, I would say that you need to rebuild the required pypys every 
time you run the benchmarks.  Concretely, we can think of putting them into a 
cache, so that if you need a pypy-c that for some reason has already been 
built, you just reuse it.  Moreover, it would be nice if you could select the 
pypy to benchmark from a list of already built pypys, if you want to save time.
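
To make the cache idea a bit more concrete, here is a minimal sketch.  The 
class, the key layout and the build callback are all assumptions of mine, 
not existing code; the real build step would be whatever translates pypy.

```python
class BuildCache:
    """Hypothetical cache of already built interpreters.

    `build` is a callable that takes a key and returns something that
    identifies the built interpreter (for instance a path to the binary).
    A key is assumed to be a hashable tuple such as
    (name, branch, revision, build_options).
    """

    def __init__(self, build):
        self._build = build
        self._built = {}

    def get(self, key):
        # Build only on a cache miss; otherwise reuse the earlier result.
        if key not in self._built:
            self._built[key] = self._build(key)
        return self._built[key]

    def available(self):
        # Keys of the interpreters that are already built, so that a user
        # can pick one from the list instead of waiting for a new build.
        return list(self._built)
```

The cache only pays off if the key captures everything that affects the 
resulting binary, so the revision and the build options both belong in it.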

Also, we may need to think about how to deal with excessive load: if every one 
of us tries to run their own set of benchmarks, the benchmarking machine could 
become too overloaded to give useful results.
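
One simple way to avoid that would be to allow only one benchmark run at a 
time on the machine.  The following is just a sketch using a lock file; the 
function name and the lock path are made up, and a real setup might prefer a 
proper job queue.

```python
import os
from contextlib import contextmanager


@contextmanager
def exclusive_run(lock_path):
    """Hypothetical guard: hold a lock file for the duration of a run."""
    try:
        # O_EXCL makes the creation fail if the file already exists,
        # i.e. if somebody else is currently running benchmarks.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("benchmark machine is busy: %s exists" % lock_path)
    try:
        # Record who holds the lock, to help with debugging.
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A run would then be wrapped in `with exclusive_run(path):`.  Note that if 
the process is killed the lock file stays behind and has to be removed by 
hand, which is one reason a queue may be the better long-term answer.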


