Maciej Fijalkowski, 01.10.2012 08:47:
On Mon, Oct 1, 2012 at 2:04 AM, Antoine Pitrou wrote:
On Sun, 30 Sep 2012 19:58:02 -0400 Brett Cannon wrote:
On Sun, Sep 30, 2012 at 7:21 PM, Antoine Pitrou wrote:
The hexiom benchmark is very slow. Is there a reason it's included there?
Already been asked and answered: http://mail.python.org/pipermail/speed/2012-September/000209.html
I didn't realize when reading this discussion that hexiom2 was *that* slow. I don't think a benchmark taking 100+ seconds to run *in fast mode* has a place in the benchmark suite. PyPy can maintain their own benchmarks in their source tree, like CPython does.
I strongly disagree. There are quite a few slow benchmarks that are very useful, like the PyPy translation toolchain. How about you skip this one in fast mode?
I think the basic question here is: what good is a benchmark that takes ages to run?
Either it mostly benchmarks a specific part of the overall runtime, in which case there's no reason for it to take ages. It can just be stripped down to run a reasonably sized workload that exercises the target code well.
If it does not benchmark anything specific, and thus cannot be sized down in any way, then what's the interesting thing that its broad and unspecific result would tell us?
I think it's either a matter of investing some time into re-tailoring the workload of that benchmark or, if that's not feasible, considering whether dropping it isn't a better solution.
Stefan
Hi Stefan,
On Mon, Oct 1, 2012 at 9:14 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Either it mostly benchmarks a specific part of the overall runtime, in which case there's no reason for it to take ages. It can just be stripped down to run a reasonably sized workload that exercises the target code well.
If it does not benchmark anything specific, and thus cannot be sized down in any way, then what's the interesting thing that its broad and unspecific result would tell us?
Broad and unspecific results are what is most interesting from the point of view of the end user using a large program.
For example, changes to the cycle collector of CPython, as occurred between 2.6 and 2.7, had good results on large and complicated programs but I guess not on anything typically benchmarkish.
As a result, at least one "large and complicated" program runs quite a lot faster on CPython. This large program is included in our benchmark suite. Feel free to throw it away on first principles, but we would disagree with your approach.
See you soon,
Armin.
participants (2): Armin Rigo, Stefan Behnel