[pypy-dev] Change to the frontpage of speed.pypy.org
tobami at googlemail.com
Tue Mar 8 18:17:17 CET 2011
you mean this timeline, right?:
Because the December 22 result is so high, the y-axis maximum goes up
to 2.5, leaving less space for the more interesting < 1 range.
Regarding mozilla, do you mean this site?: http://arewefastyet.com/
I can see their timelines have some holes, probably failed runs...
I see a problem with the approach you suggest. Entering an arbitrary
maximum y-axis number is not a good thing. I think the onus is on
the benchmark infrastructure not to send results that aren't
statistically significant. See JavaStats
(http://www.elis.ugent.be/en/JavaStats), or ReBench
Something that can be done on the Codespeed side is to treat points
whose stddev is too high differently. In the aforementioned
spectral-norm timeline, the stddev "floor" is around 0.0050, while the
spike has a stddev of 0.30, much higher. A "strict" mode could be
implemented that invalidates or hides statistically unsound data.
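
To make the idea concrete, here is a rough sketch of what such a
"strict" mode might do. This is not Codespeed code: the function names,
the dict layout of a point, and the factor of 10 are all assumptions
made up for illustration. It takes the median stddev of a timeline as
the "floor" and marks any point whose stddev is far above it as unsound.

```python
from statistics import median


def flag_unsound(points, factor=10.0):
    """Pair each point with a flag saying whether it looks sound.

    `points` is a list of dicts with "value" and "stddev" keys
    (a hypothetical layout, not Codespeed's data model).
    The median stddev is used as the "floor"; a point is unsound
    when its stddev exceeds `factor` times that floor.
    `factor` is an arbitrary illustrative threshold.
    """
    floor = median(p["stddev"] for p in points)
    limit = floor * factor
    return [(p, p["stddev"] <= limit) for p in points]


def strict_yaxis_max(points, factor=10.0):
    """Y-axis maximum computed from the sound points only."""
    sound = [p["value"] for p, ok in flag_unsound(points, factor) if ok]
    return max(sound) if sound else None


# Made-up numbers shaped like the spectral-norm case described above.
timeline = [
    {"value": 0.80, "stddev": 0.005},
    {"value": 0.85, "stddev": 0.005},
    {"value": 2.50, "stddev": 0.30},   # the spike
    {"value": 0.90, "stddev": 0.005},
    {"value": 0.82, "stddev": 0.005},
]
```

With these numbers the spike would be flagged and the axis would be
scaled from the remaining points, so nobody has to type in a maximum by
hand. Whether the median is the right "floor", and what factor to use,
would need to be decided against real data.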
Btw., I had written to the arewefastyet guys about the possibility of
configuring a Codespeed instance for them. We may yet see
collaboration there ;-)
2011/3/8 Maciej Fijalkowski <fijall at gmail.com>:
> On Tue, Mar 8, 2011 at 8:14 AM, Laura Creighton <lac at openend.se> wrote:
>> In a message of Tue, 08 Mar 2011 09:10:32 +0100, Miquel Torres writes:
>>>I finished the changes to the speed.pypy.org home page last night, but
>>>alas!, I didn't have time to deploy. I will do it later today and will
>>>then ping you back.
>>>The extra info provided is really nice as an overview, you will see ;-)
>> Ah good. Thank you very much. We spent yesterday afternoon with
>> the Mozilla engineers, and I got to talk to the person who maintains
>> the benchmarks for tracemonkey. He had timelines very much like ours.
>> There is one feature he has that I would like to have. Take a look
>> at the timeline for spectral.norm. There are two spikes there.
>> Mozilla has lines like that too, though mostly it is because their
>> jit decides that the whole benchmark is bogus and optimises out all the
>> code. So it takes 0 time. oops.
>> At any rate, aside from knowing that something went horribly wrong with
>> that rev, you don't really need to know how wrong. And making the
>> graph display up to that point means that the dots where things really
>> do matter get crammed closer together than would otherwise be the case.
>> So he had a mode where things were displayed with an arbitrary value
>> at the bottom (in our case it would be the top) which he could specify.
>> Then the graph would be replotted, with the outliers off the graph, but
>> making it easier to read the dots for the more normal cases.
>> Any chance we could do that too?
> Link maybe?