[pypy-dev] Change to the frontpage of speed.pypy.org

Miquel Torres tobami at googlemail.com
Tue Mar 8 18:17:17 CET 2011

you mean this timeline, right?:

Because the December 22 result is so high, the y-axis maximum goes up
to 2.5, leaving less space for the more interesting < 1 range.

Regarding Mozilla, do you mean this site?: http://arewefastyet.com/
I can see their timelines have some holes, probably failed runs...

I see a problem with the approach you suggest. Entering an arbitrary
maximum y-axis value is not a good thing. I think the onus there is on
the benchmark infrastructure not to send results that aren't
statistically significant. See JavaStats
(http://www.elis.ugent.be/en/JavaStats) or ReBench.
Something that can be done on the Codespeed side is to treat points
with too high a stddev differently. In the aforementioned
spectral-norm timeline, the stddev "floor" is around 0.0050, while the
spike has a stddev of 0.30, much higher. A "strict" mode could be
implemented that invalidates or hides statistically unsound data.
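
A minimal sketch of how such a "strict" mode might pick out unsound
points, assuming each timeline point is a (value, stddev) pair. The
function name split_by_stddev, the use of the median stddev as the
"floor" and the factor of 10 are all assumptions for illustration; none
of this is existing Codespeed code.

```python
from statistics import median


def split_by_stddev(points, max_ratio=10.0):
    """Split timeline points into (sound, unsound) lists.

    points: list of (value, stddev) tuples for one benchmark.
    max_ratio: hypothetical cutoff; a point is unsound when its stddev
        exceeds max_ratio times the median stddev of the timeline.
    """
    # The median stddev serves as the "floor" and is robust to a few spikes.
    floor = median(sd for _, sd in points)
    sound, unsound = [], []
    for value, sd in points:
        if floor > 0 and sd > max_ratio * floor:
            unsound.append((value, sd))
        else:
            sound.append((value, sd))
    return sound, unsound


# The stddevs mirror the ones quoted above; the values are made up.
timeline = [(0.60, 0.0050), (0.61, 0.0051), (2.30, 0.30), (0.59, 0.0049)]
sound, unsound = split_by_stddev(timeline)
```

With the spike moved to the unsound list, the y-axis maximum can be
computed from the sound points alone, without anyone having to enter an
arbitrary limit by hand.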

Btw., I had written to the arewefastyet guys about the possibility of
configuring a Codespeed instance for them. We may yet see
collaboration there ;-)


2011/3/8 Maciej Fijalkowski <fijall at gmail.com>:
> On Tue, Mar 8, 2011 at 8:14 AM, Laura Creighton <lac at openend.se> wrote:
>> In a message of Tue, 08 Mar 2011 09:10:32 +0100, Miquel Torres writes:
>>>I finished the changes to the speed.pypy.org home page last night, but
>>>alas!, I didn't have time to deploy. I will do it later today and will
>>>then ping you back.
>>>The extra info provided is really nice as an overview, you will see ;-)
>> Ah good.  Thank you very much.  We spent yesterday afternoon with
>> the Mozilla engineers, and I got to talk to the person who maintains
>> the benchmarks for tracemonkey.  He had timelines very much like ours.
>> There is one feature he has that I would like to have.  Take a look
>> at the timeline for spectral.norm.  There are two spikes there.
>> Mozilla has lines like that too, though mostly it is because their
>> jit decides that the whole benchmark is bogus and optimises out all the
>> code.  So it takes 0 time.  oops.
>> At any rate, aside from knowing that something went horribly wrong with
>> that rev, you don't really need to know how wrong.  And making the
>> graph display up to that point means that the dots where things really
>> do matter get crammed closer together than would otherwise be the case.
>> So he had a mode where things were displayed with an arbitrary value
>> at the bottom (in our case it would be the top) which he could specify.
>> Then the graph would be replotted, with the outliers off the graph, but
>> making it easier to read the dots for the more normal cases.
>> Any chance we could do that too?
> Link maybe?
>> Laura
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
