[pypy-dev] Change to the frontpage of speed.pypy.org

holger krekel holger at merlinux.eu
Tue Mar 8 20:27:31 CET 2011


On Tue, Mar 08, 2011 at 20:20 +0100, Miquel Torres wrote:
> Ok, I just committed the changes.
> 
> They address two general cases:
> - You want to know how fast PyPy is *now* compared to CPython in
> different benchmark scenarios, or tasks.
> - You want to know how PyPy has been *improving* overall over the last releases.
> 
> That is now answered on the front page, and the reports are now much
> less prominent (I didn't change the logic because it is something I
> want to do properly, not just as a hack for speed.pypy).
> - I have not yet addressed the "smaller is better" point.

Great work, Miquel.  I like it!

> I am aware that the wording of the "faster on average" needs to be
> improved (I am discussing it with Holger even now ;). Please chime in
> so that we can have a good paragraph that is informative and short
> enough while at the same time not being misleading.

Maybe something that reads like this::

    On average, PyPy trunk runs the benchmarks 3.3 times faster than CPython.
    The average is computed as the geometric mean of all benchmark timings.

The idea is to avoid the notion that "PyPy is 3.3 times faster than CPython"
at this point, because the benchmarks are not yet a balanced selection of
real-life use cases.
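
For reference, a minimal sketch of how such an average could be computed,
assuming per-benchmark timings are available for both interpreters. The
function names, benchmark names and numbers are made up for illustration;
this is not Codespeed's actual code and not speed.pypy.org data.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    # Average the logarithms, then exponentiate; this avoids building
    # one very large (or very small) product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def average_speedup(baseline_times, candidate_times):
    """Geometric mean of baseline/candidate time ratios over shared benchmarks."""
    ratios = [baseline_times[name] / candidate_times[name]
              for name in baseline_times if name in candidate_times]
    return geometric_mean(ratios)


# Hypothetical timings in seconds (illustrative only, not real measurements).
cpython = {"bench_a": 2.0, "bench_b": 8.0, "bench_c": 1.0}
pypy = {"bench_a": 1.0, "bench_b": 1.0, "bench_c": 1.0}

speedup = average_speedup(cpython, pypy)
print("%.2f times faster on average" % speedup)
```

With a geometric mean, a single benchmark with a huge ratio pulls the
average up much less than it would with an arithmetic mean, which is why it
is the usual choice for averaging normalized timings.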

best,
holger

> Miquel
> 
> 
> 2011/3/8 Miquel Torres <tobami at googlemail.com>:
> > you mean this timeline, right?:
> > http://speed.pypy.org/timeline/?ben=spectral-norm
> >
> > Because the December 22 result is so high, the y-axis maximum goes up
> > to 2.5, thus leaving less space for the more interesting < 1 range,
> > right?
> >
> > Regarding mozilla, do you mean this site?: http://arewefastyet.com/
> > I can see their timelines have some holes, probably failed runs...
> >
> > I see a problem with the approach you suggest. Entering an arbitrary
> > maximum y-axis number is not a good thing. I think the onus is on
> > the benchmark infrastructure to not send results that aren't
> > statistically significant. See JavaStats
> > (http://www.elis.ugent.be/en/JavaStats), or ReBench
> > (https://github.com/smarr/ReBench).
> >
> > Something that can be done on the Codespeed side is to treat points
> > with too high a stddev differently. In the aforementioned
> > spectral-norm timeline, the stddev "floor" is around 0.0050, while the
> > spike has a stddev of 0.30, much higher. A "strict" mode could be
> > implemented that invalidates or hides statistically unsound data.
> >
> > Btw., I had written to the arewefastyet guys about the possibility of
> > configuring a Codespeed instance for them. We may yet see
> > collaboration there ;-)
> >
> > Miquel
> >
> >
> > 2011/3/8 Maciej Fijalkowski <fijall at gmail.com>:
> >> On Tue, Mar 8, 2011 at 8:14 AM, Laura Creighton <lac at openend.se> wrote:
> >>> In a message of Tue, 08 Mar 2011 09:10:32 +0100, Miquel Torres writes:
> >>>>Hi,
> >>>>
> >>>>I finished the changes to the speed.pypy.org home page last night, but
> >>>>alas!, I didn't have time to deploy. I will do it later today and will
> >>>>then ping you back.
> >>>>
> >>>>The extra info provided is really nice as an overview, you will see ;-)
> >>>>
> >>>>
> >>>
> >>> Ah good.  Thank you very much.  We spent yesterday afternoon with
> >>> the Mozilla engineers, and I got to talk to the person who maintains
> >>> the benchmarks for tracemonkey.  He had timelines very much like ours.
> >>> There is one feature he has that I would like to have.  Take a look
> >>> at the timeline for spectral-norm.  There are two spikes there.
> >>> Mozilla has lines like that too, though mostly it is because their
> >>> JIT decides that the whole benchmark is bogus and optimises out all the
> >>> code.  So it takes 0 time.  Oops.
> >>>
> >>> At any rate, aside from knowing that something went horribly wrong with
> >>> that rev, you don't really need to know how wrong.  And making the
> >>> graph display up to that point means that the dots where things really
> >>> do matter get crammed closer together than would otherwise be the case.
> >>> So he had a mode where things were displayed with an arbitrary value
> >>> at the bottom (in our case it would be the top) which he could specify.
> >>> Then the graph would be replotted, with the outliers off the graph, but
> >>> making it easier to read the dots for the more normal cases.
> >>>
> >>> Any chance we could do that too?
> >>
> >> Link maybe?
> >>
> >>>
> >>> Laura
> >>> _______________________________________________
> >>> pypy-dev at codespeak.net
> >>> http://codespeak.net/mailman/listinfo/pypy-dev
> >>>
> >>
> >

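The two ideas from the quoted thread above, flagging points whose stddev is
far above the usual "floor" and keeping such outliers from stretching the
y-axis, could be sketched roughly as follows. The function names, the
median-based rule and the factor of 10 are assumptions made for this
sketch, not an existing Codespeed feature; the sample series is made up and
only echoes the stddev magnitudes mentioned in the thread.

```python
import statistics


def flag_noisy_points(points, factor=10.0):
    """Split (value, stddev) points into (sound, noisy) lists.

    A point counts as noisy when its stddev exceeds `factor` times the
    median stddev of the series. Both the rule and the default factor
    are assumptions for this sketch.
    """
    floor = statistics.median(std for _, std in points)
    sound = [p for p in points if p[1] <= factor * floor]
    noisy = [p for p in points if p[1] > factor * floor]
    return sound, noisy


def y_axis_max(points, margin=1.1):
    """Pick a y-axis maximum from the sound points only, with some headroom."""
    sound, _ = flag_noisy_points(points)
    return margin * max(value for value, _ in sound)


# Hypothetical series of (normalized time, stddev) pairs.
series = [(0.80, 0.005), (0.78, 0.005), (2.50, 0.30), (0.75, 0.006)]

sound, noisy = flag_noisy_points(series)
ymax = y_axis_max(series)
print("hidden points:", noisy)
print("y-axis maximum:", round(ymax, 2))
```

Deriving the axis limit from the statistically sound points avoids entering
an arbitrary maximum by hand, while the flagged points can still be listed
separately so that a bad revision does not go unnoticed.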