
Hi all,

there have been pretty big changes under the hood (they have been implemented for some time now, but I didn't have time to migrate the site) in order to accommodate a couple of new features. One of them doesn't affect pypy for the moment: allowing revisions to actually be strings (which permits a change to git, for example), and consequently basing revision ordering on date rather than on rev number.

Changes you may notice are geared towards easing the identification of the cause of performance changes:

- Std deviation was added to the DB, overview table and timeline tooltips. This way we can rule out fluctuations in the measuring as a cause for a big change. It has to be pointed out, though, that this is only as useful as the way the std dev is being computed. (there you go, Carl Friedrich ;-)

- SVN integration: under the overview table, the commit logs starting after the last tested revision are shown. It should help quickly identify which commits should take the blame for a regression.

I hope you find it useful! If anyone thinks of any way to improve on those two features, please say so.

Cheers,
Miquel
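For the curious, the SVN lookup is conceptually along these lines (a simplified sketch, not the actual site code; the helper name, repository URL and revision number are only examples):

import subprocess

def commits_since(repo_url, last_tested_rev):
    """Fetch raw `svn log` output for every revision after last_tested_rev."""
    rev_range = "%d:HEAD" % (last_tested_rev + 1)
    return subprocess.check_output(
        ["svn", "log", "-r", rev_range, repo_url]).decode("utf-8")

# Example call (revision number is hypothetical):
# print(commits_since("http://codespeak.net/svn/pypy/trunk", 74000))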

Hi Miquel,

On 04/20/2010 07:14 PM, Miquel Torres wrote:
Changes you may notice are geared towards easing the identification of the cause of performance changes:

- Std deviation was added to the DB, overview table and timeline tooltips. This way we can rule out fluctuations in the measuring as a cause for a big change. It has to be pointed out, though, that this is only as useful as the way the std dev is being computed. (there you go, Carl Friedrich ;-)
Yay, that's incredibly cool! Let's hope that eventually the graph library supports errors too, so we can add them graphically. Anyway, I already found some fun things about the benchmarks, so thanks a lot! (e.g. the std dev of chaos is very large, which is not really a good thing)
- SVN integration: Under the overview table, the commit logs starting after the last tested revision are shown. It should help quickly identify which commits should take the blame for a regression.
That's a very clever idea.

Cheers,
Carl Friedrich

Hi Carl, Hi Miquel.

Cool job!

On Tue, Apr 20, 2010 at 12:29 PM, Carl Friedrich Bolz <cfbolz@gmx.de> wrote:
Hi Miquel,
On 04/20/2010 07:14 PM, Miquel Torres wrote:
Changes you may notice are geared towards easing the identification of the cause of performance changes:

- Std deviation was added to the DB, overview table and timeline tooltips. This way we can rule out fluctuations in the measuring as a cause for a big change. It has to be pointed out, though, that this is only as useful as the way the std dev is being computed. (there you go, Carl Friedrich ;-)
Yay, that's incredibly cool! Let's hope that eventually the graph library supports errors too, so we can add them graphically. Anyway, I already found some fun things about the benchmarks, so thanks a lot! (e.g. the std dev of chaos is very large, which is not really a good thing)
Of course it's large, because as mentioned above the way we compute it doesn't make sense. We have an average over consecutive runs which include warmup (less and less so over the course).

Cheers,
fijal
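To illustrate with made-up numbers (not actual benchmark data): when the timed iterations include warmup, the spread mostly measures the warmup slope rather than steady-state noise.

import statistics

# Hypothetical per-iteration times: slow JIT warmup, then steady state.
times = [2.10, 1.40, 0.55, 0.51, 0.52, 0.50, 0.53, 0.51]
warmup = 2  # assumed cutoff; choosing it systematically is the hard part

print("all runs:    mean=%.2f stdev=%.2f"
      % (statistics.mean(times), statistics.stdev(times)))
print("post-warmup: mean=%.2f stdev=%.2f"
      % (statistics.mean(times[warmup:]), statistics.stdev(times[warmup:])))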

Hi Maciek,

On 04/20/2010 08:41 PM, Maciej Fijalkowski wrote:
(e.g. the std dev of chaos is very large, which is not really a good thing)
Of course it's large, because as mentioned above the way we compute it doesn't make sense. We have an average over consecutive runs which include warmup (less and less so over the course).
Hm, you're right. It seems chaos is really not doing any warmup, which is of course silly. I guess we should do something systematic about warmup at some point soon, also because it affects the blackhole-improvements Armin is working on.

Cheers,
Carl Friedrich
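One option for "systematic" (just a sketch of the general idea, not a concrete plan; names and iteration counts are made up) would be for the runner to do a fixed number of untimed warmup iterations before measuring:

import time

def run_benchmark(func, warmup=5, runs=20):
    for _ in range(warmup):   # untimed: let the JIT compile the hot loops first
        func()
    timings = []
    for _ in range(runs):     # timed, steady-state iterations only
        start = time.time()
        func()
        timings.append(time.time() - start)
    return timings

# Usage example: run_benchmark(lambda: sum(range(10 ** 6)))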

Yeah, that can be seen in the fact that chaos's std dev for pypy-c is not as large as for pypy-c-jit; in fact, it is perfectly normal. Btw, for easily spotting big std dev values, maybe they should be highlighted in red. What would a maximum reasonable value for std dev be (compared to total time)?
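A rough sketch of the rule I mean (the 10% threshold is only a guess, and the function is made up for illustration):

def stddev_too_big(total_time, std_dev, threshold=0.10):
    """Highlight in red when std dev exceeds a fraction of total time."""
    return total_time > 0 and std_dev / total_time > threshold

assert stddev_too_big(0.50, 0.08)      # 16% of total time: highlight it
assert not stddev_too_big(0.50, 0.02)  # 4%: looks fine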
2010/4/20 Maciej Fijalkowski <fijall@gmail.com>
Of course it's large, because as mentioned above the way we compute it doesn't make sense. We have an average over consecutive runs which include warmup (less and less so over the course).
Cheers,
fijal
participants (3):
- Carl Friedrich Bolz
- Maciej Fijalkowski
- Miquel Torres