On Mon, Jan 30, 2012 at 13:28, Maciej Fijalkowski email@example.com wrote:
On Mon, Jan 30, 2012 at 7:56 PM, Brett Cannon firstname.lastname@example.org wrote:
On Thu, Jan 26, 2012 at 15:21, Carsten Senger email@example.com wrote:
With the help of Maciej I worked on the buildbot over the last few days. It can build CPython, run the benchmarks, and upload the results to one or more Codespeed instances. Maciej will review the changes, so we will hopefully have a working buildbot for Python 2.7 in the next few days.
This has a ticket in pypy's bugtracker: https://bugs.pypy.org/issue1015
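For reference, uploading a result to a Codespeed instance boils down to a single HTTP POST per data point. Here is a minimal sketch, not the actual buildbot code: the field names follow the Codespeed README's /result/add/ endpoint, while the project/executable/environment names are placeholder assumptions.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def make_payload(commitid, benchmark, value, *,
                 project="CPython", executable="cpython2.7",
                 environment="buildbot", branch="default"):
    """Build the form fields Codespeed's /result/add/ endpoint expects.

    Field names are per the Codespeed README; the default project,
    executable, and environment names here are made-up placeholders.
    """
    return {
        "commitid": commitid,
        "branch": branch,
        "project": project,
        "executable": executable,
        "benchmark": benchmark,
        "environment": environment,
        "result_value": value,
    }

def upload(base_url, payload):
    """POST one benchmark result to a Codespeed instance."""
    data = urlencode(payload).encode("ascii")
    return urlopen(base_url.rstrip("/") + "/result/add/", data)
```

Running this per benchmark against each configured instance's base URL would cover the "one or more codespeed instances" case.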
I also have a script we can use to run the benchmarks over part of the history and get a year or so of data into Codespeed. The question is whether this data is interesting to anyone.
I would say "don't worry about it unless you have some personal motivation to want to bother". While trending data is interesting, it isn't critical and a year will eventually pass anyway. =)
What are the plans for benchmarking Python 3? How much of the benchmark suite will work with Python 3, or can be made to work without much effort? Porting the runner and the support code is easy, but directly porting the benchmarks, including the libraries they use, seems unrealistic.
Can we replace them with newer versions that support Python 3 to get some benchmarks working? Or build a second set of Python 3 compatible benchmarks with these newer versions?
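As a rough first pass at the "how much will work" question, one could check which benchmark scripts at least compile under a Python 3 interpreter. A minimal sketch, with the caveat that compiling is only a lower bound on real compatibility (it catches py2-only syntax like print statements, but not renamed stdlib modules or missing third-party libraries), and the directory layout is an assumption:

```python
import os

def parses_under_py3(source):
    """Return True if the source compiles under the running (Python 3)
    interpreter, i.e. contains no Python-2-only syntax."""
    try:
        compile(source, "<benchmark>", "exec")
        return True
    except SyntaxError:
        return False

def triage(bench_dir):
    """Split the *.py files in bench_dir into (ok, broken) lists,
    by whether they compile under Python 3."""
    ok, broken = [], []
    for name in sorted(os.listdir(bench_dir)):
        if not name.endswith(".py"):
            continue
        with open(os.path.join(bench_dir, name)) as f:
            (ok if parses_under_py3(f.read()) else broken).append(name)
    return ok, broken

# The py2-only print statement fails to compile under Python 3:
# parses_under_py3("print 'hello'")  -> False
# parses_under_py3("print('hello')") -> True
```

Anything that passes this check still needs its dependencies ported, but the broken list gives a quick inventory of the syntax-level work.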
That's an open question. Until the libraries the benchmarks rely on get ported officially, it's up in the air when the pre-existing benchmarks can move. We might have to start by pulling in a new set and then add back the old ones (possibly) as they get ported.
Changing benchmarks is *never* a good idea. Note that we have quite a bit of history of those benchmarks running on PyPy, and I would strongly object to changing them in any way. Adding Python 3 versions next to them is much better. Porting the runner etc. is also not a very good idea, I think.
The problem really is that most of the interesting benchmarks don't work on Python 3; only the uninteresting ones do. What are we going to do about that?
And this is a fundamental issue with tying benchmarks to real applications and libraries: if the code a benchmark relies on is never ported to Python 3, then the benchmark is dead in the water. As Daniel pointed out, if Spitfire simply never converts, then either we convert it ourselves *just* for the benchmark (yuck), we live without the benchmark (OK, but if this happens to a bunch of benchmarks we won't have a lot of data), or we look at making new benchmarks based on apps/libraries that _have_ made the switch to Python 3 (which means trying to agree on some new set of benchmarks to add to the current set).
BTW, which benchmark set are we talking about? speed.pypy.org runs a different set of benchmarks than the ones at http://hg.python.org/benchmarks. Which set are we worried about porting here? If it's the latter, we will have to wait a while for at least 20% of the benchmarks, since they rely on Twisted (which is only 50% done according to http://twistedmatrix.com/trac/milestone/Python-3.x).