Cython's view on a common benchmark suite (was: Re: Buildbot Status)
Brett Cannon, 01.02.2012 18:25:
To prevent this from ending up in a dead end, we need to first decide where the canonical set of Python VM benchmarks is going to live. I say hg.python.org/benchmarks, for two reasons. One, Antoine has already done work there to port some of the benchmarks, so at least some of them are ready to be run under Python 3 (and the tooling is in place to create separate Python 2 and Python 3 benchmark suites). Two, this can be a test of having the various VM contributors work out of hg.python.org, if we are ever going to break the stdlib out for shared development. At worst we can simply take the changes made at pypy/benchmarks that apply to the unladen benchmarks that exist there; at best we can merge the two sets (manually) into one benchmark suite, so that PyPy doesn't lose any of the Python 2 measurements it has written and CPython doesn't lose any of the Python 3 benchmarks it has created.
How does that sound?
+1
FWIW, Cython currently uses both benchmark suites, that of PyPy (in Py2.7) and that of hg.python.org (in Py2.7 and 3.3), but without codespeed integration and also without a dedicated server for benchmark runs. So the results are unfortunately not accurate enough to spot minor changes even over time.
https://sage.math.washington.edu:8091/hudson/view/bench/
We would like to join in on speed.python.org, once it's clear how the benchmarks will be run and how the data uploads work and all that. It already proved a bit tricky to get Cython integrated with the benchmark runner on our side, and I'm planning to rewrite that integration at some point, but it should already be doable to get "something" to work now.
I should also note that we don't currently support the whole benchmark suite, so there must be a way to record individual benchmark results even in the face of failures in other benchmarks. Basically, speed.python.org would be useless for us if a failure in a single benchmark left us without any performance data at all, because it will still take us some time to get to 100% compliance and we would like to know if anything on that road has a performance impact. Currently, we apply a short patch that adds a try-except to the benchmark runner's main loop before starting the measurements, because otherwise it would just bail out completely on a single failure.

Oh, and we also patch the benchmarks to remove references to __file__ because of CPython issue 13429, although we may be able to work around that at some point, specifically when doing on-the-fly compilation during imports.
http://bugs.python.org/issue13429
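The patch itself isn't shown in the thread. A minimal sketch of what it amounts to, with hypothetical names (run_all(), run_one(), a benchmarks dict) standing in for the real runner's API:

    # Sketch only: wrap each benchmark in try/except so a single
    # failure cannot abort the whole run.
    def run_all(benchmarks):
        results = {}
        for name, run_one in benchmarks.items():
            try:
                results[name] = run_one()  # normal timing run
            except Exception as exc:
                # Record the failure and keep going instead of
                # bailing out of the entire suite.
                print("SKIPPING %s: %s" % (name, exc))
        return results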
Also note that benchmarks that only exercise C-implemented stdlib modules (re, pickle, json) are useless for Cython, because they would only end up timing the exact same code as plain CPython does.
Another test that is useless for us is the "mako" benchmark, because most of what it does is to run generated code. There is currently no way for Cython to hook into that, so we're out of the game here.
We also don't care about program startup tests, obviously, because we know that Cython's compiler overhead plus an optimising gcc run will render them meaningless anyway. I like the fact that there's still an old hg_startup timing result lingering around from the time before I disabled that test, telling us that Cython runs it 99.68% slower than CPython. Got to beat that. 8-)
Stefan
On 02.02.2012 09:21, Stefan Behnel wrote:
[...]
I support Brett's plan: use the PyPy Python 2 benchmarks and the glue code for codespeed integration, add the Python 3 compatible benchmarks from hg.python.org in a way that does not change the Python 2 results, and host it all on hg.python.org. I'd work on merging the repositories.
I'd also help to write a build factory for Cython to integrate it into the buildbot. You can look at the current CPython build factory to see how the build and upload currently work:
https://bitbucket.org/pypy/buildbot/src/20f86228d582/bot2/pypybuildbot/build...
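For illustration, a Cython build factory might look roughly like this. This is a sketch only: the repository URL, the exact perf.py arguments and the upload script are assumptions, not the actual PyPy buildbot code.

    # Sketch of a buildbot build factory for benchmark runs.
    from buildbot.process.factory import BuildFactory
    from buildbot.steps.source.mercurial import Mercurial
    from buildbot.steps.shell import ShellCommand

    factory = BuildFactory()
    # Check out the benchmark repository (URL assumed).
    factory.addStep(Mercurial(repourl='https://hg.python.org/benchmarks',
                              mode='incremental'))
    # Run the suite against a baseline and a changed Python.
    factory.addStep(ShellCommand(
        command=['python', 'perf.py', 'base_python', 'changed_python'],
        description='running benchmarks'))
    # Upload the results to the codespeed instance
    # (upload_results.py is a hypothetical placeholder).
    factory.addStep(ShellCommand(
        command=['python', 'upload_results.py',
                 '--url', 'https://speed.python.org'],
        description='uploading results'))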
..Carsten
On Thu, Feb 2, 2012 at 10:21 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
[...]
FWIW, Cython currently uses both benchmark suites, that of PyPy (in Py2.7) and that of hg.python.org (in Py2.7 and 3.3), but without codespeed integration and also without a dedicated server for benchmark runs. So the results are unfortunately not accurate enough to spot minor changes even over time.
https://sage.math.washington.edu:8091/hudson/view/bench/
We would like to join in on speed.python.org, once it's clear how the benchmarks will be run and how the data uploads work and all that. It already proved a bit tricky to get Cython integrated with the benchmark runner on our side, and I'm planning to rewrite that integration at some point, but it should already be doable to get "something" to work now.
Can you come up with a script that does "cython <a python program>"? That would simplify things a lot.
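Such a wrapper might look roughly like this, using Cython's pyximport (the on-the-fly compilation during imports mentioned earlier). A sketch under assumptions, not a working integration; notably, only the modules the program imports get compiled here, while the entry script itself still runs under plain CPython:

    #!/usr/bin/env python
    # Hypothetical "runcython" wrapper: invoke this in place of a
    # Python interpreter.  pyximport ships with Cython; with
    # pyimport=True it also compiles plain .py modules on import.
    import sys
    import runpy
    import pyximport

    pyximport.install(pyimport=True)

    # Drop the wrapper's own name from argv and run the target
    # program as __main__.
    sys.argv = sys.argv[1:]
    runpy.run_path(sys.argv[0], run_name='__main__')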
I should also note that we don't currently support the whole benchmark suite, so there must be a way to record individual benchmark results even in the face of failures in other benchmarks. [...]
I think it's fine to mark certain benchmarks not to be runnable under certain platforms. For example it's not like jython will run twisted stuff.
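No such mechanism is quoted in the thread; one simple way to express it would be a per-implementation skip table, sketched here with illustrative benchmark names:

    # Hypothetical sketch: declare which implementations cannot run
    # a given benchmark, and skip it cleanly instead of failing.
    SKIP = {
        'twisted_tcp': {'jython'},   # Jython won't run the Twisted stuff
        'mako':        {'cython'},   # mostly executes generated code
    }

    def should_skip(bench_name, implementation):
        return implementation in SKIP.get(bench_name, set())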
Also note that benchmarks that only exercise C-implemented stdlib modules (re, pickle, json) are useless for Cython, because they would only end up timing the exact same code as plain CPython does.
Another test that is useless for us is the "mako" benchmark, because most of what it does is to run generated code. There is currently no way for Cython to hook into that, so we're out of the game here.
Well, if you want Cython to be considered Python, I think this is a pretty crucial feature, no?
We also don't care about program startup tests, obviously, because we know that Cython's compiler overhead plus an optimising gcc run will render them meaningless anyway. I like the fact that there's still an old hg_startup timing result lingering around from the time before I disabled that test, telling us that Cython runs it 99.68% slower than CPython. Got to beat that. 8-)
That's probably okish.
participants (3):
- Carsten Senger
- Maciej Fijalkowski
- Stefan Behnel