Steps to get PyPy benchmarks running

I have begun to dive into the performance/perf code. My goal is to get PyPy benchmarks running on http://speed.python.org. Since PyPy has a JIT, the benchmark runs must have a warmup stage. There are some first-cut warmup values hardcoded inside a few of the benchmarks. I would prefer a different mechanism: a separate calibrated data file alongside the performance benchmarks. We could start off with a rough guess for each benchmark to get the system up and running, and then calibrate the warmups, hopefully finding some statistical basis for the values.
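To make this concrete, here is a minimal sketch of what I have in mind; the file name, keys and defaults are hypothetical, nothing like this exists in performance today:

    # warmups.json -- calibrated data file shipped alongside the
    # benchmarks (illustrative layout only):
    #
    #     {
    #         "telco": {"warmups": 30},
    #         "go": {"warmups": 50},
    #         "_default": {"warmups": 10}
    #     }

    import json

    def warmups_for(benchmark, path="warmups.json", fallback=10):
        """Return the calibrated warmup count for a benchmark, falling
        back to a rough guess until calibration has been done."""
        with open(path) as f:
            table = json.load(f)
        entry = table.get(benchmark, table.get("_default", {}))
        return entry.get("warmups", fallback)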
Assuming the idea of an external data file is acceptable, I have begun diving into the code, trying to figure out the interaction between the performance package and the perf package. It seems that running "pyperformance run -b telco --warmups 10" does not work; the pyperformance CLI runner accepts only a subset of the perf Runner command-line options. Shouldn't the performance.cli parse_args() start from the perf._runner.py parser? Would a pull request along those lines be a worthwhile goal?
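For comparison, a standalone perf benchmark script does honour --warmups, since perf's Runner builds its own argument parser; a minimal sketch, assuming only perf's documented Runner API:

    # bench_example.py -- run as: python bench_example.py --warmups 10
    import perf

    def workload():
        # A stand-in for real benchmark work.
        return sum(range(10000))

    runner = perf.Runner()
    runner.bench_func("sum_range", workload)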
Thanks, Matti

Hi,
On 13/02/18 14:27, Matti Picus wrote:
I have begun to dive into the performance/perf code. My goal is to get PyPy benchmarks running on http://speed.python.org. Since PyPy has a JIT, the benchmark runs must have a warmup stage.
Why? The other interpreters don't get an arbitrary chunk of time for free, so neither should PyPy. Warmup is an inherent cost of dynamic optimisers. The benefits should outweigh the costs, but the costs shouldn't be ignored.
Cheers, Mark.

On 13/02/18 21:52, Mark Shannon wrote:
Why? The other interpreters don't get an arbitrary chunk of time for free, so neither should PyPy. Warmup is an inherent cost of dynamic optimisers. The benefits should outweigh the costs, but the costs shouldn't be ignored.
Both startup performance and hot performance are interesting, depending on the use-case: for a long-running server, hot performance is what matters; for a short-lived script, warmup is a significant part of the total time. I'd suggest providing both figures for PyPy.
/gsnedders

On 14 February 2018 at 07:52, Mark Shannon <mark@hotpy.org> wrote:
Why? The other interpreters don't get an arbitrary chunk of time for free, so neither should PyPy. Warmup is an inherent cost of dynamic optimisers. The benefits should outweigh the costs, but the costs shouldn't be ignored.
For speed.python.org purposes, that would likely be most usefully reported as separate "PyPy (cold)" and "PyPy (warm)" results (where the former runs under the same conditions as CPython, while the latter is given the benefit of warming up the JIT first).
Only reporting the former would miss the point of PyPy's main use case (i.e. long-lived processes), while only reporting the latter would miss one of the main answers to "Why hasn't everyone already switched to PyPy for all their Python needs?" (i.e. when the app doesn't run long enough to pay back the increased start-up overhead).
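A rough sketch of how the two figures could fall out of one set of raw timings (plain stdlib Python; the warmup count and the numbers are made up for illustration):

    import statistics

    def cold_and_warm(timings, warmups):
        """'Cold' averages every run, measured the same way CPython is;
        'warm' discards the JIT warmup runs first."""
        cold = statistics.mean(timings)
        warm = statistics.mean(timings[warmups:])
        return cold, warm

    timings = [3.1, 2.4, 1.2, 0.9, 0.9, 0.9, 0.9]  # illustrative values
    cold, warm = cold_and_warm(timings, warmups=3)
    print("PyPy (cold): %.2fs  PyPy (warm): %.2fs" % (cold, warm))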
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Feb 13, 2018 at 2:27 PM, Matti Picus <matti.picus@gmail.com> wrote:
I have begun to dive into the performance/perf code. My goal is to get PyPy benchmarks running on http://speed.python.org. Since PyPy has a JIT, the benchmark runs must have a warmup stage. There are some first-cut warmup values hardcoded inside a few of the benchmarks. I would prefer a different mechanism: a separate calibrated data file alongside the performance benchmarks. We could start off with a rough guess for each benchmark to get the system up and running, and then calibrate the warmups, hopefully finding some statistical basis for the values.
Unfortunately, it is very difficult to determine when warmup has occurred, even on a per-VM, per-benchmark basis. If you allow benchmarks to run for long enough, some will fail to even reach a steady state in which warmup can be said to have finished.
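As a toy illustration of the problem -- this is deliberately naive, not the statistical approach our tooling uses, and the function is hypothetical -- a sliding-window check like the following can mis-identify a steady state or never find one at all:

    def steady_state_start(timings, window=5, tolerance=0.05):
        """Return the index where `window` consecutive timings stay
        within `tolerance` of their own mean, or None if the benchmark
        never settles."""
        for i in range(len(timings) - window + 1):
            chunk = timings[i:i + window]
            mean = sum(chunk) / window
            if all(abs(t - mean) <= tolerance * mean for t in chunk):
                return i
        return None  # never reached a steady state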
We have a more detailed paper on this which might interest you:
http://soft-dev.org/pubs/html/barrett_bolz-tereick_killick_mount_tratt__virt...
and some software which can do something similar to what you propose -- i.e. determine whether and when warmup has occurred:
http://soft-dev.org/src/warmup_stats/
This includes a script which can "diff" benchmarks between (say) commits, or between different versions of a VM. There's an example table at the link above; hover over the table to see the diff.
If you want to have a go at using this, feel free to get in touch -- we'd be happy to help!
Regards,
Sarah
--
Dr. Sarah Mount, Research Associate, King's College London
Fellow of the Software Sustainability Institute
twitter: @snim2
participants (5):
- Geoffrey Sneddon
- Mark Shannon
- Matti Picus
- Nick Coghlan
- Sarah Mount