[py-dev] Performance tests with py.test
Hello,

I am trying to wrap performance tests for my module into py.test. Each testcase is, basically, a data preparation stage (which can take some time) followed by the test itself. Is there some way (hooks, plugins) to make py.test print the value returned from the testcase (and let testcases time whatever they want)? What I want is just a table like the one printed by "py.test -v", but with return values instead of "PASSED". I googled it and haven't found any solutions.

At the moment I am doing it as follows:
1. Implement "pytest_pyfunc_call()" and save the returned result into the "pyfuncitem" parameter.
2. Implement "pytest_runtest_makereport()", look for the "perf" mark and somehow output "item.result".
3. I haven't decided on the way to output it yet, because I am still trying to find the place where "PASSED" is printed.

But this seems quite clunky and overcomplicated. Is there any better way to do this? If not for the data preparation, the currently existing time measuring system would probably be enough for me (although I'd still miss the ability to print a GFLOPS value instead of seconds), but the data preparation is testcase-specific and I do not want to move it to the "pytest_generate_tests()" hook.

Thank you in advance.
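P.S. For concreteness, a perf testcase of mine currently looks roughly like this (just a sketch; prepare_data() and run_reduction() stand for the real code):

-----
import time

def test_reduce():
    data = prepare_data()        # lengthy preparation, should not be timed
    start = time.time()
    run_reduction(data)          # the part I actually want to measure
    elapsed = time.time() - start
    return elapsed               # this is the value I'd like py.test to print
-----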
Hi Bogdan,

On Sun, Apr 01, 2012 at 02:16 +1100, Bogdan Opanchuk wrote:
I am trying to wrap performance tests for my module into py.test. Each testcase is, basically, a data preparation stage (can take some time) and the test itself. Is there some way (hooks, plugins) to make py.test print the value returned from the testcase (and make testcases time whatever they want)? What I want is just a table like the one after "py.test -v", but with return values instead of "PASSED". I googled it and haven't found any solutions.
I don't know of a readily available example solution.
At the moment I am doing it as follows:
1. Implement "pytest_pyfunc_call()" and save the returned result into the "pyfuncitem" parameter.
2. Implement "pytest_runtest_makereport()", look for the "perf" mark and somehow output "item.result".
3. I haven't decided on the way to output it yet, because I am still trying to find the place where "PASSED" is printed.
do you want to consider performance regressions as a failure?
But this seems quite clunky and overcomplicated. Is there any better way to do this? If not for the data preparation, the currently existing time measuring system would probably be enough for me (although I'd still miss the ability to print GFLOPS value instead of seconds), but the data preparation is testcase-specific and I do not want to move it to "pytest_generate_tests()" hook.
Thank you in advance.
Could you maybe provide a simple example test file and make up some example output that you'd like to see?

Meanwhile, if you haven't already, you might want to look at the output of "py.test --durations=10" and see about its implementation (mostly contained in _pytest/runner.py, grep for 'duration').

best,
holger
Hi Holger,

On Sun, Apr 1, 2012 at 12:21 PM, holger krekel <holger@merlinux.eu> wrote:
do you want to consider performance regressions as a failure?
Not really. I just need some table of performance results that I can collect for different systems/versions and compare. Besides, performance regression checks can be implemented using the existing functionality, because they do not have a continuous result associated with them, only pass/fail.
Could you maybe provide a simple example test file and make up some example output that you'd like to see?
Sure. Consider the following test file:

-----
import pytest

def test_matrixmul():
    pass

@pytest.mark.perf('seconds')
def test_reduce():
    # <some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0

@pytest.mark.perf('GFLOPS')
def test_fft():
    # <again, some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0, 1e10
-----

Here "test_matrixmul()" is a normal pass/fail test, "test_reduce()" is marked as a performance test that returns its execution time, and "test_fft()" is marked as a test that returns execution time plus the number of operations (thus allowing us to calculate a GFLOPS value).

I have put together a clunky solution (see the end of this letter) using existing hooks that gives me more or less what I want to see:

$ py.test -v
...
test_test.py:3: test_matrixmul PASSED
test_test.py:6: test_reduce 1.0 s
test_test.py:10: test_fft 10.0 GFLOPS
...

The only problem here is that I have to explicitly increase the verbosity level. I'd prefer 'perf'-marked tests to show their result even at the default verbosity, but I haven't found a way to do it yet.
Meanwhile, if you haven't already you might want to look at the output of "py.test --durations=10" and see about its implementation (mostly contained in _pytest/runner.py, grep for 'duration').
Yes, I know about it, but it is not quite what I need:
- it measures the time of the whole testcase, while I usually need to time only a specific part
- it does not allow me to measure anything more complicated (e.g. GFLOPS; as another variant, I may want to see the error value)
- it prints its report after all the tests have finished, while it is much more convenient to see a testcase's result as soon as it finishes (my performance tests may run for quite a long time)

So, the solution I have now is shown below. The "pytest_pyfunc_call()" implementation annoys me the most, because I had to copy-paste it from python.py, so it exposes some py.test internals and can easily break when something (seemingly hidden) inside the library is changed.

-----
def pytest_configure(config):
    config.pluginmanager.register(PerfPlugin(config), '_perf')

class PerfPlugin(object):

    def __init__(self, config):
        pass

    def pytest_pyfunc_call(self, __multicall__, pyfuncitem):
        # collect testcase return result
        testfunction = pyfuncitem.obj
        if pyfuncitem._isyieldedfunction():
            res = testfunction(*pyfuncitem._args)
        else:
            funcargs = pyfuncitem.funcargs
            res = testfunction(**funcargs)
        pyfuncitem.result = res

    def pytest_report_teststatus(self, __multicall__, report):
        outcome, letter, msg = __multicall__.execute()
        # if we have some result attached to the testcase, print it instead of 'PASSED'
        if hasattr(report, 'result'):
            msg = report.result
        return outcome, letter, msg

    def pytest_runtest_makereport(self, __multicall__, item, call):
        report = __multicall__.execute()
        # if the testcase has passed, and has 'perf' marker, process its results
        if call.when == 'call' and report.passed and hasattr(item.function, 'perf'):
            perf = item.function.perf
            perftype = perf.args[0]
            if perftype == 'seconds':
                report.result = str(item.result) + " s"
            else:
                seconds, operations = item.result
                report.result = str(operations / seconds / 1e9) + " GFLOPS"
        return report
-----
Hi Holger,

For some reason this message has not shown up in the list archives, so I'll resend it.

On Sun, Apr 1, 2012 at 12:21 PM, holger krekel <holger@merlinux.eu> wrote:
do you want to consider performance regressions as a failure?
Not really. I just need some table of performance results that I can collect for different systems/versions and compare. Besides, performance regression checks can be implemented using the existing functionality, because they do not have a continuous result associated with them, only pass/fail.
Could you maybe provide a simple example test file and make up some example output that you'd like to see?
Sure. Consider the following test file:

-----
import pytest

def test_matrixmul():
    pass

@pytest.mark.perf('seconds')
def test_reduce():
    # <some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0

@pytest.mark.perf('GFLOPS')
def test_fft():
    # <again, some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0, 1e10
-----

Here "test_matrixmul()" is a normal pass/fail test, "test_reduce()" is marked as a performance test that returns its execution time, and "test_fft()" is marked as a test that returns execution time plus the number of operations (thus allowing us to calculate a GFLOPS value).

I have put together a clunky solution (see the end of this letter) using existing hooks that gives me more or less what I want to see:

$ py.test -v
...
test_test.py:3: test_matrixmul PASSED
test_test.py:6: test_reduce 1.0 s
test_test.py:10: test_fft 10.0 GFLOPS
...

The only problem here is that I have to explicitly increase the verbosity level. I'd prefer 'perf'-marked tests to show their result even at the default verbosity, but I haven't found a way to do it yet.
Meanwhile, if you haven't already you might want to look at the output of "py.test --durations=10" and see about its implementation (mostly contained in _pytest/runner.py, grep for 'duration').
Yes, I know about it, but it is not quite what I need:
- it measures the time of the whole testcase, while I usually need to time only a specific part
- it does not allow me to measure anything more complicated (e.g. GFLOPS; as another variant, I may want to see the error value)
- it prints its report after all the tests have finished, while it is much more convenient to see a testcase's result as soon as it finishes (my performance tests may run for quite a long time)

So, the solution I have now is shown below. The "pytest_pyfunc_call()" implementation annoys me the most, because I had to copy-paste it from python.py, so it exposes some py.test internals and can easily break when something (seemingly hidden) inside the library is changed.

-----
def pytest_configure(config):
    config.pluginmanager.register(PerfPlugin(config), '_perf')

class PerfPlugin(object):

    def __init__(self, config):
        pass

    def pytest_pyfunc_call(self, __multicall__, pyfuncitem):
        # collect testcase return result
        testfunction = pyfuncitem.obj
        if pyfuncitem._isyieldedfunction():
            res = testfunction(*pyfuncitem._args)
        else:
            funcargs = pyfuncitem.funcargs
            res = testfunction(**funcargs)
        pyfuncitem.result = res

    def pytest_report_teststatus(self, __multicall__, report):
        outcome, letter, msg = __multicall__.execute()
        # if we have some result attached to the testcase, print it instead of 'PASSED'
        if hasattr(report, 'result'):
            msg = report.result
        return outcome, letter, msg

    def pytest_runtest_makereport(self, __multicall__, item, call):
        report = __multicall__.execute()
        # if the testcase has passed, and has 'perf' marker, process its results
        if call.when == 'call' and report.passed and hasattr(item.function, 'perf'):
            perf = item.function.perf
            perftype = perf.args[0]
            if perftype == 'seconds':
                report.result = str(item.result) + " s"
            else:
                seconds, operations = item.result
                report.result = str(operations / seconds / 1e9) + " GFLOPS"
        return report
-----
Hi Bogdan,

sorry for taking a while ...

On Thu, May 03, 2012 at 11:18 +1000, Bogdan Opanchuk wrote:
For some reason this message has not shown up in the list archives, so I'll resend it again.
On Sun, Apr 1, 2012 at 12:21 PM, holger krekel <holger@merlinux.eu> wrote:
do you want to consider performance regressions as a failure?
Not really. I just need some table of performance results that I can collect for different systems/versions and compare. Besides, performance regression checks can be implemented using the existing functionality, because they do not have a continuous result associated with them, only pass/fail.
Could you maybe provide a simple example test file and make up some example output that you'd like to see?
Sure. Consider the following test file:
-----
import pytest

def test_matrixmul():
    pass

@pytest.mark.perf('seconds')
def test_reduce():
    # <some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0

@pytest.mark.perf('GFLOPS')
def test_fft():
    # <again, some lengthy preparation that I do not want to time>
    # <actual work>
    return 1.0, 1e10
-----
Here "test_matrixmul()" is a normal pass/fail test, "test_reduce()" is marked as performance test that returns execution time, and "test_fft()" is marked as a test that returns execution time + the number of operations (thus allowing us to calculate GFLOPS value).
I have put together a clunky solution (see the end of this letter) using existing hooks that gives me more or less what I want to see:
$ py.test -v
...
test_test.py:3: test_matrixmul PASSED
test_test.py:6: test_reduce 1.0 s
test_test.py:10: test_fft 10.0 GFLOPS
...
The only problem here is that I have to explicitly increase the verbosity level. I'd prefer 'perf'-marked tests to show their result even at the default verbosity, but I haven't found a way to do it yet.
not sure it helps, but are you aware that you can put

    [pytest]
    addopts = -v

in a pytest.ini file?
Meanwhile, if you haven't already you might want to look at the output of "py.test --durations=10" and see about its implementation (mostly contained in _pytest/runner.py, grep for 'duration').
Yes, I know about it, but it is not quite what I need:
- it measures the time of the whole testcase, while I usually need to time only a specific part
right. It differentiates between setup and runtest phases for a test, though.
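So if the lengthy preparation were moved into a funcarg, the reported call duration would only cover the measured part. Untested sketch, with prepare_data()/run_reduction() standing in for whatever the testcase really does:

-----
def pytest_funcarg__reduce_data(request):
    return prepare_data()        # runs during the setup phase

def test_reduce(reduce_data):
    run_reduction(reduce_data)   # only this counts towards the 'call' duration
-----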
- it does not allow me to measure anything more complicated (e.g. GFLOPS; as another variant, I may want to see the error value)
- it prints its report after all the tests have finished, while it is much more convenient to see a testcase's result as soon as it finishes (my performance tests may run for quite a long time)
right.
So, the solution I have now is shown below. The "pytest_pyfunc_call()" implementation annoys me the most, because I had to copy-paste it from python.py, so it exposes some py.test internals and can easily break when something (seemingly hidden) inside the library is changed.
That's true. If you want to open an issue about this (having pytest_pyfunc_call return the test function's return value), I can see about taking care of it.
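In the meantime, one way to avoid the copy might be to have the perf tests hand their measurement to a funcarg instead of returning it, so pytest_pyfunc_call doesn't need to be touched at all. Untested sketch for a conftest.py, only the GFLOPS case shown; PERF_RESULTS and the "perf_record" funcarg are made-up names, not existing pytest API:

-----
PERF_RESULTS = {}

def pytest_funcarg__perf_record(request):
    def record(*values):
        # keyed by the test function object, which is what
        # pytest_runtest_makereport later sees as item.function
        PERF_RESULTS[request.function] = values
    return record

def pytest_runtest_makereport(__multicall__, item, call):
    report = __multicall__.execute()
    if call.when == 'call' and report.passed and item.function in PERF_RESULTS:
        seconds, operations = PERF_RESULTS[item.function]
        report.result = "%.1f GFLOPS" % (operations / seconds / 1e9)
    return report
-----

A test would then accept "perf_record" as an argument and call e.g. perf_record(1.0, 1e10) right after the timed section; the pytest_report_teststatus part of your plugin stays as it is.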
-----
def pytest_configure(config):
    config.pluginmanager.register(PerfPlugin(config), '_perf')

class PerfPlugin(object):

    def __init__(self, config):
        pass

    def pytest_pyfunc_call(self, __multicall__, pyfuncitem):
        # collect testcase return result
        testfunction = pyfuncitem.obj
        if pyfuncitem._isyieldedfunction():
            res = testfunction(*pyfuncitem._args)
        else:
            funcargs = pyfuncitem.funcargs
            res = testfunction(**funcargs)
        pyfuncitem.result = res

    def pytest_report_teststatus(self, __multicall__, report):
        outcome, letter, msg = __multicall__.execute()
        # if we have some result attached to the testcase, print it instead of 'PASSED'
        if hasattr(report, 'result'):
            msg = report.result
        return outcome, letter, msg

    def pytest_runtest_makereport(self, __multicall__, item, call):
        report = __multicall__.execute()
        # if the testcase has passed, and has 'perf' marker, process its results
        if call.when == 'call' and report.passed and hasattr(item.function, 'perf'):
            perf = item.function.perf
            perftype = perf.args[0]
            if perftype == 'seconds':
                report.result = str(item.result) + " s"
            else:
                seconds, operations = item.result
                report.result = str(operations / seconds / 1e9) + " GFLOPS"
        return report
-----
We could also think about a convention that would allow setting the short/long form (letter, msg) on the report object, so that one could also get rid of the report_teststatus hook. I am slightly less inclined to go for this because the above teststatus bit seems nice enough.

best,
holger
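P.S. to illustrate, the convention might look roughly like this; purely hypothetical, nothing in pytest implements it today:

-----
def pytest_runtest_makereport(__multicall__, item, call):
    report = __multicall__.execute()
    if call.when == 'call' and report.passed and hasattr(item.function, 'perf'):
        seconds, operations = item.result
        # hypothetical attribute that the terminal reporter would pick up
        # directly, making a custom pytest_report_teststatus unnecessary
        report.teststatus = ('passed', '.', '%.1f GFLOPS' % (operations / seconds / 1e9))
    return report
-----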