unit-profiling, similar to unit-testing
Roy Smith
roy at panix.com
Thu Nov 17 09:03:15 EST 2011
In article <kkuep8-nqd.ln1 at satorlaser.homedns.org>,
Ulrich Eckhardt <ulrich.eckhardt at dominolaser.com> wrote:
> Yes, this is surely something that is necessary, in particular since
> there are no clear success/failure outputs like for unit tests and they
> require a human to interpret them.
As much as possible, you want to automate things so no human
intervention is required.
For example, let's say you have a test which calls foo() and times how
long it takes. You've already mentioned that you run it N times and
compute some basic (min, max, avg, sd) stats. So far, so good.
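A minimal sketch of that first step, using only the standard library. The names time_calls and summarize are made up for illustration, and foo() stands in for whatever you are actually measuring.

```python
import statistics
import time


def summarize(samples):
    """Return basic stats (min, max, avg, sd) for a list of run times in seconds."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": statistics.mean(samples),
        # The sample standard deviation needs at least two data points.
        "sd": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


def time_calls(func, n=20):
    """Call func() n times and summarize the wall-clock time of each call."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return summarize(samples)
```

Something like time_calls(foo, n=50) then gives you one row of numbers to store alongside the date for later comparison.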
The next step is to do some kind of regression against past results.
Once you've got a bunch of historical data, it should be possible to
look at today's numbers and detect any significant change in performance.
Much as I loathe the bureaucracy and religious fervor which has grown up
around Six Sigma, it does have some good tools. You might want to look
into control charts (http://en.wikipedia.org/wiki/Control_chart). You
think you've got the test environment under control, do you? Try
plotting a month's worth of run times for a particular test on a control
chart and see what it shows.
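As a rough sketch of the idea behind a control chart: take a baseline of historical run times, put limits at the mean plus or minus three standard deviations, and flag any new run that falls outside. This is a simplification (a proper individuals chart usually estimates sigma from moving ranges), and the function names are my own invention.

```python
import statistics


def control_limits(baseline):
    """Return (lower, center, upper) limits at mean +/- 3 sample standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(baseline, new_points):
    """Return (index, value) pairs for new points that fall outside the limits."""
    lower, _, upper = control_limits(baseline)
    return [(i, x) for i, x in enumerate(new_points) if x < lower or x > upper]
```

If a test that you believe is stable keeps producing flagged points, that is a hint the environment is not as controlled as you think.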
Assuming your process really is under control, I would write scripts
that did the following kinds of analysis:
1) For a given test, do a linear regression of run time vs date. If the
line has any significant positive slope, you want to investigate why.
2) You already mentioned, "I would even wonder if you can't verify the
behaviour against an expected Big-O complexity somehow". Of course you
can. Run your test a bunch of times with different input sizes. I
would try something like a 1-2-5 progression over several decades (i.e.
input sizes of 10, 20, 50, 100, 200, 500, 1000, etc.). You will have to
figure out what an appropriate range is, and how to generate useful
input sets. Now, fit your performance numbers to curves of various
shapes and see what correlation coefficient you get for each.
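Both analyses can be sketched in a few lines of plain Python. Here slope() is an ordinary least-squares slope (for run time vs. day number), and rank_shapes() correlates measured times against a few candidate growth curves. The SHAPES table and all function names are assumptions for illustration. Deciding what counts as a "significant" slope is left to you, and a constant-time shape is omitted because a constant has no variance to correlate against.

```python
import math


def _sums(xs, ys):
    """Return (sxx, syy, sxy): sums of squares and cross-products about the means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def slope(xs, ys):
    """Least-squares slope of ys against xs, e.g. run time vs. day number."""
    sxx, _, sxy = _sums(xs, ys)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    sxx, syy, sxy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


# Candidate growth curves (an illustrative selection, not an exhaustive one).
SHAPES = {
    "O(log n)": lambda n: math.log(n),
    "O(n)": lambda n: float(n),
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: float(n) ** 2,
}


def rank_shapes(sizes, times):
    """Return (name, r) pairs sorted from best to worst correlation."""
    scores = {
        name: pearson_r([f(n) for n in sizes], times)
        for name, f in SHAPES.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With real, noisy timings the top two shapes will often have very close coefficients, so treat the ranking as a hint rather than a verdict.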
All that being said, in my experience, nothing beats plotting your data
and looking at it.