On Wed, Feb 27, 2013, at 0:39, Glyph wrote:
On Feb 26, 2013, at 10:05 AM, Peter Westlake wrote:
On Sun, Jan 6, 2013, at 20:22, exarkun@twistedmatrix.com wrote:
On 12:48 am, peter.westlake@pobox.com wrote:
On Fri, Jan 4, 2013, at 19:58, exarkun@twistedmatrix.com wrote: ... Codespeed cannot handle more than one result per benchmark.
The `timeit` module is probably not suitable to use to collect the data ... What method would you prefer?
Something simple and accurate. :) You may need to do some investigation to determine the best approach.
1. This is simple:
    def do_benchmark(content):
        t1 = time.time()
        d = flatten(request, content, lambda _: None)
        t2 = time.time()
        assert d.called
        return t2 - t1
Do you think it's acceptably accurate? After a few million iterations, the relative error should be pretty small.
Well it rather depends on the contents of 'content', doesn't it? :)
Yes, sorry, the loop is meant to be around the flatten call! Corrected version below.
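The corrected version isn't quoted in this excerpt, but presumably it just moves the iteration loop inside do_benchmark, around the flatten call. A sketch of that shape, using a trivial synchronous stand-in for twisted.web.template.flatten and the request so the snippet is self-contained:

```python
import time

class _FiredDeferred:
    """Stand-in for a Deferred that has already fired."""
    called = True

def flatten(request, content, write):
    """Stand-in for twisted.web.template.flatten: writes the content
    synchronously and returns an already-fired Deferred."""
    write(str(content))
    return _FiredDeferred()

request = None  # a real benchmark would pass a real Request object

def do_benchmark(content, iterations):
    # The timing loop now wraps the flatten call itself.
    t1 = time.time()
    for _ in range(iterations):
        d = flatten(request, content, lambda _: None)
        assert d.called  # synchronous content: the Deferred has fired
    t2 = time.time()
    return t2 - t1
```

With a few million iterations, the per-loop overhead of range() and the extra function call is amortised, which is what the relative-error argument above relies on.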
I think we have gotten lost in the weeds here. We talked about using benchlib.py initially, and then you noticed a bug, and it was mentioned that benchlib.py was mostly written for testing asynchronous things and didn't have good support for testing the simple case here, which is synchronous rendering of a simple document. However, one of twisted.web.template's major features - arguably its reason for existing in a world that is practically overrun by HTML templating systems - is that it supports Deferreds. So we'll want that anyway.
That's true, and I'll include some Deferreds in the content to be flattened. But if the Deferreds actually do any lengthy processing, it makes a nonsense of the benchmark. It only makes sense to use ones that have already fired, i.e. defer.succeed(...). The other benchmarks are testing asynchronous operations, as names like "ssl_throughput" suggest. Flattening doesn't do any of that, and I'm only trying to measure the speed of flattening.
The right thing to do here would be to update benchlib itself with a few simple tools for doing timing of synchronous tasks, and possibly also to just fix the unbounded-recursion bug that you noticed, not to start building a new, parallel set of testing tools which use different infrastructure. That probably means implementing a small subset of timeit.
I'm not convinced that the unbounded recursion is actually a bug. A callback on a fired Deferred will be executed immediately, and that's correct behaviour. There's no chance to return control to the reactor, and even if there was, anything that happened in that time would only skew the results. The real problem is that recursion-by-Deferred doesn't have the optimisation for tail recursion found in most functional languages, because that would be very difficult and it's not how Deferreds are usually used.
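The stack-growth behaviour being described can be demonstrated without Twisted at all; the essential point is just that a callback added to an already-fired Deferred runs immediately, on the current stack. A minimal pure-Python analogue (FiredDeferred here is a stand-in, not the real twisted.internet.defer.Deferred):

```python
import sys

class FiredDeferred:
    """Stand-in for a Deferred that has already fired: addCallback
    runs the callback immediately, on the current stack."""
    def __init__(self, result=None):
        self.result = result

    def addCallback(self, cb):
        # No trampoline: each chained callback adds stack frames.
        self.result = cb(self.result)
        return self

def countdown(n):
    d = FiredDeferred(n)
    if n > 0:
        d.addCallback(lambda _: countdown(n - 1))
    return d

countdown(100)  # shallow recursion-by-Deferred works fine
try:
    # Deep recursion-by-Deferred exhausts the stack: there is no
    # tail-call optimisation to collapse the chain of nested calls.
    countdown(sys.getrecursionlimit() * 2)
    deep_ok = True
except RecursionError:
    deep_ok = False
```

With real Deferreds the symptom is the same RecursionError, which is why the unbounded recursion looks like a bug even though executing callbacks on a fired Deferred immediately is correct behaviour.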
2. For the choice of test data, I had a quick search for benchmarks from other web frameworks. All I found were "hello world" benchmarks, which test the overhead of the framework itself by rendering an empty page. I'll include that, of course.
"hello world" benchmarks have problems because start-up overhead tends to dominate. A realistic web page with some slots and renderers sprinkled throughout would be a lot better. Although even better would be a couple of cases - let's say small, large-sync, and large-async - so we can see if optimizations for one case hurt another.
Yes, I'm just making my excuses for not copying benchmarks from an existing framework.
As Jean-Paul already mentioned in this thread, you can't have more than one result per benchmark, so you'll need to choose a fixed number of configurations and create one benchmark for each.
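Concretely, that means fixing the set of configurations up front and registering one named benchmark per configuration. A sketch (the names, descriptions, and iteration counts are illustrative, not taken from the actual Twisted benchmark suite):

```python
# One Codespeed result per benchmark, so: one named benchmark per
# configuration rather than one benchmark emitting several results.
# All names and counts below are illustrative.
CONFIGURATIONS = {
    "flatten_hello_world": ("empty page, framework overhead only", 1000000),
    "flatten_large_sync": ("realistic page, slots and renderers", 10000),
    "flatten_large_async": ("realistic page, pre-fired Deferreds", 10000),
}

def run_all(benchmark_one):
    """Call benchmark_one(name, description, iterations) once per
    configuration and collect the results keyed by benchmark name."""
    return {
        name: benchmark_one(name, description, iterations)
        for name, (description, iterations) in CONFIGURATIONS.items()
    }
```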
3. Regarding option parsing, is there any reason to prefer twisted.python.usage.Options over [...]
The reason to prefer usage.Options is consistency. ...
OK
The thing to implement would be a different driver() function that makes a few simple synchronous calls without running the reactor.
If you don't mind the overhead of an extra function call, that could be as simple as:

    def sync_benchmark(iterations, name, func, *args):
        t1 = time.time()
        for _ in range(iterations):
            func(*args)
        t2 = time.time()
        benchlib.benchmark_report(iterations, t2 - t1, name)

I'm not sure if options['iterations'] would be the right thing to use here, because it gives the number of times to repeat the whole benchmark, not the number of times round the inner loop. The async code uses options['duration'], but there would be more overhead to run synchronous code for a given duration.

Peter.
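[Editor's note: for comparison with the duration-driven approach mentioned above, a synchronous loop can also run for a fixed duration with small overhead if the clock is consulted only once per batch of calls. A sketch; sync_benchmark_for_duration and batch are invented names, not part of benchlib:]

```python
import time

def sync_benchmark_for_duration(duration, func, *args, batch=100):
    """Run func(*args) repeatedly for roughly `duration` seconds,
    checking the clock once per batch so the timing overhead stays
    small.  Returns (iterations, elapsed) for the usual report."""
    iterations = 0
    elapsed = 0.0
    start = time.time()
    while elapsed < duration:
        for _ in range(batch):
            func(*args)
        iterations += batch
        elapsed = time.time() - start
    return iterations, elapsed
```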