On Sat, 12 Mar 2016 at 10:16 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 07.03.16 19:19, Brett Cannon wrote:
> Are you thinking about turning all of this into a benchmark for the
> benchmark suite?

This was my purpose. I first had written a benchmark for the benchmark
suite, then I became interested in more detailed results and a
comparison with alternative engines.

There are several questions about a benchmark for the benchmark suite.

1. The input data is a public 20 MB text (8 MB in a ZIP file). Should we download
it every time (maybe with caching) or add it to the repository?

Probably add it to the repository (`du -h` on my checkout says the total disk space used is already 280 MB). I would like to look into what it would take to use pip to install dependencies so that we don't have such a large checkout, at which point we could talk about downloading it. But as of right now we keep everything self-contained to control the inputs to the benchmarks.
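
For reference, if downloading were ever chosen, a minimal sketch of download-with-caching could look like the following. The function name `fetch_corpus`, the `download` parameter, and the cache layout are all hypothetical and not part of the benchmark suite; the only real API used is the standard library's `urllib.request.urlretrieve`.

```python
import os
import urllib.request


def fetch_corpus(url, cache_dir, download=urllib.request.urlretrieve):
    """Return a local path for url, downloading only on a cache miss.

    Hypothetical helper: `download` is any callable taking (url, dest).
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Name the cached file after the last path component of the URL.
    path = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(path):
        # Download to a temporary name first so that an interrupted
        # transfer never leaves a truncated file in the cache.
        tmp = path + ".part"
        download(url, tmp)
        os.replace(tmp, path)
    return path
```

Checking the file into the repository avoids all of this, which is part of why it keeps the benchmark inputs easier to control.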
 

2. One iteration of all searches on the full text takes 29 seconds on my
computer. Isn't this too long? In any case, I first want to optimize some
bottlenecks in the re module.

I don't think we have established a "too long" time. We do have some benchmarks like spectral_norm that don't run unless you use rigorous mode and this could be one of them.
 

3. Do we need one benchmark that gives an accumulated time of all
searches, or separate microbenchmarks for every pattern?

I don't care either way. Obviously it depends on whether you want to measure overall re perf and have people aim to improve that or let people target specific workload types.
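
The two options are not mutually exclusive: a single run can record a time per pattern and also report the accumulated total. The sketch below only illustrates that idea; `time_patterns` is a hypothetical name, and the patterns and sample text are placeholders rather than the actual benchmark inputs.

```python
import re
import time


def time_patterns(patterns, text):
    """Time one full scan of text per pattern.

    Returns (per_pattern, total), where per_pattern maps each pattern
    to an (elapsed_seconds, match_count) tuple.
    """
    per_pattern = {}
    for pattern in patterns:
        regex = re.compile(pattern)  # compile outside the timed region
        start = time.perf_counter()
        matches = sum(1 for _ in regex.finditer(text))
        elapsed = time.perf_counter() - start
        per_pattern[pattern] = (elapsed, matches)
    # The accumulated time is just the sum of the per-pattern times.
    total = sum(elapsed for elapsed, _ in per_pattern.values())
    return per_pattern, total


if __name__ == "__main__":
    sample = "Tom met Huck. Huck met Sawyer. " * 1000  # placeholder text
    per_pattern, total = time_patterns(["Huck", "Tom|Sawyer"], sample)
    for pattern, (elapsed, matches) in per_pattern.items():
        print("%-12s %6d matches %.6f s" % (pattern, matches, elapsed))
    print("total %.6f s" % total)
```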
 

4. It would be nice to use the same benchmark for comparing different
regular expression engines. This requires changing perf.py. Maybe we could use
the same interface to compare ElementTree with lxml and json with
simplejson.

There's already a way to do this through command-line flags when you execute the benchmark scripts directly. You do lose perf.py's calculation benefits, though. I personally have no issue if you or anyone else came up with a way to pass in benchmark-specific flags (i.e., our own version of -X).
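
As an illustration of what a benchmark-specific flag might look like inside a benchmark script, here is a minimal sketch. The `--engine` flag and the `load_engine` helper are hypothetical; nothing like them is claimed to exist in perf.py or in the current scripts. The sketch assumes the alternative engine is a module with an re-compatible `compile()`.

```python
import argparse
import importlib


def load_engine(argv=None):
    """Parse a hypothetical --engine flag and import the named module."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--engine", default="re",
                        help="module exposing an re-compatible compile()")
    args = parser.parse_args(argv)
    return importlib.import_module(args.engine)


# Example with an explicit argument list instead of sys.argv.
engine = load_engine(["--engine", "re"])
regex = engine.compile(r"\d+")
print(engine.__name__, regex.findall("a1 b22 c333"))
```

The same shape would work for any pair of modules that share an interface, such as json and simplejson, since the benchmark body only ever touches the imported module object.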
 

5. The patterns are ASCII-only and the text is mostly ASCII. It would be nice
to add a non-ASCII pattern and non-ASCII text. But this will increase the run
time.

I think that's fine. Better that the benchmark measure something useful than worry about whether anyone will want to run it in fast mode.
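
One reason non-ASCII input is worth measuring is that str patterns in Python 3 are Unicode-aware by default, so classes like `\w` behave differently on such text than they do under `re.ASCII`. The snippet below is only a small illustration with made-up sample text, not a proposed benchmark case.

```python
import re

# Made-up sample: "naive" with U+00EF and "cafe" with U+00E9.
text = "na\u00efve caf\u00e9"

# By default, \w in a str pattern matches Unicode word characters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w is restricted to [a-zA-Z0-9_], so the accented
# letters split the words instead of being part of them.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words)
print(ascii_words)
```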