On Sat, 12 Mar 2016 at 10:16 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 07.03.16 19:19, Brett Cannon wrote:
Are you thinking about turning all of this into a benchmark for the benchmark suite?
That was my purpose: I first wrote a benchmark for the benchmark suite, and then became interested in more detailed results and a comparison with alternative engines.
There are several questions about a benchmark for the benchmark suite.
- The input data is a public 20 MB text (8 MB as a ZIP file). Should we download it every time (maybe with caching), or add it to the repository?
Add it to the repository, probably (du -h on my checkout says the total disk space used is 280 MB already). I would like to look into what it would take to use pip to install dependencies so that we don't have such a large checkout, at which point we could talk about downloading it. But as of right now we keep it all self-contained to control for the inputs to the benchmarks.
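If we ever did go the download route, a cached fetch is not much code; a minimal sketch of the general shape is below (the URL and cache path are placeholders, not the real data location):

    import os
    import urllib.request

    # Placeholder URL; the real benchmark would pin the exact archive so the
    # input stays reproducible across runs and machines.
    DATA_URL = "https://example.com/regex_benchmark_input.zip"
    CACHE_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                              "data", "regex_benchmark_input.zip")

    def get_input_archive():
        """Download the input archive once and reuse the cached copy after that."""
        if not os.path.exists(CACHE_PATH):
            os.makedirs(os.path.dirname(CACHE_PATH), exist_ok=True)
            urllib.request.urlretrieve(DATA_URL, CACHE_PATH)
        return CACHE_PATH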
- One iteration of all searches on the full text takes 29 seconds on my computer. Isn't this too long? In any case, I first want to optimize some bottlenecks in the re module.
I don't think we have established a "too long" time. We do have some benchmarks, like spectral_norm, that don't run unless you use rigorous mode, and this could be one of them.
- Do we need one benchmark that gives the accumulated time of all searches, or separate microbenchmarks for every pattern?
I don't care either way. Obviously it depends on whether you want to measure overall re perf and have people aim to improve that, or let people target specific workload types.
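Roughly speaking it's just a question of where the timing boundaries go; here's a toy sketch of the two shapes (the patterns and text are placeholders for the real workload):

    import re
    import time

    # Placeholder workload; the real benchmark would use the full pattern set
    # and the 20 MB input text discussed above.
    PATTERNS = [r"Twain", r"[a-z]shing", r"Huck[a-zA-Z]+|Saw[a-zA-Z]+"]
    TEXT = "placeholder text " * 1000

    def bench_per_pattern():
        # Separate microbenchmarks: report one timing per pattern.
        for pattern in PATTERNS:
            start = time.perf_counter()
            re.findall(pattern, TEXT)
            print(pattern, time.perf_counter() - start)

    def bench_accumulated():
        # One benchmark: report the accumulated time of all searches.
        start = time.perf_counter()
        for pattern in PATTERNS:
            re.findall(pattern, TEXT)
        print("total", time.perf_counter() - start)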
- It would be nice to use the same benchmark for comparing different regular expression engines. This requires changing perf.py. Maybe we could use the same interface to compare ElementTree with lxml and json with simplejson.
So there's already an approach to do this when you execute the benchmark scripts directly, through command-line flags. You do lose perf.py's calculation benefits, though. I personally have no issue if you or anyone else comes up with a way to pass in benchmark-specific flags (i.e., our own version of -X).
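To illustrate, a benchmark-specific flag for this case could be as small as an option that picks which module gets imported (the --engine name is made up here, and regex refers to the third-party module on PyPI):

    import argparse
    import importlib

    def main():
        parser = argparse.ArgumentParser()
        # Hypothetical flag: choose which regular expression module to exercise.
        parser.add_argument("--engine", default="re", choices=["re", "regex"],
                            help="regular expression module to benchmark")
        args = parser.parse_args()
        engine = importlib.import_module(args.engine)
        # Both modules expose the same compile()/findall() interface.
        pattern = engine.compile(r"[a-z]+ing")
        print(len(pattern.findall("timing string searching benchmarking")))

    if __name__ == "__main__":
        main()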
- The patterns are ASCII-only and the text is mostly ASCII. It would be nice to add non-ASCII patterns and non-ASCII text, but this will increase the run time.
I think that's fine. Better for the benchmark to measure something useful than to worry about whether anyone will want to run it in fast mode.
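For what it's worth, the non-ASCII side doesn't have to be elaborate; even something like a Cyrillic character class over a Unicode string would at least get non-ASCII characters into both the pattern and the subject text (the pattern and text below are purely illustrative):

    import re

    # Illustrative non-ASCII pattern: a Cyrillic character class searched in a
    # mixed ASCII/non-ASCII string.
    text = "Привет, мир! Hello, world!"
    print(re.findall(r"[А-Яа-яЁё]+", text))  # ['Привет', 'мир']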