On 06.03.16 11:30, Maciej Fijalkowski wrote:
> this is really difficult to read, can you tell me which column am I looking at?
The first column is the searched pattern. The second column is the
number of found matches (as a sanity check, it should be the same for
all engines and versions). The third column, under the "re" header, is
the time in milliseconds. The column under the "str.find" header is the
time of searching without regular expressions.
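For reference, one row of such a table can be reproduced with a few lines of Python. This is only an illustrative sketch; the text and pattern below are placeholders, not the actual 20MB corpus or the benchmark's pattern set:

```python
# Sketch of producing one table row: pattern, match count, re time, str.find time.
import re
import timeit

text = "some large text " * 10000   # stand-in for the real 20 MB text
pattern = "large"                   # an example plain-string pattern

rx = re.compile(re.escape(pattern))
n_matches = len(rx.findall(text))   # second column: number of matches

# Third column: search time with re; fourth column: plain str.find.
t_re = timeit.timeit(lambda: rx.search(text), number=10)
t_find = timeit.timeit(lambda: text.find(pattern), number=10)

print(pattern, n_matches, t_re * 1000, t_find * 1000)
```

Since `re.search` and `str.find` both stop at the first occurrence, the match count serves only as the control column; the timings are what differ between engines.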
PyPy 2.2 is usually significantly faster than CPython 2.7, except when
searching for a plain string with a regular expression. But thanks to
the Flexible String Representation, searching for a plain string, with
or without a regular expression, is faster on CPython 3.6.
Members of this list who are in the UK during April may be interested in
this free workshop in central London. If you have any questions please
feel free to email me directly.
Best Practices in Software Benchmarking 2016 (#bench16)
Wednesday April 20 2016
King's College London
For computer scientists and software engineers, benchmarking (evaluating the
running time of a piece of software, or the performance of a piece of hardware)
is a common method for evaluating new techniques. However, there is little
agreement on how benchmarking should be carried out, how to control for
confounding variables, how to analyse latency data, or how to aid the
repeatability of experiments. This free workshop will be a venue for computer
scientists and research software engineers to discuss their current best
practices and future directions.
For further information and free registration please visit:
Jan Vitek (Northeastern University)
Joe Parker (The Jodrell Laboratory, Royal Botanic Gardens)
Simon Taylor (University of Lancaster)
Tomas Kalibera (Northeastern University)
James Davenport (University of Bath)
Edd Barrett (King's College London)
Jeremy Bennett (Embecosm)
Sarah Mount & Laurence Tratt (King's College London)
On Sun, 13 Mar 2016 17:44:10 +0000
Brett Cannon <brett(a)python.org> wrote:
> > 2. One iteration of all searches on full text takes 29 seconds on my
> > computer. Isn't this too long? In any case I want first optimize some
> > bottlenecks in the re module.
> I don't think we have established a "too long" time. We do have some
> benchmarks like spectral_norm that don't run unless you use rigorous mode
> and this could be one of them.
> > 3. Do we need one benchmark that gives an accumulated time of all
> > searches, or separate microbenchmarks for every pattern?
> I don't care either way. Obviously it depends on whether you want to
> measure overall re perf and have people aim to improve that or let people
> target specific workload types.
This is a more general latent issue with our current benchmarking
philosophy. We have built something which aims to be a general-purpose
benchmark suite, but in some domains a more comprehensive set of
benchmarks may be desirable. Obviously we don't want to have 10 JSON
benchmarks, 10 re benchmarks, 10 I/O benchmarks, etc. in the default
benchmark run, so what do we do for such cases? Do we tell people
domain-specific benchmarks should be developed independently? Do we
include some facilities to create such subsuites without them being
part of the default bunch?
(Note that a couple of domain-specific benchmarks -- iobench,
stringbench, etc. -- are currently maintained separately.)
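One possible shape for such subsuites, sketched here as a hypothetical group table; the group and benchmark names are invented for illustration, and perf.py's actual selection mechanism may differ:

```python
# Hypothetical grouping of benchmarks into named suites, so that
# domain-specific suites exist without bloating the default run.
BENCH_GROUPS = {
    "default": ["json_load", "re_compile", "io_seek"],
    "re": ["re_compile", "re_plain", "re_alternation", "re_backref"],
}

def select_benchmarks(requested):
    """Expand group names (and plain benchmark names) into a flat list."""
    selected = []
    for name in requested:
        # A name that is not a known group is treated as a single benchmark.
        selected.extend(BENCH_GROUPS.get(name, [name]))
    # Preserve order, drop duplicates (dict preserves insertion order).
    return list(dict.fromkeys(selected))

print(select_benchmarks(["default", "re"]))
```

Under this scheme the default run stays small, while `--benchmarks re` (or similar) would pull in the full domain-specific set.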
On 07.03.16 19:19, Brett Cannon wrote:
> Are you thinking about turning all of this into a benchmark for the
> benchmark suite?
That was my purpose. I first wrote a benchmark for the benchmark
suite, then became interested in more detailed results and in a
comparison with alternative engines.
There are several questions about a benchmark for the benchmark suite.
1. The input data is a public 20 MB text (8 MB in a ZIP file). Should we
download it every time (maybe with caching) or add it to the repository?
2. One iteration of all searches on the full text takes 29 seconds on my
computer. Isn't this too long? In any case I want to first optimize some
bottlenecks in the re module.
3. Do we need one benchmark that gives an accumulated time of all
searches, or separate microbenchmarks for every pattern?
4. It would be nice to use the same benchmark for comparing different
regular expression engines. This requires changing perf.py. Maybe we
could use the same interface to compare ElementTree with lxml and json with
5. The patterns are ASCII-only and the text is mostly ASCII. It would be
nice to add non-ASCII patterns and non-ASCII text. But this will
increase the run time.
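On question 3, the two options need not be exclusive: the harness could time each pattern separately and also report the accumulated total. A rough sketch of that approach, using placeholder patterns and text rather than the real benchmark inputs:

```python
# Time each pattern individually and also report the accumulated total.
import re
import time

TEXT = "Tom Sawyer and Huckleberry Finn " * 50000
PATTERNS = ["Tom|Sawyer|Huckleberry", "Tom.{0,30}Sawyer", "[a-q][^u-z]{13}x"]

def bench(patterns, text):
    results = {}
    for p in patterns:
        rx = re.compile(p)
        start = time.perf_counter()
        count = sum(1 for _ in rx.finditer(text))  # count all matches
        results[p] = (count, time.perf_counter() - start)
    total = sum(t for _, t in results.values())
    return results, total

results, total = bench(PATTERNS, TEXT)
for p, (count, t) in results.items():
    print("%-30s %8d %8.3f ms" % (p, count, t * 1000))
print("total: %.3f ms" % (total * 1000))
```

Reporting both views would let people track overall re performance while still seeing which workload types regress or improve.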