[Speed] CPython benchmark status, April 2017

Victor Stinner victor.stinner at gmail.com
Thu Apr 6 05:00:39 EDT 2017


Hi,

I'm still working on analyzing past optimizations to guide future
optimizations. I managed to identify multiple significant
optimizations over the last 3 years. At least for me, some were
unexpected, like "Use the test suite for profile data", which made
pidigits 1.16x faster.

Here is a report on my work over the last few weeks.


I managed to run benchmarks on CPython master over the period
April 2014-April 2017: we now have a timeline covering 3 years of
CPython performance!

   https://speed.python.org/timeline/

I started to take notes on significant performance changes (speedup
and slowdown) of this timeline:

   http://pyperformance.readthedocs.io/cpython_results_2017.html

To identify the change which introduced a significant performance
change, I wrote a Python script running a Git bisection: compile
CPython, run benchmark, repeat.

   https://github.com/haypo/misc/blob/master/misc/bisect_cpython_perf.py
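
The bisection idea can be sketched as follows. This is only a minimal
illustration, not the code of the script linked above: the function name,
the `measure` callback and the `threshold` parameter are hypothetical. In
a real run, `measure` would check out the commit, compile CPython and run
the benchmark; here it is left as a callable so that the search logic can
be read on its own.

```python
def bisect_perf_change(commits, measure, threshold):
    """Return the first commit whose timing falls on the "new" side.

    commits:   list ordered from oldest to newest; commits[0] is assumed
               to show the old timing and commits[-1] the new timing.
    measure:   hypothetical callable(commit) -> timing in seconds.
    threshold: timing separating old from new behaviour, for example the
               midpoint between the two known timings.
    """
    old_is_slow = measure(commits[0]) > threshold
    lo, hi = 0, len(commits) - 1
    # Invariant: commits[lo] behaves like the old commit,
    # commits[hi] behaves like the new commit.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        mid_is_slow = measure(commits[mid]) > threshold
        if mid_is_slow == old_is_slow:
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

Each step halves the range of candidate commits, so the number of
compile-and-benchmark rounds grows with the logarithm of the number of
commits between the two endpoints.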

The script uses a configuration file which looks like:
---
[config]
work_dir = ~/prog/bench_python/bisect-pickle
src_dir = ~/prog/bench_python/master

old_commit = 133138a284be1985ebd9ec9014f1306b9a42
new_commit = 10427f44852b6e872034061421a8890902b8f
benchmark = ~/prog/bench_python/performance/performance/benchmarks/bm_pickle.py pickle

benchmark_opts = --inherit-environ=PYTHONPATH -p5 -v
configure_args =
---
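
A file in this format can be read with the standard configparser module.
The sketch below is an assumption about how such a file could be loaded,
not the actual parsing code of the script: the `load_config` helper is
hypothetical, the commit options are left out for brevity, and the
benchmark value is assumed to sit on a single line.

```python
import configparser
import os.path

SAMPLE = """\
[config]
work_dir = ~/prog/bench_python/bisect-pickle
src_dir = ~/prog/bench_python/master
benchmark = ~/prog/bench_python/performance/performance/benchmarks/bm_pickle.py pickle
benchmark_opts = --inherit-environ=PYTHONPATH -p5 -v
configure_args =
"""


def load_config(text):
    """Parse the [config] section into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["config"]
    return {
        # Expand "~" so the paths can be passed to subprocesses.
        "work_dir": os.path.expanduser(section["work_dir"]),
        "src_dir": os.path.expanduser(section["src_dir"]),
        # Split command-like values into argument lists.
        "benchmark": section["benchmark"].split(),
        "benchmark_opts": section["benchmark_opts"].split(),
        # An empty value yields an empty list of configure arguments.
        "configure_args": section["configure_args"].split(),
    }


config = load_config(SAMPLE)
```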

I managed to identify many significant optimizations (TODO: validate
them on the speed-python server). Examples:

* PyMem_Malloc() now uses the fast pymalloc allocator
* Add a C implementation of collections.OrderedDict
* Use the test suite for profile data
* Speedup method calls 1.2x
* Added C implementation of functools.lru_cache()
* Optimized ElementTree.iterparse(); it is now 2x faster

perf, performance, the server configuration, etc. evolve more quickly
than expected, so I created a Git project to keep a copy of the JSON files:

   https://github.com/haypo/performance_results

I already lost the data of my first milestone (November-December 2016), but
you have data from the second (December 2016-February 2017) and third
(March 2017-today) milestones.

I'm now in discussion with the PyPy developers to see how the performance
benchmark suite could be used to measure PyPy performance.

Victor

