performance testing recommendations in devguide
The devguide doesn't have anything on performance testing that I could find. We do have a number of relatively useful resources in this space, though, like pybench and (eventually) speed.python.org. I'd like to add a page to the devguide on performance testing, including an explanation of our performance goals, how to test for them, and what tools are available.

Tools I'm aware of:

* pybench (relatively limited in real-world usefulness)
* timeit module (for quick comparisons; see the sketch at the end of this message)
* benchmarks repo (real-world performance test suite)
* speed.python.org (would omit for now)

Things to test:

* speed
* memory (tools? tests?)

Critically sensitive performance subjects:

* interpreter start-up time
* module import overhead
* attribute lookup overhead (including MRO traversal)
* function call overhead
* instance creation overhead
* dict performance (the underlying namespace type)
* tuple performance (packing/unpacking, integral container type)
* string performance

What would be important to say in the devguide regarding Python performance and testing it? What would you add/subtract from the above? How important is testing memory performance? How do we avoid performance regressions?

Thanks!

-eric
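P.S. To make "quick comparisons" concrete, here is the kind of timeit sketch I have in mind for two of the subjects above (a minimal sketch only; the workloads and iteration counts are illustrative, not proposed devguide benchmarks):

import timeit

setup = """
class C(object):
    attr = 1

def f():
    pass

obj = C()
"""

# attribute lookup overhead (instance -> class, one step of MRO traversal)
print(timeit.timeit('obj.attr', setup=setup, number=10**6))

# function call overhead
print(timeit.timeit('f()', setup=setup, number=10**6))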
----------------------------------------
Date: Wed, 29 May 2013 12:00:44 -0600
From: ericsnowcurrently@gmail.com
To: python-dev@python.org
Subject: [Python-Dev] performance testing recommendations in devguide
> The devguide doesn't have anything on performance testing that I could find. We do have a number of relatively useful resources in this space, though, like pybench and (eventually) speed.python.org. I'd like to add a page to the devguide on performance testing, including an explanation of our performance goals, how to test for them, and what tools are available.
Thanks Eric! I was looking for that kind of place! ;)
> Tools I'm aware of:
>
> * pybench (relatively limited in real-world usefulness)
> * timeit module (for quick comparisons)
> * benchmarks repo (real-world performance test suite)
> * speed.python.org (would omit for now)
Why isn't PyBench considered reliable? [1] And what do you mean by "benchmarks repo"? http://hg.python.org/benchmarks ?
> Things to test:
>
> * speed
> * memory (tools? tests?)
> Critically sensitive performance subjects:
>
> * interpreter start-up time
> * module import overhead
> * attribute lookup overhead (including MRO traversal)
> * function call overhead
> * instance creation overhead
> * dict performance (the underlying namespace type)
> * tuple performance (packing/unpacking, integral container type)
> * string performance
> What would be important to say in the devguide regarding Python performance and testing it?
I've just discovered that insertion at the end of a list is faster than at the start. I'd like to see facts like that not only in the devguide but also in the docs (http://docs.python.org/). I found it in Dan's presentation [2], but I'm not sure it isn't already in the docs somewhere.
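Something like this makes the difference obvious (a rough sketch; absolute numbers will vary by machine):

import timeit

# Appending at the end is amortized O(1); inserting at the front is O(n),
# because every existing element has to be shifted.
print(timeit.timeit('lst.append(0)', setup='lst = []', number=100000))
print(timeit.timeit('lst.insert(0, 0)', setup='lst = []', number=100000))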
> What would you add/subtract from the above?
Threading performance!
> How important is testing memory performance? How do we avoid performance regressions?
>
> Thanks!
Testing and making it faster! ;) Of course, we need a baseline (a benchmark results database) to compare against and check for improvements.
> -eric
[1] "pybench - run the standard Python PyBench benchmark suite. This is considered an unreliable, unrepresentative benchmark; do not base decisions off it. It is included only for completeness." Source: http://hg.python.org/benchmarks/file/dccd52b95a71/README.txt [2] http://stromberg.dnsalias.org/~dstromberg/Intro-to-Python/Intro%20to%20Pytho...
Hi,

On Wed, 29 May 2013 21:59:21 +0300, Carlos Nepomuceno <carlosnepomuceno@outlook.com> wrote:
[1] "pybench - run the standard Python PyBench benchmark suite. This is considered an unreliable, unrepresentative benchmark; do not base decisions off it. It is included only for completeness."
"unrepresentative" is the main criticism against pybench. PyBench is a suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that don't try to simulate any real-world situation. PyBench may also be unreliable, because its tests are so static that they could be optimized away by a clever enough (JIT) compiler. Regards Antoine.
On Wed, May 29, 2013 at 9:19 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> Hi,
>
> On Wed, 29 May 2013 21:59:21 +0300, Carlos Nepomuceno <carlosnepomuceno@outlook.com> wrote:
>
>> [1] "pybench - run the standard Python PyBench benchmark suite. This is considered an unreliable, unrepresentative benchmark; do not base decisions off it. It is included only for completeness."
>
> "unrepresentative" is the main criticism against pybench. PyBench is a suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that don't try to simulate any real-world situation.
>
> PyBench may also be unreliable, because its tests are so static that they could be optimized away by a clever enough (JIT) compiler.
>
> Regards
>
> Antoine.
For what it's worth, PyBench is bad because it's micro-only. A lot of stuff only shows up in larger examples, especially on an optimizing compiler. The proposed list also contains only micro-benchmarks, which will have the exact same problem as pybench.
On 29.05.2013 21:19, Antoine Pitrou wrote:
> Hi,
>
> On Wed, 29 May 2013 21:59:21 +0300, Carlos Nepomuceno <carlosnepomuceno@outlook.com> wrote:
>
>> [1] "pybench - run the standard Python PyBench benchmark suite. This is considered an unreliable, unrepresentative benchmark; do not base decisions off it. It is included only for completeness."
>
> "unrepresentative" is the main criticism against pybench. PyBench is a suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that don't try to simulate any real-world situation.
>
> PyBench may also be unreliable, because its tests are so static that they could be optimized away by a clever enough (JIT) compiler.
Correct. pybench was written to test and verify CPython interpreter optimizations, and also to detect changes which resulted in performance degradation of very basic operations such as attribute lookups, method calls, simple integer math, etc. It was never meant to be representative of anything :-)

At the time, we only had pystone as a "benchmark", and things like high-precision timers were not yet readily available as they are now.

--
Marc-Andre Lemburg
eGenix.com
Hi,

On Wed, 29 May 2013 12:00:44 -0600, Eric Snow <ericsnowcurrently@gmail.com> wrote:
> The devguide doesn't have anything on performance testing that I could find.
See http://bugs.python.org/issue17449
> Tools I'm aware of:
>
> * pybench (relatively limited in real-world usefulness)
> * timeit module (for quick comparisons)
> * benchmarks repo (real-world performance test suite)
> * speed.python.org (would omit for now)
> Things to test:
>
> * speed
> * memory (tools? tests?)
You can use the "-m" option to perf.py.
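For example, something like this (a sketch only; the interpreter paths are placeholders, and the exact option spelling should be checked against the repo's README):

python perf.py -m ../cpython-baseline/python ../cpython-patched/python

That runs the benchmarks against both interpreters while also tracking memory usage, rather than just speed.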
> Critically sensitive performance subjects:
>
> * interpreter start-up time
There are startup tests in the benchmark suite.
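Something like this would run just those (again a sketch; the benchmark names here are my recollection of the suite, so check `python perf.py --help` for the current list):

python perf.py -b normal_startup,startup_nosite ../cpython-baseline/python ../cpython-patched/python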
> * module import overhead
> * attribute lookup overhead (including MRO traversal)
> * function call overhead
> * instance creation overhead
> * dict performance (the underlying namespace type)
> * tuple performance (packing/unpacking, integral container type)
> * string performance
These are all micro-benchmark fodder rather than high-level concerns (e.g. "startup time" is a high-level concern potentially impacted by "module import overhead", but only if the latter is a significant contributor to startup time).
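A quick back-of-the-envelope way to check that for a given module (a rough sketch; `decimal` is just an arbitrary example):

time python -c "pass"
time python -c "import decimal"

The difference between the two roughly approximates the module's cold-import cost on top of bare interpreter startup.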
> How do we avoid performance regressions?
Right now we don't have any automated way to detect them.

Regards

Antoine.
On 29.05.13 21:00, Eric Snow wrote:
> Critically sensitive performance subjects:
>
> * interpreter start-up time
> * module import overhead
> * attribute lookup overhead (including MRO traversal)
> * function call overhead
> * instance creation overhead
> * dict performance (the underlying namespace type)
> * tuple performance (packing/unpacking, integral container type)
> * string performance
* regular expression performance
* I/O performance
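For instance (quick illustrative sketches only; the pattern and buffer size are arbitrary):

import timeit

# regular expression performance
print(timeit.timeit("p.search('spam and eggs')",
                    setup="import re; p = re.compile('eggs')",
                    number=10**6))

# I/O performance (in-memory here, to keep the sketch self-contained)
print(timeit.timeit("f.seek(0); f.write(data); f.seek(0); f.read()",
                    setup="import io; f = io.BytesIO(); data = b'x' * 4096",
                    number=10**5))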
Hi,

On Wed, May 29, 2013 at 9:00 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
> [...]
>
> What would be important to say in the devguide regarding Python performance and testing it?
In the devguide I would only add information that is specific to benchmarking the interpreter. A separate "Benchmarking HOWTO" that covers generic topics could/should be added to docs.python.org.

Best regards,
Ezio Melotti
> What would you add/subtract from the above? How important is testing memory performance? How do we avoid performance regressions?
>
> Thanks!
>
> -eric
participants (7):

- Antoine Pitrou
- Carlos Nepomuceno
- Eric Snow
- Ezio Melotti
- M.-A. Lemburg
- Maciej Fijalkowski
- Serhiy Storchaka