Re: [Speed] performance 0.5.5 and perf 1.3 released
On Mon, 29 May 2017 18:49:37 +0200 Victor Stinner victor.stinner@gmail.com wrote:
- The float benchmark now uses __slots__ on the Point class.
So the benchmark numbers are not comparable with previously generated ones?
- Remove the following microbenchmarks. They have been moved to the pymicrobench project (https://github.com/haypo/pymicrobench) because they are too short, not representative of real applications, and too unstable.
[...]
logging_silent: values are faster than 1 ns on PyPy with 2^27 loops! (and around 0.7 us on CPython)
The performance of silent logging calls is actually important for all applications which have debug() calls in their critical paths. This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.
This is no more a micro-benchmark than is, say, pickling or JSON encoding; and much less so than solving the N-body problem in pure Python without Numpy...
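As a concrete illustration of the cost being discussed (the names and the guard idiom below are my own sketch, not code from the suite): even when DEBUG is disabled, each logger.debug() call still pays for the function call, argument packing and an internal level check, which is why applications with debug() calls in hot paths sometimes guard them explicitly:

```python
import logging

logger = logging.getLogger("app")
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.WARNING)  # DEBUG is disabled: debug() calls are "silent"

def hot_path(x):
    # Silent, but still pays for the call and an internal level check.
    logger.debug("processing %r", x)
    return x * 2

def hot_path_guarded(x):
    # Explicit guard: skips the debug() call entirely when DEBUG is off.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("processing %r", x)
    return x * 2

print(hot_path(21), hot_path_guarded(21))  # 42 42
```

The logging_silent benchmark measures exactly the hot_path case: how fast the no-op debug() call is.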
Update requirements
- Django: 1.11 => 1.11.1
- SQLAlchemy: 1.1.9 => 1.1.10
- certifi: 2017.1.23 => 2017.4.17
- perf: 1.2 => 1.3
- mercurial: 4.1.2 => 4.2
- tornado: 4.4.3 => 4.5.1
Are those requirements for the benchmark runner or for the benchmarks themselves? If the latter, won't updating the requirements make benchmark numbers non-comparable with those generated by previous versions? This is something that the previous benchmarks suite tried to avoid by using pinned versions of 3rd party libraries.
Regards
Antoine.
Also, to expand a bit on what I'm trying to say: like you, I have my own idea of which benchmarks are pointless and unrepresentative, but when maintaining the former benchmarks suite I usually refrained from removing those benchmarks, out of prudence and respect for the people who had written them (and probably had their reasons for finding those benchmarks useful).
Regards
Antoine.
2017-05-29 19:10 GMT+02:00 Antoine Pitrou solipsis@pitrou.net:
Also, to expand a bit on what I'm trying to say: like you, I have my own idea of which benchmarks are pointless and unrepresentative, but when maintaining the former benchmarks suite I usually refrained from removing those benchmarks, out of prudence and respect for the people who had written them (and probably had their reasons for finding those benchmarks useful).
I created a pull request to reintroduce the benchmark: https://github.com/python/performance/pull/25
Victor
2017-05-29 19:00 GMT+02:00 Antoine Pitrou solipsis@pitrou.net:
The performance of silent logging calls is actually important for all applications which have debug() calls in their critical paths.
I wasn't sure about that one. The thing is that the performance of many stdlib functions is important, but my main concern is to get reproducible benchmark results and to use benchmarks that are representative of large applications.
This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.
Note: I would suggest using a preprocessor or something similar to *remove* the calls if performance is critical. That is the solution we chose at a previous company working on embedded devices :-)
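Python has no preprocessor, but a rough analogue of removing the calls outright (my own sketch, not necessarily what Victor's company used) is the if __debug__: idiom: CPython compiles the guarded block away entirely when run with -O:

```python
import logging

logger = logging.getLogger("device")

def process(packet):
    # Under "python -O", __debug__ is False and this whole block is removed
    # at compile time, so the disabled debug() call costs nothing at runtime.
    if __debug__:
        logger.debug("got packet: %r", packet)
    return len(packet)

print(process(b"\x01\x02"))  # 2
```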
This is no more a micro-benchmark than is, say, pickling or JSON encoding; and much less so than solving the N-body problem in pure Python without Numpy...
I'm working step by step. In a perfect world, I would also remove all the benchmarks you listed :-)
I'm in touch with Intel who wants to add new benchmarks more representative of applications, like Django.
Update requirements
- Django: 1.11 => 1.11.1
- SQLAlchemy: 1.1.9 => 1.1.10
- certifi: 2017.1.23 => 2017.4.17
- perf: 1.2 => 1.3
- mercurial: 4.1.2 => 4.2
- tornado: 4.4.3 => 4.5.1
Are those requirements for the benchmark runner or for the benchmarks themselves?
performance creates a virtual environment, installs the dependencies, and runs the benchmarks inside that virtual environment.
If the latter, won't updating the requirements make benchmark numbers non-comparable with those generated by previous versions?
Yes, they are incompatible, and "performance compare" raises an error in such a case (the performance version is stored in the JSON files).
This is something that the previous benchmarks suite tried to above by using pinned versions of 3rd party libraries.
Versions (of direct but also indirect dependencies) are pinned in performance/requirements.txt to get reproducible results.
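For illustration, such a pinned file would look something like this (versions taken from the changelog above; the exact file contents and layout are an assumption):

```text
# performance/requirements.txt (sketch)
Django==1.11.1
SQLAlchemy==1.1.10
certifi==2017.4.17
perf==1.3
mercurial==4.2
tornado==4.5.1
```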
Each performance version is incompatible with the previous one. There is currently no backward compatibility guarantee. Maybe we should provide some kind of backward compatibility after the performance 1.0 release, for example by using semantic versioning.
But I'm not sure that it's really doable, and I don't think that backward compatibility matters so much: it's very easy to get an old version of performance if you want to compare a new version with old results.
It would be nice to convince PyPy developers to run performance instead of their old benchmark suite. But it seems like there is a technical issue with the number of warmups.
Victor
participants (2)
- Antoine Pitrou
- Victor Stinner