[Speed] Latest enhancements of perf 0.8.1 and performance 0.3.1
Victor Stinner
victor.stinner at gmail.com
Wed Nov 2 11:53:45 EDT 2016
2016-11-02 15:20 GMT+01:00 Armin Rigo <armin.rigo at gmail.com>:
> Is that really the kind of examples you want to put forward?
I am not a big fan of timeit, but we must sometimes use it for
micro-optimizations in CPython, to check whether an optimization
really makes CPython faster or not. I am only trying to enhance
timeit. Understanding the results requires understanding how the
statements are executed.
> This example means "compare CPython where the data cache gets extra pressure from reading a strangely large code object,
I wrote the --duplicate option to benchmark "x+y" with "x=1; y=2". I
know, it's an extreme and stupid benchmark, but many people spend a
lot of time trying to optimize this in Python/ceval.c:
https://bugs.python.org/issue21955
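
The idea behind --duplicate can be sketched with the stdlib timeit
module (a simplified illustration of the technique, not perf's actual
implementation; DUPLICATE is just a local name here):

    import timeit

    # Repeat the measured statement inside the timed code so that the
    # overhead of the outer for loop is amortized over many copies.
    DUPLICATE = 100
    stmt = "x+y\n" * DUPLICATE
    timer = timeit.Timer(stmt=stmt, setup="x=1; y=2")
    best = min(timer.repeat(repeat=5, number=10000))
    print("%.1f ns per statement" % (best * 1e9 / (10000 * DUPLICATE)))

This is exactly the trade-off discussed below: duplicating the body
removes loop overhead, but it also produces a much larger code object.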
I tried multiple values of --duplicate when benchmarking x+y, and x+y
seems "faster" when using a larger --duplicate value. I understand
that the cost of the outer loop is higher than the cost of "reading a
strangely large code object".
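
For example, with perf's timeit command (same setup, two duplication
factors; run both and compare the reported timings on your machine):

    python3 -m perf timeit -s "x=1; y=2" --duplicate=1 "x+y"
    python3 -m perf timeit -s "x=1; y=2" --duplicate=100 "x+y"

With --duplicate=100, the timed loop body contains 100 copies of
"x+y", so each copy pays only 1/100th of the loop overhead.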
I provide a tool and I try to document how to use it, but it's hard
to prevent users from using it for stupid things.
For example, recently I spent time trying to optimize bytes%args in
Python 3 after reading an article, but then I realized that the Python
2 benchmark was meaningless:
https://atleastfornow.net/blog/not-all-bytes/
    def bytes_plus():
        b"hi" + b" " + b"there"

    ... benchmark(bytes_plus) ...
bytes_plus() is optimized by the _compiler_, so the benchmark measures
the cost of LOAD_CONST :-)
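
You can see the compiler's constant folding directly with dis:

    import dis

    def bytes_plus():
        b"hi" + b" " + b"there"

    # The peephole optimizer folds the additions at compile time, so
    # the function body disassembles to a single LOAD_CONST of
    # b'hi there' followed by POP_TOP: there is no concatenation left
    # to measure at runtime.
    dis.dis(bytes_plus)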
The issue was not the tool but the usage of the tool :-D
Victor