[Python-Dev] Micro-optimizations by adding special-case bytecodes?
Victor Stinner
victor.stinner at gmail.com
Wed Jun 28 10:20:06 EDT 2017
(Victor wears his benchmark hat.)
2017-06-28 15:50 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:
> ../test_none.sh
> x = 1234; x is None -- 20000000 loops, best of 5: 19.8 nsec per loop
> x = 1234; x is not None -- 10000000 loops, best of 5: 20 nsec per loop
> x = None; x is None -- 10000000 loops, best of 5: 20.7 nsec per loop
> x = None; x is not None -- 10000000 loops, best of 5: 20.8 nsec per loop
> avg 20.3 nsec per loop
Hum, please use perf timeit instead of timeit; it's more reliable. See also:
"How to get reproductible benchmark results"
http://perf.readthedocs.io/en/latest/run_benchmark.html#how-to-get-reproductible-benchmark-results
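To illustrate why a single "best of 5" figure can mislead, here is a rough sketch using only the standard library. It repeats the measurement and reports a mean and standard deviation instead of the minimum. It is not how perf is implemented; the helper name measure() and the run and loop counts are assumptions chosen for illustration, and perf does considerably more to stabilize results.

```python
import statistics
import timeit


def measure(stmt, setup="pass", runs=5, loops=100_000):
    """Return (mean, stdev) of the per-loop time in seconds over several runs.

    Hypothetical helper: it only shows the idea of reporting the spread
    of the measurements rather than the single fastest run.
    """
    totals = timeit.repeat(stmt, setup=setup, repeat=runs, number=loops)
    per_loop = [total / loops for total in totals]
    return statistics.mean(per_loop), statistics.stdev(per_loop)


if __name__ == "__main__":
    # The four cases from the quoted test_none.sh output.
    cases = [
        ("x = 1234", "x is None"),
        ("x = 1234", "x is not None"),
        ("x = None", "x is None"),
        ("x = None", "x is not None"),
    ]
    for setup, stmt in cases:
        mean, stdev = measure(stmt, setup)
        print(f"{setup}; {stmt}: {mean * 1e9:.1f} ns +- {stdev * 1e9:.1f} ns")
```

If the standard deviation is of the same order as the difference between two cases, the difference is probably noise.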
> [2] Benchmarks comparing master and is_none_bytecode patch (each
> compiled with --enable-optimizations) using python/performance:
>
> +-------------------------+------------+------------------------------+
> | Benchmark | master_opt | is_none_bytecode_opt |
> +=========================+============+==============================+
> | 2to3 | 617 ms | 541 ms: 1.14x faster (-12%) |
> +-------------------------+------------+------------------------------+
> | chameleon | 19.9 ms | 18.6 ms: 1.07x faster (-7%) |
> +-------------------------+------------+------------------------------+
> | crypto_pyaes | 208 ms | 201 ms: 1.04x faster (-3%) |
> +-------------------------+------------+------------------------------+
> | deltablue | 13.8 ms | 12.9 ms: 1.07x faster (-7%) |
FYI you can add the -G option to perf compare_to to sort results by
speed (group faster & slower). It gives a more readable table. I also
like using --min-speed=5 to ignore changes smaller than 5%: it reduces
the noise and makes the table more readable.
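As a sketch of what grouping and a 5% threshold do to the four rows quoted above, the following applies both by hand. The timings are copied from the quoted table; the function names and the exact definition of "change" (relative difference of the patched timing against master) are assumptions for illustration, not perf's actual code.

```python
# Timings in milliseconds from the quoted table:
# name -> (master_opt, is_none_bytecode_opt)
RESULTS = {
    "2to3": (617.0, 541.0),
    "chameleon": (19.9, 18.6),
    "crypto_pyaes": (208.0, 201.0),
    "deltablue": (13.8, 12.9),
}


def percent_change(ref, changed):
    """Relative change in percent; negative means the patched build is faster."""
    return (changed - ref) / ref * 100.0


def group_results(results, min_speed=5.0):
    """Split benchmarks into faster, slower, and hidden (below the threshold)."""
    faster, slower, hidden = [], [], []
    for name, (ref, changed) in results.items():
        pct = percent_change(ref, changed)
        if abs(pct) < min_speed:
            hidden.append(name)            # change too small to report
        elif pct < 0:
            faster.append((name, pct))
        else:
            slower.append((name, pct))
    faster.sort(key=lambda item: item[1])                # biggest speedup first
    slower.sort(key=lambda item: item[1], reverse=True)  # biggest slowdown first
    return faster, slower, hidden


if __name__ == "__main__":
    faster, slower, hidden = group_results(RESULTS)
    for name, pct in faster:
        print(f"faster: {name} ({pct:+.1f}%)")
    for name, pct in slower:
        print(f"slower: {name} ({pct:+.1f}%)")
    print("not significant:", ", ".join(hidden))
```

Under these assumptions, crypto_pyaes falls below the 5% threshold and drops out of the table, while the other three rows remain in the "faster" group.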
Victor