[Python-Dev] Micro-optimizations by adding special-case bytecodes?

Victor Stinner victor.stinner at gmail.com
Wed Jun 28 10:20:06 EDT 2017


(Victor wears his benchmark hat.)

2017-06-28 15:50 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:
> ../test_none.sh
> x = 1234; x is None -- 20000000 loops, best of 5: 19.8 nsec per loop
> x = 1234; x is not None -- 10000000 loops, best of 5: 20 nsec per loop
> x = None; x is None -- 10000000 loops, best of 5: 20.7 nsec per loop
> x = None; x is not None -- 10000000 loops, best of 5: 20.8 nsec per loop
> avg 20.3 nsec per loop

Hum, please use perf timeit instead of timeit; it's more reliable. See also:

"How to get reproductible benchmark results"
http://perf.readthedocs.io/en/latest/run_benchmark.html#how-to-get-reproductible-benchmark-results
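
To make the point concrete, here is a rough stdlib-only sketch that reports a
mean and standard deviation over several runs instead of a single "best of 5"
minimum. It is not how perf timeit works internally (perf also runs the
benchmark in several worker processes and does warmups), and time_stmt is a
made-up helper name used only for this illustration.

```python
import statistics
import timeit


def time_stmt(stmt, setup, repeat=5, number=100_000):
    """Return (mean, stdev) in nanoseconds per loop over `repeat` runs.

    Hypothetical helper: a rough stand-in for what a benchmark tool
    reports, not perf's actual implementation.
    """
    # timeit.repeat returns one total time (in seconds) per run.
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_loop_ns = [total / number * 1e9 for total in totals]
    return statistics.mean(per_loop_ns), statistics.stdev(per_loop_ns)


if __name__ == "__main__":
    # Same four cases as in the quoted test_none.sh output.
    for setup in ("x = 1234", "x = None"):
        for stmt in ("x is None", "x is not None"):
            mean, stdev = time_stmt(stmt, setup)
            print(f"{setup}; {stmt} -- {mean:.1f} ns +- {stdev:.1f} ns")
```

If the standard deviation is of the same order as the difference you are
trying to measure, the numbers cannot tell the two cases apart.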

> [2] Benchmarks comparing master and is_none_bytecode patch (each
> compiled with --enable-optimizations) using python/performance:
>
> +-------------------------+------------+------------------------------+
> | Benchmark               | master_opt | is_none_bytecode_opt         |
> +=========================+============+==============================+
> | 2to3                    | 617 ms     | 541 ms: 1.14x faster (-12%)  |
> +-------------------------+------------+------------------------------+
> | chameleon               | 19.9 ms    | 18.6 ms: 1.07x faster (-7%)  |
> +-------------------------+------------+------------------------------+
> | crypto_pyaes            | 208 ms     | 201 ms: 1.04x faster (-3%)   |
> +-------------------------+------------+------------------------------+
> | deltablue               | 13.8 ms    | 12.9 ms: 1.07x faster (-7%)  |

FYI you can add the -G option to perf compare_to to sort results by
speed (group faster & slower). It gives a more readable table. I also
like using --min-speed=5 to ignore changes smaller than 5%: it reduces
the noise and makes the table more readable.

Victor

