[Cython] Performance comparison with CPython for attribute/item access
Stefan Behnel
stefan_ml at behnel.de
Sat Feb 16 16:39:44 EST 2019
Hi,
Raymond Hettinger wrote a micro benchmark script for comparing the
performance of basic attribute and item access patterns across Python
versions and build configurations, so I tested the initially committed
version with Cython.
https://github.com/python/cpython/blob/master/Tools/scripts/var_access_benchmark.py
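For readers who have not looked at the script, the measurement style can be sketched roughly as follows. This is a minimal illustration and not the script itself: the helper name `ns_per_op` and the trial counts are assumptions made for this example, and only `timeit.repeat` comes from the standard library.

```python
from timeit import repeat


def read_local(trials=1000):
    v_local = 1
    for _ in range(trials):
        # Several reads per iteration reduce the share of loop overhead.
        v_local; v_local; v_local; v_local; v_local


def loop_overhead(trials=1000):
    # Empty loop, used to estimate the cost of the timing loop alone.
    for _ in range(trials):
        pass


def ns_per_op(func, ops_per_trial, trials=1000, repeats=5):
    # Hypothetical helper: best-of-N wall time of one call to func,
    # converted to nanoseconds per single operation.
    best = min(repeat(lambda: func(trials), number=1, repeat=repeats))
    return best / (trials * ops_per_trial) * 1e9


if __name__ == "__main__":
    print("%6.1f ns read_local" % ns_per_op(read_local, 5))
    print("%6.1f ns loop_overhead" % ns_per_op(loop_overhead, 1))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes. The absolute numbers depend on the machine and build.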
Results are below, comparing Cython (master) to CPython 3.8 (master), and
also to Cython with all C compile-time optimisations disabled via the
"CYTHON_*" macros (no-opt).
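As a rough sketch of how such macros might be passed to the C compiler from a build script: the helper names and the module name "bench" are hypothetical, and only the two macros named in this mail are listed, not the full set used for the no-opt run.

```python
# Macros mentioned in this mail; the complete no-opt list is not given here.
DISABLED = ["CYTHON_USE_DICT_VERSIONS", "CYTHON_USE_PYLIST_INTERNALS"]


def macro_flags(names):
    # Command line form, e.g. for the CFLAGS environment variable.
    return ["-D%s=0" % name for name in names]


def define_macros(names):
    # (name, value) pairs in the form accepted by the define_macros
    # argument of a setuptools Extension.
    return [(name, "0") for name in names]


if __name__ == "__main__":
    print(" ".join(macro_flags(DISABLED)))
    # Possible use in a setup.py (hypothetical module "bench"):
    #   Extension("bench", ["bench.pyx"],
    #             define_macros=define_macros(DISABLED))
```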
Some things to note:
- Most operations in Cython are around 30-50% faster.
- C-level things like local variables are too fast to measure in Cython;
they come out at about the timing loop overhead.
- Setting class variables is very slow in both CPython and Cython, probably
for the same (as yet unknown) reason, possibly the method cache.
https://bugs.python.org/issue36012
- The dict version check for Python globals is worth it. Disabling it in
Cython with "-DCYTHON_USE_DICT_VERSIONS=0" slows down the lookup by 5x
(2.2ns -> 10ns).
- Disabling PyList optimisations with "-DCYTHON_USE_PYLIST_INTERNALS=0"
slows down the "list_append_pop" benchmark by 5x (21ns -> 102ns).
- The list append/pop optimisations seem to slow down non-lists
disproportionately, for deques by about 3x compared to CPython. That seems
worth improving.
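The idea behind the dict version check can be modelled in pure Python, as a sketch only. In CPython the version tag is a field of the C-level dict (PEP 509), not a Python attribute. The classes `VersionedDict` and `CachedLookup` below are hypothetical stand-ins for illustration and do not mirror the generated C code.

```python
class VersionedDict(dict):
    """dict that bumps a counter on item assignment and deletion."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class CachedLookup:
    """Caches one name lookup until the namespace version changes."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_version = -1  # never matches a real version
        self.cached_value = None
        self.misses = 0

    def get(self):
        if self.cached_version != self.namespace.version:
            # Slow path: real dict lookup, then remember the version.
            self.misses += 1
            self.cached_value = self.namespace[self.name]
            self.cached_version = self.namespace.version
        # Fast path: one integer comparison, no hashing.
        return self.cached_value


if __name__ == "__main__":
    g = VersionedDict(x=1)
    lookup = CachedLookup(g, "x")
    lookup.get()
    lookup.get()
    g["x"] = 2
    print(lookup.get(), lookup.misses)
```

As long as the globals dict is not modified, repeated reads only compare a version number, which is presumably why the lookup is so much cheaper with the check enabled.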
Stefan
CPython 3.8 (63fa1cfece)
========================
Variable and attribute read access:
5.4 ns read_local
6.0 ns read_nonlocal
15.7 ns read_global
23.5 ns read_builtin
23.1 ns read_classvar_from_class
20.4 ns read_classvar_from_instance
31.5 ns read_instancevar
25.4 ns read_instancevar_slots
23.8 ns read_namedtuple
34.5 ns read_boundmethod
Variable and attribute write access:
6.2 ns write_local
6.7 ns write_nonlocal
19.1 ns write_global
113.2 ns write_classvar
44.6 ns write_instancevar
33.0 ns write_instancevar_slots
Data structure read access:
23.5 ns read_list
24.0 ns read_deque
25.6 ns read_dict
Data structure write access:
26.0 ns write_list
27.1 ns write_deque
32.0 ns write_dict
Stack (or queue) operations:
61.6 ns list_append_pop
53.9 ns deque_append_pop
Timing loop overhead:
0.4 ns loop_overhead
Cython 3.0a0 (f1eaa9c1f)
========================
Variable and attribute read access:
0.2 ns read_local
0.2 ns read_nonlocal
2.2 ns read_global
0.2 ns read_builtin
13.8 ns read_classvar_from_class
11.1 ns read_classvar_from_instance
21.3 ns read_instancevar
15.5 ns read_instancevar_slots
13.6 ns read_namedtuple
21.5 ns read_boundmethod
Variable and attribute write access:
0.2 ns write_local
0.1 ns write_nonlocal
13.0 ns write_global
92.9 ns write_classvar
29.6 ns write_instancevar
16.1 ns write_instancevar_slots
Data structure read access:
4.0 ns read_list
4.3 ns read_deque
16.5 ns read_dict
Data structure write access:
4.3 ns write_list
6.4 ns write_deque
21.4 ns write_dict
Stack (or queue) operations:
20.7 ns list_append_pop
155.4 ns deque_append_pop
Timing loop overhead:
0.1 ns loop_overhead
Cython 3.0a0 (no-opt)
=====================
Variable and attribute read access:
0.2 ns read_local
0.2 ns read_nonlocal
15.6 ns read_global
0.2 ns read_builtin
16.1 ns read_classvar_from_class
12.1 ns read_classvar_from_instance
21.9 ns read_instancevar
16.3 ns read_instancevar_slots
14.5 ns read_namedtuple
23.8 ns read_boundmethod
Variable and attribute write access:
0.2 ns write_local
0.2 ns write_nonlocal
14.2 ns write_global
99.4 ns write_classvar
35.0 ns write_instancevar
22.4 ns write_instancevar_slots
Data structure read access:
5.7 ns read_list
6.1 ns read_deque
21.1 ns read_dict
Data structure write access:
8.4 ns write_list
8.4 ns write_deque
24.0 ns write_dict
Stack (or queue) operations:
66.4 ns list_append_pop
75.1 ns deque_append_pop
Timing loop overhead:
0.2 ns loop_overhead
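To put a few of the numbers above side by side, here is a small sketch that computes ratios from the CPython 3.8 and Cython 3.0a0 tables. The values are copied from the tables; the selection of benchmarks and the helper name `speedup` are just choices made for this example.

```python
# ns per operation, copied from the tables above.
CPYTHON = {
    "read_global": 15.7,
    "read_instancevar": 31.5,
    "list_append_pop": 61.6,
    "deque_append_pop": 53.9,
}
CYTHON = {
    "read_global": 2.2,
    "read_instancevar": 21.3,
    "list_append_pop": 20.7,
    "deque_append_pop": 155.4,
}


def speedup(name):
    # Values above 1 mean Cython is faster, below 1 mean it is slower.
    return CPYTHON[name] / CYTHON[name]


if __name__ == "__main__":
    for name in CPYTHON:
        print("%-18s %5.2fx" % (name, speedup(name)))
```

A ratio below 1 for deque_append_pop corresponds to the roughly 3x slowdown for deques noted in the list of observations.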