I like tagged pointers because I expect that we can start experimenting with them as soon as all C extensions only use an opaque API. Moreover, I expect better performance (yeah, I'm optimistic, it helps ;-))
My notes: https://pythoncapi.readthedocs.io/optimization_ideas.html#tagged-pointers-do...
On Tue, 5 Mar 2019 at 08:06, Stefan Behnel <python_capi@behnel.de> wrote:
> I could imagine (on 64 bits) to reserve, say, a 'signed' 24 bits for a refcount and the rest for an index into an array of object/vtable pointer structs. All negative refcounts would have special meanings, such as
I'm not convinced that it's efficient. Compared to the current CPython implementation (PyObject* pointing to PyObject), it adds yet another indirection.
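To make the extra indirection concrete, here is a rough sketch of one possible reading of the quoted layout: a signed 24-bit refcount in the high bits and a 40-bit table index in the low bits. The names (IndexHandle, TableEntry, index_handle_*) and the exact bit positions are my assumptions for illustration, nothing more.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit handle: [ 24-bit signed refcount | 40-bit index ]. */
typedef uint64_t IndexHandle;

#define INDEX_BITS 40
#define INDEX_MASK ((UINT64_C(1) << INDEX_BITS) - 1)
#define REFCNT_MAX ((INT32_C(1) << 23) - 1)
#define REFCNT_MIN (-REFCNT_MAX - 1)

/* Hypothetical table entry: object pointer plus vtable pointer. */
typedef struct {
    void *object;
    const void *vtable;
} TableEntry;

static IndexHandle index_handle_pack(int32_t refcnt, uint64_t index)
{
    assert(refcnt >= REFCNT_MIN && refcnt <= REFCNT_MAX);
    assert(index <= INDEX_MASK);
    /* The unsigned shift drops the top 8 bits, keeping 24 refcount bits. */
    return ((uint64_t)(uint32_t)refcnt << INDEX_BITS) | index;
}

static uint64_t index_handle_index(IndexHandle h)
{
    return h & INDEX_MASK;
}

static int32_t index_handle_refcnt(IndexHandle h)
{
    uint32_t raw = (uint32_t)(h >> INDEX_BITS);   /* 24 significant bits */
    /* Portable sign extension from 24 bits. */
    return (int32_t)(raw & 0x7FFFFF) - (int32_t)(raw & 0x800000);
}

/* First load: the table entry. Any access to the object's data is then
   a second load through the returned pointer. */
static void *index_handle_deref(const TableEntry *table, IndexHandle h)
{
    return table[index_handle_index(h)].object;
}
```

Reading a field of the object costs one load for table[index] and another through entry.object, where a plain PyObject* costs a single load.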
Instead of using 2 indirections for all data, I would prefer to put the content directly into the PyHandle/opaque "PyObject*" to get *zero* indirection. I expect that it's more efficient for CPU caches.
We can imagine storing small ints, latin1 strings, maybe some singletons like None, and maybe also some floats, directly inside a 64-bit PyHandle integer. Python memory allocators align on 8 bytes, so the 3 low bits of a pointer are free to store data. One bit is enough to distinguish tagged pointers from regular PyObject*.
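A minimal sketch of the encoding, assuming the lowest bit is the tag and only small integers are stored inline. The PyHandle typedef and the handle_* helpers are hypothetical names of mine, not an existing CPython API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handle: one machine word holding either an 8-byte-aligned
   pointer (low bit 0) or an inline small integer (low bit 1). */
typedef uintptr_t PyHandle;

#define HANDLE_TAG_BITS 1
#define HANDLE_TAG_INT  ((uintptr_t)1)
#define HANDLE_INT_MAX  (INTPTR_MAX >> HANDLE_TAG_BITS)
#define HANDLE_INT_MIN  (-HANDLE_INT_MAX - 1)

static int handle_is_int(PyHandle h)
{
    return (h & HANDLE_TAG_INT) != 0;
}

static int handle_int_fits(intptr_t v)
{
    return v >= HANDLE_INT_MIN && v <= HANDLE_INT_MAX;
}

static PyHandle handle_from_int(intptr_t v)
{
    assert(handle_int_fits(v));
    /* Shift as unsigned so that negative values are well defined. */
    return ((uintptr_t)v << HANDLE_TAG_BITS) | HANDLE_TAG_INT;
}

static intptr_t handle_to_int(PyHandle h)
{
    assert(handle_is_int(h));
    /* Assumes an arithmetic right shift for negative values; this is
       implementation-defined in C, but GCC, Clang and MSVC do it. */
    return (intptr_t)h >> HANDLE_TAG_BITS;
}

static PyHandle handle_from_ptr(void *p)
{
    assert(((uintptr_t)p & 7) == 0);   /* allocator aligns on 8 bytes */
    return (uintptr_t)p;
}

static void *handle_to_ptr(PyHandle h)
{
    assert(!handle_is_int(h));
    return (void *)h;
}

/* Fast path for x + y: returns 1 and stores the sum if both operands are
   inline ints and the sum still fits inline, else returns 0 so that the
   caller can fall back to the regular object path. */
static int handle_try_add(PyHandle a, PyHandle b, PyHandle *result)
{
    if (!handle_is_int(a) || !handle_is_int(b))
        return 0;
    /* Both values have one bit of headroom, so the sum cannot overflow. */
    intptr_t sum = handle_to_int(a) + handle_to_int(b);
    if (!handle_int_fits(sum))
        return 0;
    *result = handle_from_int(sum);
    return 1;
}
```

With this layout, adding two small ints touches no memory besides the two handle words themselves.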
> Things like these, can't say which are reasonable and/or fast enough.
Neil already implemented the idea and ran some benchmarks :-) https://mail.python.org/archives/list/capi-sig@python.org/thread/EGAY55ZWMF2...
"""
The result looks promising:

    ./python -m perf timeit --name='x+y' -s 'x=10000; y=2' 'x+y' --dup 1000 -v -o int.json
    ./python -m perf timeit --name='x+y' -s 'x=fixedint(10000); y=fixedint(2)' 'x+y' --dup 1000 -v -o fixedint.json
    ./python -m perf compare_to int.json fixedint.json
    Mean +- std dev: [int] 32.3 ns +- 1.0 ns -> [fixedint] 10.8 ns +- 0.3 ns: 3.00x faster (-67%)
"""
> maybe also things like flat tuples, where the items are directly stored consecutively in the object array following the tuple index.
If you put values directly inside PyHandle/opaque PyObject*, the "PyObject** ob_item" array of a PyTuple suddenly becomes very efficient in terms of memory footprint ;-) (same for list, dict, etc.)
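A back-of-the-envelope sketch of the footprint difference. BoxedInt is a hypothetical three-word object (refcount, type pointer, payload), not CPython's real int layout, and the count ignores allocator overhead and the sharing of cached small ints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical boxed integer: refcount, type pointer, payload. */
typedef struct {
    intptr_t refcnt;
    const void *type;
    intptr_t value;
} BoxedInt;

/* Bytes for n items when every slot points to its own boxed object. */
static size_t boxed_items_bytes(size_t n)
{
    assert(n <= SIZE_MAX / (sizeof(BoxedInt *) + sizeof(BoxedInt)));
    return n * (sizeof(BoxedInt *) + sizeof(BoxedInt));
}

/* Bytes for n items when every slot holds the value itself in a
   tagged word. */
static size_t inline_items_bytes(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(uintptr_t));
    return n * sizeof(uintptr_t);
}
```

Under these assumptions, each item costs one word inline against four words boxed (one for the slot, three for the object).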