[New-bugs-announce] [issue29358] Add tp_fastnew and tp_fastinit to PyTypeObject, 15-20% faster object instantiation

STINNER Victor report at bugs.python.org
Tue Jan 24 03:25:29 EST 2017


New submission from STINNER Victor:

After #29259 "Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects", the last two slots that still use the (args: tuple, kwargs: dict) calling convention are tp_new and tp_init, the two major slots used to instantiate objects.

I implemented tp_fastnew/tp_fastinit on top of the pull request for issue #29259 (tp_fastcall). The implementation is a WIP: it is just complete enough to start benchmarking the code.

Example benchmarks on the two types currently optimized in my WIP fast_init branch, list and _asyncio.Future:
---
haypo@smithers$ ./python -m perf timeit -s 'List=list' 'List()' --duplicate=100 --compare-to=../default-ref/python
Median +- std dev: [ref] 81.9 ns +- 0.2 ns -> [fast_init] 69.3 ns +- 0.4 ns: 1.18x faster (-15%)

haypo@smithers$ ./python -m perf timeit -s 'List=list' 'List((1,2,3))' --duplicate=100 --compare-to=../default-ref/python
Median +- std dev: [ref] 137 ns +- 6 ns -> [fast_init] 107 ns +- 0 ns: 1.28x faster (-22%)

haypo@smithers$ ./python -m perf timeit -s 'import _asyncio, asyncio; Future=_asyncio.Future; loop=asyncio.get_event_loop()' 'Future(loop=loop)' --compare-to=../default-ref/python
Median +- std dev: [ref] 411 ns +- 20 ns -> [fast_init] 355 ns +- 18 ns: 1.16x faster (-14%)
---
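
The ratios printed by perf can be checked by hand from the medians. A minimal sketch in Python, using only the numbers quoted above (the helper name `speedup` is made up for this illustration and is not part of perf):

```python
def speedup(ref_ns, new_ns):
    """Return (ratio, percent_change) for a reference and a new timing."""
    ratio = ref_ns / new_ns
    percent = (new_ns - ref_ns) / ref_ns * 100.0
    return ratio, percent


# Medians in nanoseconds, copied from the benchmark output above.
results = {
    "List()": (81.9, 69.3),
    "List((1,2,3))": (137.0, 107.0),
    "Future(loop=loop)": (411.0, 355.0),
}

for name, (ref, new) in results.items():
    ratio, percent = speedup(ref, new)
    print("%s: %.2fx faster (%+.0f%%)" % (name, ratio, percent))
```

This reproduces the 1.18x (-15%), 1.28x (-22%) and 1.16x (-14%) figures reported by perf.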

With tp_fastnew + tp_fastinit, object creation is between 1.16x and 1.28x faster in these benchmarks. The question now is whether it is worth it.

Warning: The code is not fully optimized and is likely to have subtle bugs. The pull request is not ready for review, but you may take a look if you would like to get an idea of the required changes. The trickiest changes are in typeobject.c, to support backward compatibility (call tp_new if tp_fastnew is not set) and the stable API (support the Python 3.6 PyTypeObject, which does not have the two slots).
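
To give a rough idea of the backward compatibility fallback, here is a toy model written in Python. It is only a sketch: the real logic is C code in typeobject.c and may differ. The names `ToyType` and `call_new` are hypothetical, and the (stack, nargs, kwnames) signature used for tp_fastnew is an assumption modeled on the FASTCALL convention, not taken from the pull request.

```python
# Toy model of "call tp_new if tp_fastnew is not set".
# ToyType and call_new are hypothetical names for this illustration only.

class ToyType:
    """Stand-in for a PyTypeObject with an old-style and a fast slot."""

    def __init__(self, tp_new=None, tp_fastnew=None):
        self.tp_new = tp_new          # called as tp_new(type, args, kwargs)
        self.tp_fastnew = tp_fastnew  # assumed: tp_fastnew(type, stack, nargs, kwnames)


def call_new(tp, stack, nargs, kwnames):
    """Create an object from FASTCALL-style arguments.

    stack holds the positional arguments followed by the keyword values;
    kwnames is a tuple of the keyword names, in the same order.
    """
    if tp.tp_fastnew is not None:
        # Fast path: no temporary tuple or dict is built.
        return tp.tp_fastnew(tp, stack, nargs, kwnames)
    # Backward-compatible path: pack into (tuple, dict) for tp_new.
    args = tuple(stack[:nargs])
    kwargs = dict(zip(kwnames, stack[nargs:]))
    return tp.tp_new(tp, args, kwargs)


old_style = ToyType(tp_new=lambda tp, args, kwargs: ("new", args, kwargs))
fast = ToyType(tp_fastnew=lambda tp, stack, nargs, kwnames: ("fastnew", nargs, kwnames))

print(call_new(old_style, [1, 2, 3], 2, ("loop",)))
print(call_new(fast, [1, 2, 3], 2, ("loop",)))
```

The point of the sketch is that only types without the fast slot pay for the temporary tuple and dict, so existing types keep working unchanged.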

Note: the tp_fastnew and tp_fastinit slots are the expected final step of my large FASTCALL optimization project.

----------
components: Interpreter Core
messages: 286151
nosy: haypo, inada.naoki, rhettinger, serhiy.storchaka, yselivanov
priority: normal
pull_requests: 23
severity: normal
status: open
title: Add tp_fastnew and tp_fastinit to PyTypeObject, 15-20% faster object instantiation
type: performance
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue29358>
_______________________________________

