Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
On 2018-06-18 03:34, INADA Naoki wrote:
Victor had tried to add a `tp_fastcall` slot, but he suspended his effort because its benefit was not enough for its complexity. https://bugs.python.org/issue29259
I had a quick look at that patch and it's really orthogonal to what I'm proposing. I'm proposing to use the slot *instead* of existing fastcall optimizations. Victor's patch was about adding fastcall support to classes that didn't support it before.
Jeroen.
I didn't mean to compare tp_fastcall and your PEP.
I just meant we need to compare complexity and benefit (performance),
and we need a reference implementation for comparing.
--
INADA Naoki
Hi,
I tried two options to add support for FASTCALL on calling an object:
add a flag in tp_flags and reuse tp_call, or add a new tp_fastcall
slot. I failed to implement either of these two options correctly.
There are multiple issues with tp_fastcall:
* ABI issue: it's possible to load a C extension using the old ABI,
without tp_fastcall: it's not possible to write type->tp_fastcall on
such a type. This limitation causes various issues.
* If tp_call is modified, tp_fastcall may be outdated. Same if
tp_fastcall is modified. What happens on "del obj.__call__" or "del
type.__call__"?
* Many public functions of the C API still require a tuple and a dict
to pass positional and keyword arguments, so a compatibility layer is
required for types that only want to implement FASTCALL. Related issue:
what if something calls tp_call with (args: tuple, kwargs: dict)?
Crash, or call a compatibility layer converting the arguments to the
FASTCALL calling convention?
Reusing tp_call for FASTCALL causes similar or worse issues.
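
The compatibility layer from the third point can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `fastcall_impl` and `tp_call_compat` are hypothetical names, and the real conventions are C-level (an array of object pointers, a count of positional arguments, and a tuple of keyword names). It shows a tuple/dict call being flattened into that layout before it reaches a FASTCALL-only implementation.

```python
# Toy model in pure Python; the real calling conventions live in CPython's C code.
# FASTCALL layout modelled here: a flat sequence of values, the number of
# positional values, and a tuple of keyword names. Keyword values follow the
# positional values, in the same order as the names.

def fastcall_impl(stack, nargs, kwnames):
    """Hypothetical FASTCALL-only implementation: reports what it received."""
    positional = list(stack[:nargs])
    keywords = dict(zip(kwnames, stack[nargs:]))
    return positional, keywords


def tp_call_compat(args, kwargs=None):
    """Hypothetical compatibility layer: (tuple, dict) -> FASTCALL layout."""
    kwargs = kwargs or {}
    kwnames = tuple(kwargs)  # keyword names, in dict iteration order
    stack = list(args) + [kwargs[name] for name in kwnames]
    return fastcall_impl(stack, len(args), kwnames)


result = tp_call_compat((1, 2), {"sep": "-", "end": ""})
print(result)
```

The cost of such a layer is one extra sequence and one extra tuple per call, which is part of the complexity/benefit trade-off discussed here.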
I abandoned my idea for two reasons:
1) in the worst case, my changes caused a crash, which is not acceptable
for an optimization. My first intent was to remove the
property_descr_get() hack because its implementation is fragile and
caused crashes.
2) we implemented a lot of other optimizations which made calls faster
without having to touch tp_call or tp_fastcall. The benefit of
FASTCALL for tp_call/tp_fastcall was not really significant.
Victor
Victor Stinner schrieb am 18.06.2018 um 15:09:
I tried two options to add support for FASTCALL on calling an object: add a flag in tp_flags and reuse tp_call, or add a new tp_fastcall slot. I failed to implement either of these two options correctly.
There are multiple issues with tp_fastcall:
* ABI issue: it's possible to load a C extension using the old ABI, without tp_fastcall: it's not possible to write type->tp_fastcall on such a type. This limitation causes various issues.
Not a problem if we rededicate the unused (since Py3.0) "tp_print" slot for it. Even better, since the slot exists already in Py3.0+, tools like Cython, NumPy (with its ufuncs etc.) or generic function dispatchers, basically anything that benefits from fast calls, can enable support for it in all CPython 3.x versions and benefit from faster calls among each other, independent of the support in CPython. The explicit type flag opt-in that the PEP proposes makes this completely safe.
* If tp_call is modified, tp_fastcall may be outdated. Same if tp_fastcall is modified.
Slots are fixed at type creation and should never be modified afterwards.
What happens on "del obj.__call__" or "del type.__call__"?
$ python3.7 -c 'del len.__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object attribute '__call__' is read-only

$ python3.7 -c 'del type.__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'type'

And a really lovely one:

$ python3.7 -c 'del (lambda:0).__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: __call__
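
The same three attempts can be collected in one script. This is a minimal sketch; `try_delete` and `Greeter` are hypothetical names introduced for illustration. The last case is not from this thread: it adds a class defined in Python, where deleting `__call__` succeeds and instances stop being callable. Exact error messages vary between CPython versions, so only the exception classes are recorded.

```python
def try_delete(obj, name="__call__"):
    """Attempt `del obj.<name>`; return the exception class name, or None on success."""
    try:
        delattr(obj, name)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


# The three cases shown in the shell session.
results = [try_delete(len), try_delete(type), try_delete(lambda: 0)]
print(results)


# Extra case (an addition, not from the thread): a class defined in Python.
class Greeter:
    def __call__(self):
        return "hello"


g = Greeter()
before = callable(g)
outcome = try_delete(Greeter)  # deleting the method from the class succeeds
after = callable(g)
print(before, outcome, after)
```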
* Many public functions of the C API still require a tuple and a dict to pass positional and keyword arguments, so a compatibility layer is required for types that only want to implement FASTCALL.
Well, yes. It would require a trivial piece of code to map between the two. Fine with me.
Related issue: what if something calls tp_call with (args: tuple, kwargs: dict)? Crash, or call a compatibility layer converting the arguments to the FASTCALL calling convention?
The latter, obviously. Also easy to implement, with the usual undefined dict order caveat (although that's probably solved when running in Py3.6+).
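
The dict order caveat can be checked from Python. This is a small sketch, assuming Python 3.7 or later, where dicts keep insertion order (PEP 468 already guarantees the order of `**kwargs` since 3.6). `kwnames_from_dict` and `observed_order` are hypothetical names. It shows that a tuple of keyword names built from a dict has the same order the callee sees.

```python
def kwnames_from_dict(kwargs):
    """Hypothetical helper: keyword names in the order the dict yields them."""
    return tuple(kwargs)


def observed_order(**kwargs):
    # The callee sees keyword arguments in the order the caller supplied them.
    return tuple(kwargs)


call_kwargs = {"b": 1, "a": 2}
names = kwnames_from_dict(call_kwargs)
seen = observed_order(**call_kwargs)
print(names, seen)
```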
I abandoned my idea for two reasons:
1) in the worst case, my changes caused a crash, which is not acceptable for an optimization.
This isn't really an optimisation. It's a generalisation of the call protocol.
My first intent was to remove the property_descr_get() hack because its implementation is fragile and caused crashes.
Not sure which hack you mean.
2) we implemented a lot of other optimizations which made calls faster without having to touch tp_call or tp_fastcall. The benefit of FASTCALL for tp_call/tp_fastcall was not really significant.
What Jeroen said. Cleaning up the implementation and generalising the call protocol is going to open up a wonderfully bright future for CPython. :)
Stefan
On Mon, 18 Jun 2018 19:49:28 +0200
Stefan Behnel
Victor Stinner schrieb am 18.06.2018 um 15:09:
I tried two options to add support for FASTCALL on calling an object: add a flag in tp_flags and reuse tp_call, or add a new tp_fastcall slot. I failed to implement either of these two options correctly.
There are multiple issues with tp_fastcall:
* ABI issue: it's possible to load a C extension using the old ABI, without tp_fastcall: it's not possible to write type->tp_fastcall on such a type. This limitation causes various issues.
Not a problem if we rededicate the unused (since Py3.0) "tp_print" slot for it.
On the topic of the so-called old ABI (which doesn't really exist), I would like to merge https://github.com/python/cpython/pull/4944
Regards
Antoine.
participants (5)
- Antoine Pitrou
- INADA Naoki
- Jeroen Demeyer
- Stefan Behnel
- Victor Stinner