Hi Mark! See my more general reply; here I'll just tie loose ends with a few +1s.
On 4/14/19 7:30 AM, Mark Shannon wrote:
On 10/04/2019 5:25 pm, Petr Viktorin wrote:
PEP 590 is built on a simple idea, formalizing fastcall. But it is complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and Py_TPFLAGS_METHOD_DESCRIPTOR. As far as I understand, both are there to avoid intermediate bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be general, but I don't see any other use case.) Is that right?
Not quite. Py_TPFLAGS_METHOD_DESCRIPTOR is for LOAD_METHOD/CALL_METHOD, it allows any callable descriptor to benefit from the LOAD_METHOD/CALL_METHOD optimisation.
PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward calls with an additional argument can do so efficiently. The obvious example is bound-methods, but classes are at least as important. cls(*args) -> cls.new(cls, *args) -> cls.__init__(self, *args)
I see. Thanks!
(I'm running out of time today, but I'll write more on why I'm asking, and on the case I called "impossible" (while avoiding creation of a "bound method" object), later.)
Let me drop this thread; I stand corrected.
Another point I'd like some discussion on is that vectorcall function pointer is per-instance. It looks this is only useful for type objects, but it will add a pointer to every new-style callable object (including functions). That seems wasteful. Why not have a per-type pointer, and for types that need it (like PyTypeObject), make it dispatch to an instance-specific function?
Firstly, each callable has different behaviour, so it makes sense to be able to do the dispatch from caller to callee in one step. Having a per-object function pointer allows that. Secondly, callables are either large or transient. If large, then the extra few bytes makes little difference. If transient then, it matters even less. The total increase in memory is likely to be only a few tens of kilobytes, even for a large program.
That makes sense.