[capi-sig] Call protocol: tuple/dict subclasses?
Dear C API lovers,
I have a question about the usual args/kwargs call protocol (as used by PyObject_Call and tp_call): is the "args" tuple supposed to be of type exactly tuple or are tuple subclasses allowed? Same question for "kwargs" and dict.
The documentation is not clear about this.
I do know that calls from Python are done with an exact tuple and exact dict, but in C this is not so clear. For example, the implementation of _PyErr_CreateException uses PyTuple_Check(), not PyTuple_CheckExact():
    static PyObject*
    _PyErr_CreateException(PyObject *exception, PyObject *value)
    {
        if (value == NULL || value == Py_None) {
            return _PyObject_CallNoArg(exception);
        }
        else if (PyTuple_Check(value)) {
            return PyObject_Call(exception, value, NULL);
        }
        else {
            return PyObject_CallFunctionObjArgs(exception, value, NULL);
        }
    }
Also, the implementation of the call protocol has assertions of both forms: you see both assert(PyTuple_Check(args)) and assert(PyTuple_CheckExact(args)) and similar for dict. It seems pretty random whether or not an exact type check is done.
I'm pretty sure that there are code paths that would result in an assertion failure because of this.
So basically my question is: what is the right behavior? Given the implementation of the Python bytecode interpreter, I would be inclined to say that type checks should be exact.
I'm noticing this because I'm working on an implementation of PEP 590.
Jeroen.
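The observation above, that calls made from Python code deliver an exact tuple and an exact dict, can be checked from the Python side. What follows is a minimal sketch and not part of the original message; MyTuple, MyDict and callee are hypothetical names. It only shows what a Python-level callee receives, and does not by itself show what a C-level tp_call slot is handed.

```python
class MyTuple(tuple):
    """Hypothetical tuple subclass used only for this illustration."""


class MyDict(dict):
    """Hypothetical dict subclass used only for this illustration."""


def callee(*args, **kwargs):
    # Report the concrete types of the packed positional and keyword arguments.
    return type(args), type(kwargs)


# Star-unpack subclass instances at the call site.
args_type, kwargs_type = callee(*MyTuple((1, 2)), **MyDict(x=3))

# isinstance() is the rough Python analogue of PyTuple_Check();
# an identity test on type() is the analogue of PyTuple_CheckExact().
sub = MyTuple((1, 2))
passes_check = isinstance(sub, tuple)
passes_check_exact = type(sub) is tuple
```

In this sketch the callee sees an exact tuple and an exact dict even though subclass instances were unpacked at the call site, while the subclass instance itself passes only the non-exact style of check.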
On 5/9/19 10:08 AM, Jeroen Demeyer wrote:
> So basically my question is: what is the right behavior? Given the implementation of the Python bytecode interpreter, I would be inclined to say that type checks should be exact.
I don't see the reason for them to be exact. The reason we need tuples/dicts (and not any sequence/mapping) is we access the structs directly, but that will work with subtypes.
If we one day decide to clean this up, converting from *_CheckExact() to *_Check() will not break previously working code.
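Both points can be illustrated from Python, as a rough analogy rather than a demonstration of the C code itself. In this sketch all names are hypothetical: check() and check_exact() stand in for PyTuple_Check() and PyTuple_CheckExact(), and the unbound base-type methods tuple.__len__ and tuple.__getitem__ stand in for direct access to the tuple struct.

```python
class ArgsTuple(tuple):
    """Hypothetical subclass standing in for a non-exact args object."""


def check(obj):
    # Analogue of PyTuple_Check(): subclasses are accepted.
    return isinstance(obj, tuple)


def check_exact(obj):
    # Analogue of PyTuple_CheckExact(): subclasses are rejected.
    return type(obj) is tuple


samples = [(1, 2), ArgsTuple((1, 2)), [1, 2]]
accepted_exact = [s for s in samples if check_exact(s)]
accepted_loose = [s for s in samples if check(s)]

# Everything the exact check accepts is also accepted by the loose check,
# so loosening the check cannot reject input that used to work.
loosening_is_safe = all(check(s) for s in accepted_exact)

# The base type's own accessors work unchanged on a subclass instance.
sub = ArgsTuple((10, 20, 30))
size = tuple.__len__(sub)
first = tuple.__getitem__(sub, 0)
```

The list is rejected by both checks, the subclass instance only by the exact one, and the base-type accessors read the subclass instance just as they would an exact tuple.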
On 2019-05-09 17:43, Petr Viktorin wrote:
> I don't see the reason for them to be exact. The reason we need tuples/dicts (and not any sequence/mapping) is we access the structs directly, but that will work with subtypes.
Fair enough, that's reasonable. I'll use the _Check() variants instead of _CheckExact() in the PEP 590 implementation.
On Fri, May 10, 2019 at 12:09 AM Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
> Dear C API lovers,
> I have a question about the usual args/kwargs call protocol (as used by PyObject_Call and tp_call): is the "args" tuple supposed to be of type exactly tuple or are tuple subclasses allowed? Same question for "kwargs" and dict.
[ munch ]
I'm writing a native extension module which is also affected by this. I'm wrapping a few C functions which take a length argument and a pointer argument. The Python wrapper takes just a sequence argument and, in the case where the third Python argument is the sequence, extracts the C length argument with
    count = PySequence_Length(PyTuple_GetItem(args, 2));
(For those concerned about code safety, PyArg_ParseTuple has already checked that the third argument exists and is a sequence, and the method is flagged as positional arguments only.)
I'm using GetItem because it's a borrowed reference and I don't need to adjust any reference counts.
So, my extension module will break if the args tuple is not compatible with the C API for the builtin tuple type.
This is related I think to Victor's work on the levels of C API compatibility. Is there a requirement that a subclass of a builtin type must implement the entire C API?
(I'd like to say yes, but this isn't production code so it wouldn't be a big problem if the consensus is no.)
> Also, the implementation of the call protocol has assertions of both forms: you see both assert(PyTuple_Check(args)) and assert(PyTuple_CheckExact(args)) and similar for dict. It seems pretty random whether or not an exact type check is done.
> I'm pretty sure that there are code paths that would result in an assertion failure because of this.
> So basically my question is: what is the right behavior? Given the implementation of the Python bytecode interpreter, I would be inclined to say that type checks should be exact.
I agree for the current implementation. The internals of the Python interpreter can be much more limited about what they accept.
--
cheers,
Hugh Fisher
On 09May2019 2120, Hugh Fisher wrote:
> This is related I think to Victor's work on the levels of C API compatibility. Is there a requirement that a subclass of a builtin type must implement the entire C API?
It's not related to the compatibility work, but a subclass of a builtin type *is* the builtin type with some Python attributes/methods added. Otherwise it won't report as being a subclass (both *_Check and *_CheckExact will fail).
What you lose by using the concrete API is the ability to use those Python methods. For example, given:
    class MyDict(dict):
        def __getitem__(self, key):
            return super().__getitem__(key.lower())
Calling this with PyObject_GetItem(...) will resolve the Python __getitem__ and call it. However, using PyDict_GetItem(...) will not, as it goes directly to the C structure (because it's actually the default implementation of dict.__getitem__, so it "should" only be called when it has not been overridden, and therefore does not have to check for overrides).
So if you have a subclass with the same semantics for core operations, it'll be fine. But if the caller is expecting that subclass to behave differently based on overrides, it won't.
Cheers, Steve
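The difference described in this message can be sketched from Python without writing any C. This is an illustration under stated assumptions, not a test of the C functions themselves: subscripting d[key] goes through the overridden __getitem__, much as PyObject_GetItem would, while the unbound dict.get reads the underlying dict storage and never consults the override, which is roughly how PyDict_GetItem behaves. The key and value used here are made up for the example.

```python
class MyDict(dict):
    def __getitem__(self, key):
        # Case-insensitive lookup: normalise the key before delegating.
        return super().__getitem__(key.lower())


d = MyDict(spam=1)

# Abstract-style access: resolves the Python-level override.
via_override = d["SPAM"]

# Concrete-style access: the base-type method ignores the override, so the
# un-normalised key is simply not found. (PyDict_GetItem likewise returns
# NULL for a missing key instead of raising.)
via_base_type = dict.get(d, "SPAM")

# The subclass still reports as a dict, so a PyDict_Check()-style test passes.
is_dict = isinstance(d, dict)
```

With these values the override finds the entry and the base-type lookup does not, even though the object passes the dict check, which is the "same semantics for core operations" caveat in practice.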
participants (4): Hugh Fisher, Jeroen Demeyer, Petr Viktorin, Steve Dower