[capi-sig] Clarifying tp_new() and tp_init() arguments and PEP-253

Fri Jun 8 13:52:16 CEST 2012

Hi,

I'm reading PEP-253 on Subtyping Built-in Types [1], which by the way has been
extremely helpful for understanding the details of handling types with the
Python C API.

In the "Creating a subtype of a built-in type in C" section, the document
includes the following note on the arguments of the two slots:

"Both tp_new() and tp_init() should receive exactly the same 'args'
and 'kwds' arguments, and both should check that the arguments are
acceptable, because they may be called independently."

I can understand what it says, but I'm unsure about how to interpret it.
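
To convince myself of the "same arguments" and "called independently" parts,
here is a small pure-Python sketch. It is my own illustration, not taken from
the PEP: __new__ and __init__ stand in for tp_new and tp_init, and the `calls`
list is a hypothetical recording device added only for inspection.

```python
class Noddy:
    """Pure-Python stand-in for the C type; __new__/__init__ mirror the slots."""

    calls = []  # records (slot, args, kwds) so the calls can be inspected

    def __new__(cls, *args, **kwds):
        # Plays the role of tp_new: it receives the constructor arguments.
        Noddy.calls.append(("new", args, kwds))
        return super().__new__(cls)

    def __init__(self, *args, **kwds):
        # Plays the role of tp_init: it receives the very same arguments.
        Noddy.calls.append(("init", args, kwds))


# Normal construction: calling the type runs __new__, then __init__.
obj = Noddy("John", last="Smith")

# Independent calls: allocation without any initialisation...
raw = Noddy.__new__(Noddy)
# ...and re-initialisation of an already constructed object.
obj.__init__("Jane")
```

After this runs, the first two entries of Noddy.calls carry identical args and
kwds, while the last two come from the slots being invoked on their own.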

Let's consider a simple case of a custom type definition,
without subclassing involved.
Does this note recommend simply repeating the whole args and kwds
parsing and checking in both tp_new and tp_init?
Take tp_init in the Noddy example from the Python 3 documentation [2]:

static PyTypeObject noddy_NoddyType = { ... };

int Noddy_init(Noddy* self, PyObject* args, PyObject* kwds)
{
    PyObject *first=NULL, *last=NULL, *tmp;
    static char *kwlist[] = {"first", "last", "number", NULL};
    if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                      &first, &last,
                                      &self->number))
        return -1;

    /* check args, initialise Noddy members, etc. */
    return 0;
}

Now, as per PEP-253, "both should check that the arguments are acceptable",
so tp_new gets the very same argument parsing/checking copied and pasted:

PyObject* Noddy_new(PyTypeObject* type, PyObject* args, PyObject* kwds)
{
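    PyObject *first=NULL, *last=NULL;
    int number = 0;
    static char *kwlist[] = {"first", "last", "number", NULL};
    /* No instance exists yet in tp_new, so parse into locals, not self. */
    if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                      &first, &last,
                                      &number))
        return NULL;  /* tp_new reports failure with NULL, not -1 */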

    /* Now, what to do with the args?
       Ignore?
       DO or DO NOT repeat the tp_init work of Noddy members initialization?
    */

    Noddy *self;
    self = (Noddy *)type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL)
          {
            Py_DECREF(self);
            return NULL;
          }

        self->last = PyUnicode_FromString("");
        if (self->last == NULL)
          {
            Py_DECREF(self);
            return NULL;
          }

        self->number = 0;
    }

    return (PyObject *)self;
}

Repeating the question from the comment above:
what is tp_new supposed to do with the arguments?
The Noddy example from the manual simply ignores args and kwds;
it does not even perform the checks recommended by PEP-253.

What should Noddy_new look like to conform to the PEP-253
recommendations?

PEP-253 also includes this recommendation related to my question:

"This should first call the base type's tp_new() slot and
then initialize the subtype's additional data members.  To further
initialize the instance, the tp_init() slot is typically called.
Note that the tp_new() slot should *not* call the tp_init() slot;"

In the context of the Noddy example, "initialize the subtype's additional
data members" means setting them to defaults: empty strings for first and
last, and 0 for number. Makes sense.

Then, "to further initialize the instance", tp_init is called by whoever
creates the object, and tp_new should not call tp_init itself. That much
is clear.

So, AFAIU, PEP-253 suggests making tp_new check args and kwds and
report (early) any unacceptable arguments, but not necessarily
use the values to initialise the instance, as that is tp_init's job.
Does that sound right?
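
In pure-Python terms, that reading would look roughly like the sketch below.
This is only an analogy under my own assumptions, not the C implementation:
__new__ and __init__ stand in for tp_new and tp_init, and `_check_args` is a
hypothetical helper that plays the part of PyArg_ParseTupleAndKeywords.

```python
def _check_args(first="", last="", number=0):
    # Hypothetical shared check: unknown keywords or too many positional
    # arguments raise TypeError automatically via the signature.
    if not isinstance(number, int):
        raise TypeError("number must be an int")
    return first, last, number


class Noddy:
    def __new__(cls, *args, **kwds):
        # Reject unacceptable arguments early; the values are not used here.
        _check_args(*args, **kwds)
        self = super().__new__(cls)
        # Default-initialise the members so the object is always valid.
        self.first = ""
        self.last = ""
        self.number = 0
        return self

    def __init__(self, *args, **kwds):
        # Check again (this may be called independently), then use the values.
        self.first, self.last, self.number = _check_args(*args, **kwds)
```

With this split, Noddy.__new__(Noddy) alone yields an object with default
members, and Noddy("John", "Smith", 42) ends up with the given values.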

BTW, I understand that the roles of tp_new and tp_init in initialising an
object differ depending on whether noddy_NoddyType is meant to be an
immutable type or not.
But let's simplify and say noddy_NoddyType is a typical Python object type.

[1] http://www.python.org/dev/peps/pep-0253/
[2] http://docs.python.org/py3k/extending/newtypes.html

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net