C API PyObject_Call segfaults with string
MRAB
python at mrabarnett.plus.com
Wed Feb 9 21:43:45 EST 2022
On 2022-02-10 01:37, Jen Kris via Python-list wrote:
> I'm using Python 3.8 so I tried your second choice:
>
> pSents = PyObject_CallFunctionObjArgs(pSentMod, pListItem);
>
> but pSents is 0x0. pSentMod and pListItem are valid pointers.
>
'PyObject_CallFunction' looks like a good one to use:
"""PyObject* PyObject_CallFunction(PyObject *callable, const char *format, ...)
Call a callable Python object callable, with a variable number of C
arguments. The C arguments are described using a Py_BuildValue() style
format string. The format can be NULL, indicating that no arguments are
provided.
"""
[snip]
What I do is add comments to keep track of what objects I have
references to at each point and whether they are new references or could
be NULL.
For example:
pName = PyUnicode_FromString("nltk.corpus");
//> pName+?
This means that 'pName' contains a reference, '+' means that it's a new
reference, and '?' means that it could be NULL (usually due to an
exception, but not always) so I need to check it.
Continuing in this vein:
pModule = PyImport_Import(pName);
//> pName+? pModule+?
pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
//> pName+? pModule+? pSubMod+?
pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
//> pName+? pModule+? pSubMod+? pFidMod+?
pSentMod = PyObject_GetAttrString(pSubMod, "sents");
//> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+?
pFileIds = PyObject_CallObject(pFidMod, 0);
//> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+?
pListItem = PyList_GetItem(pFileIds, listIndex);
//> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem?
pListStrE = PyUnicode_AsEncodedString(pListItem, "UTF-8", "strict");
//> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem? pListStrE+?
As you can see, there are a lot of leaked references building up.
Note how after:
pListItem = PyList_GetItem(pFileIds, listIndex);
the addition is:
//> pListItem?
This means that 'pListItem' contains a borrowed (not new) reference, but
could be NULL.
I find it easiest to DECREF as soon as I no longer need the reference
and to remove a name from the list as soon as I no longer need it (and
have DECREFed it where necessary).
For example:
pName = PyUnicode_FromString("nltk.corpus");
//> pName+?
if (!pName)
    goto error;
//> pName+
pModule = PyImport_Import(pName);
//> pName+ pModule+?
Py_DECREF(pName);
//> pModule+?
if (!pModule)
    goto error;
//> pModule+
I find that doing this greatly reduces the chances of getting the
reference counting wrong, and I can remove the comments once I've
finished the function I'm writing.