C API PyObject_Call segfaults with string
Jen Kris
jenkris at tutanota.com
Thu Feb 10 15:04:05 EST 2022
Hi, and thanks very much for your comments on reference counting. Since I'm new to the C API, that will help a lot. I know that reference counting is one of the difficult issues with the C API.
I just posted a reply to Inada Naoki showing how I solved the problem I posted yesterday.
Thanks much for your help.
Jen
Feb 9, 2022, 18:43 by python at mrabarnett.plus.com:
> On 2022-02-10 01:37, Jen Kris via Python-list wrote:
>
>> I'm using Python 3.8 so I tried your second choice:
>>
>> pSents = PyObject_CallFunctionObjArgs(pSentMod, pListItem);
>>
>> but pSents is 0x0. pSentMod and pListItem are valid pointers.
>>
> 'PyObject_CallFunction' looks like a good one to use:
>
> """PyObject* PyObject_CallFunction(PyObject *callable, const char *format, ...)
>
> Call a callable Python object callable, with a variable number of C arguments. The C arguments are described using a Py_BuildValue() style format string. The format can be NULL, indicating that no arguments are provided.
> """
>
> [snip]
>
> What I do is add comments to keep track of what objects I have references to at each point and whether they are new references or could be NULL.
>
> For example:
>
> pName = PyUnicode_FromString("nltk.corpus");
> //> pName+?
>
> This means that 'pName' contains a reference, '+' means that it's a new reference, and '?' means that it could be NULL (usually due to an exception, but not always) so I need to check it.
>
> Continuing in this vein:
>
> pModule = PyImport_Import(pName);
> //> pName+? pModule+?
>
> pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
> //> pName+? pModule+? pSubMod+?
> pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
> //> pName+? pModule+? pSubMod+? pFidMod+?
> pSentMod = PyObject_GetAttrString(pSubMod, "sents");
> //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+?
>
> pFileIds = PyObject_CallObject(pFidMod, 0);
> //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+?
> pListItem = PyList_GetItem(pFileIds, listIndex);
> //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem?
> pListStrE = PyUnicode_AsEncodedString(pListItem, "UTF-8", "strict");
> //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem? pListStrE+?
>
> As you can see, a lot of references are building up, and each new reference leaks unless it is eventually DECREFed.
>
> Note how after:
>
> pListItem = PyList_GetItem(pFileIds, listIndex);
>
> the addition is:
>
> //> pListItem?
>
> This means that 'pListItem' contains a borrowed (not new) reference, but could be NULL.
>
> I find it easiest to DECREF as soon as I no longer need the reference, and to remove a name from the list as soon as I no longer need it (and have DECREFed it where necessary).
>
> For example:
>
> pName = PyUnicode_FromString("nltk.corpus");
> //> pName+?
> if (!pName)
>     goto error;
> //> pName+
> pModule = PyImport_Import(pName);
> //> pName+ pModule+?
> Py_DECREF(pName);
> //> pModule+?
> if (!pModule)
>     goto error;
> //> pModule+
>
> I find that doing this greatly reduces the chances of getting the reference counting wrong, and I can remove the comments once I've finished the function I'm writing.