Can't get iterator in the C API
Jen Kris
jenkris at tutanota.com
Tue Feb 8 20:12:02 EST 2022
I am using the Python C API to load the Gutenberg corpus from the nltk library and iterate through the sentences. The Python code I am trying to replicate is:
from nltk.corpus import gutenberg
for i, fileid in enumerate(gutenberg.fileids()):
    sentences = gutenberg.sents(fileid)
    # etc
where gutenberg.fileids is, of course, iterable.
I use the following C API code to import the module and get pointers:
int64_t Call_PyModule()
{
    PyObject *pModule, *pName, *pSubMod, *pFidMod, *pFidIter, *pFidSeqIter, *pSentMod;

    pName = PyUnicode_FromString("nltk.corpus");
    pModule = PyImport_Import(pName);
    if (pModule == 0x0) {
        PyErr_Print();
        return 1;
    }

    pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
    pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
    pSentMod = PyObject_GetAttrString(pSubMod, "sents");

    pFidIter = PyObject_GetIter(pFidMod);
    int ckseq_ok = PySeqIter_Check(pFidMod);
    pFidSeqIter = PySeqIter_New(pFidMod);

    return 0;
}
pSubMod, pFidMod and pSentMod are all valid (non-NULL) pointers, but the iterator lines all return zero:
    pFidIter = PyObject_GetIter(pFidMod);
    int ckseq_ok = PySeqIter_Check(pFidMod);
    pFidSeqIter = PySeqIter_New(pFidMod);
So the C API thinks gutenberg.fileids is not iterable, but it is. What am I doing wrong?