[Numpy-discussion] Using the C-API iterator with object arrays

Jaime Fernández del Río jaime.frio at gmail.com
Mon Feb 12 07:56:21 EST 2018


On Mon, Feb 12, 2018 at 12:13 PM Eugen Wintersberger <
eugen.wintersberger at gmail.com> wrote:

> Hi there,
> I have a question concerning the numpy iterator C-API. I want to create a
> numpy string array using NPY_OBJECT as a datatype for creating the array
> (the reason I am going for this approach is that I do not know the length
> of the individual strings at the time I construct the array, I only know
> its shape). The original strings are stored in a std::vector<char*>
> instance. The approach I took was something like this
>
>     std::vector<char*> buffer = ....;
>     NpyIter *iter = NpyIter_New(numpy_array,
>                                 NPY_ITER_READWRITE | NPY_ITER_C_INDEX | NPY_ITER_REFS_OK,
>                                 NPY_CORDER , NPY_NO_CASTING,nullptr);
>     if(iter==NULL)
>     {
>       return;
>     }
>     NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter,nullptr);
>     if(iternext == NULL)
>     {
>       std::cerr<<"Could not instantiate next iterator function"<<std::endl;
>       return;
>     }
>     PyObject **dataptr = (PyObject**)NpyIter_GetDataPtrArray(iter);
>     for(auto string: buffer)
>     {
>       dataptr[0] = PyString_FromString(string); // this string construction seems to work
>       iternext(iter);
>     }
>     NpyIter_Deallocate(iter);
>
>
> This code snippet is a bit stripped down with all the safety checks
> removed to make things more readable.
> However, the array I get back still contains only a None instance. Does
> anyone have an idea what I am doing wrong here?
>

I think you have the indirections wrong in dataptr?

NpyIter_GetDataPtrArray returns a char**, which holds the address of the
variable where the iterator stores the address of the first byte of the
current item being iterated.

When you write to dataptr[0] you are not writing to the array, but to where
the iterator holds the address of the first byte of the current item.

So you would have to write to dataptr[0][0], or **dataptr, to actually
affect the contents of the array. Of course, your dataptr being a PyObject**,
the compiler would probably complain about such an assignment.

I think that if you define dataptr as a PyObject*** (yay, three star
programming <http://wiki.c2.com/?ThreeStarProgrammer>!) and then assign to
**dataptr, everything will work fine. If that doesn't work, maybe try to
keep dataptr a char**, and then assign with *(PyObject **)(*dataptr) = ...,
since the cast has to apply to the item's address before the final
dereference.
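
To make the levels of indirection concrete, here is a minimal sketch that
does not use NumPy or Python at all. FakeObject, FakeIter,
get_data_ptr_array, iter_next and fill are hypothetical names made up for
this illustration: they only mimic the shape of what
NpyIter_GetDataPtrArray and the iternext function hand you, under the
assumption of a contiguous 1-D array of object pointers. It ignores
reference counting and error handling entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a Python object; not part of NumPy or CPython.
struct FakeObject {
    std::string value;
};

// Hypothetical stand-in for the iterator: `current` plays the role of the
// internal slot where the iterator keeps the address of the current item.
struct FakeIter {
    char *current;       // address of the first byte of the current item
    char *end;           // one past the last item
    std::size_t stride;  // bytes between consecutive items
};

// Mimics NpyIter_GetDataPtrArray: returns the address of the iterator's slot.
char **get_data_ptr_array(FakeIter &it) { return &it.current; }

// Mimics iternext: advances to the next item, returns false when exhausted.
bool iter_next(FakeIter &it) {
    it.current += it.stride;
    return it.current < it.end;
}

// Fills an "object array" (a buffer of FakeObject* slots) through the iterator.
void fill(std::vector<FakeObject *> &array,
          const std::vector<FakeObject *> &src) {
    assert(array.size() == src.size());
    if (array.empty()) return;

    char *base = reinterpret_cast<char *>(array.data());
    FakeIter it{base, base + array.size() * sizeof(FakeObject *),
                sizeof(FakeObject *)};
    char **dataptr = get_data_ptr_array(it);

    std::size_t i = 0;
    do {
        // Assigning to dataptr[0] would only overwrite the iterator's own
        // slot. dataptr[0] is the address of the current array element, and
        // that element holds a FakeObject*, so cast and dereference once
        // more to store into the array itself.
        *reinterpret_cast<FakeObject **>(dataptr[0]) = src[i++];
    } while (iter_next(it));
}
```

Calling fill(arr, src) on a vector of null pointers leaves arr[i] == src[i]
for every i. With the real iterator the store would have the same shape,
with PyObject in place of FakeObject.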

Jaime



> Thanks in advance.
>
> best regards
>    Eugen


-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.