[Cython] Multidimensional indexing of C++ objects

Stefan Behnel stefan_ml at behnel.de
Wed Jul 8 19:01:53 CEST 2015


Ian Henriksen wrote on 08.07.2015 at 03:50:
> On Sat, Jul 4, 2015 at 12:43 AM Stefan Behnel wrote:
>> Ian Henriksen wrote on 04.07.2015 at 00:43:
>>> I'm a GSOC student working to make a Cython API for DyND. DyND
>>> <https://github.com/libdynd/libdynd> is a relatively new n-dimensional
>>> array library in C++ that is based on NumPy. A full set of Python
>>> bindings (created using Cython) are provided as a separate package.
>>> The goal of my project is to make it so that DyND arrays can be used
>>> easily within Cython so that an n-dimensional array object can be used
>>> without any of the corresponding Python overhead.
>>>
>>> Currently, there isn't a good way to assign to multidimensional slices
>>> within Cython. Since the indexing operator in C++ is limited to a single
>>> argument, we use the call operator to represent multidimensional
>>> indexing, and then use a proxy class to perform assignment to a slice.
>>> Currently, in C++, assigning to a slice along the second axis of a DyND
>>> array looks like this:
>>>
>>> a(irange(), 1).vals() = 0;
>>>
>>> Unfortunately, in Cython, only the index operator can be used for
>>> assignment, so following the C++ syntax isn't currently possible. Does
>>> anyone know of a good way to address this?
>>
>> Just an idea, don't know how feasible this is, but we could allow inline
>> special methods in C++ class declarations that implement Python protocols.
>> Example:
>>
>>     cdef extern from ...:
>>         cppclass Array2D:
>>            int operator[](ssize_t) except +
>>            int getItemAt(ssize_t x, ssize_t y) except +
>>
>>            cdef inline __getitem__(self, Py_ssize_t x, Py_ssize_t y):
>>                return self.getItemAt(x, y)
>>
>>     def test():
>>         cdef Array2D a
>>         return a[1, 2]
>>
>> Cython could then translate an item access on an Array2D instance into the
>> corresponding special "method" call.
>>
>> Drawbacks:
>>
>> 1) The example above would conflict with the C++ [] operator, so it would
>> be ambiguous which one is being used in Cython code. Not sure if there's a
>> use case for making both available to Cython code, but that would be
>> difficult to achieve if the need arises.
>>
>> 2) It doesn't solve the general problem of assigning to C++ expressions,
>> especially because it does not extend the syntax allowed by Cython which
>> would still limit what you can do in these fake special methods.
>>
>> Regarding your proposals, I'd be happy if we could avoid adding syntax
>> support for assigning to function calls. And I agree that the cname
>> assignment hack is really just a big hack. It shouldn't be relied on.
> 
> Yes, both this idea and the modified version that redefines operator[] are
> similar to the idea I had about respecting the cname entries for
> operator[]. This method would certainly expose a more flexible API for
> modules that want to do this. It may work in my case, but I worry that
> getting this into Cython would further complicate the (already lengthy)
> indexing logic.

The main problem with the logic in IndexNode is that it predates the
infrastructure change that allows node replacements in the analyse_types()
methods. It should eventually be split into separate nodes that do
different things, e.g. integer indexing into C arrays, Python object item
access, C++ operator[] usage, buffer/memory view indexing, memory view
slicing, you name it.

In any case, adding new functionality can now be done by creating a new
node rather than complicating the type analysis code. And any further
refactoring would be warmly appreciated. :)


> I'm still uneasy about exporting an API that is
> fundamentally different from the existing Python and C++ APIs, but making a
> way to use Python's syntax could help with that. Is there a good way to
> make a method like this accept Python-like indexing syntax? It would be
> confusing to put a code definition like this inside an extern block too.
> Could this syntax be adapted to work outside the extern block while still
> showing its connection to the original cppclass?

The feature of providing inline functions in .pxd files already exists, as
does the feature of adding functionality to external extension types by
implementing special methods in their declaration. See, for example, the
buffer protocol support for old NumPy arrays that we implemented in
numpy/__init__.pxd (look for "__getbuffer__") or the helper functions in
cpython/array.pxd.

Allowing __getitem__() to be overridden in an extern C++ class declaration
would really only be one step further. The question is whether __getitem__()
is the right abstraction to use here, as it also accepts only a single
argument as input. For a multi-dimensional lookup, that would be a tuple in
Python. It would be nice if the index arguments (e.g. x, y, z for 3
dimensions) could be explicit in the method signature instead, potentially
using default arguments if fewer dimensions should be allowed.
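
To make the signature idea a bit more concrete, here is a rough C++ sketch of
the kind of accessor such a special method could forward to. Everything in it
is made up for illustration: Array3D is a hypothetical class (not DyND's array
and not an existing Cython feature), and getItemAt() here is simply a
three-index variant of the name used in the Array2D example above. It takes
one explicit parameter per dimension, and the trailing ones default to 0 so
that fewer indices can be passed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 3-D integer array stored in row-major order.
// Illustration only; not part of DyND or Cython.
class Array3D {
public:
    Array3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : ny_(ny), nz_(nz), data_(nx * ny * nz, 0) {}

    // One explicit parameter per dimension. The trailing indices default
    // to 0, so callers may pass one, two or three of them.
    // Returning a reference lets the result be used as an assignment target.
    int& getItemAt(std::size_t x, std::size_t y = 0, std::size_t z = 0) {
        // at() throws std::out_of_range if the flat index is past the end.
        return data_.at((x * ny_ + y) * nz_ + z);
    }

private:
    std::size_t ny_, nz_;
    std::vector<int> data_;
};
```

With a declaration like this, a.getItemAt(1, 2, 3) = 7 writes a single
element and a.getItemAt(1) is the same as a.getItemAt(1, 0, 0). Whether
defaulting to 0 is the right meaning for a missing index (as opposed to
selecting a whole slice) is a separate design decision that this sketch
does not try to answer.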

Stefan


