[Cython] [cython-users] buffer access to ndarrays as cdef class attributes

Mon Apr 16 20:04:24 CEST 2012

On 16 April 2012 18:56, mark florisson <markflorisson88 at gmail.com> wrote:
> On 16 April 2012 16:40, becker.nils <becker.nils at gmx.net> wrote:
>>
>>> > 1. memoryview assignments inside inner loops are not a good idea.
>>> > although no data is being copied, making a new slice involves quite
>>> > some error checking overhead
>>>
>>> Do you mean assignment to the data or to the slice itself?
>>
>> i meant something like
>>
>> cdef float[:] new_view = existing_numpy_array
>>
>> inside a loop. i guess the obvious solution is to do this outside of the
>> loop.
>
> Definitely, that is pretty slow :)
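A minimal sketch of the "acquire the view outside the loop" advice, using Python's built-in memoryview over an array.array as a stand-in for a Cython typed memoryview (the function names scale_rebinding and scale_hoisted are hypothetical, not from the thread). In both cases no data is copied; the difference is only how often the buffer is requested and validated.

```python
from array import array


def scale_rebinding(buf, factor, repeats):
    """Acquire a fresh view on every pass (the pattern to avoid)."""
    for _ in range(repeats):
        # Each memoryview() call asks the exporter for its buffer again.
        with memoryview(buf) as view:
            for i in range(len(view)):
                view[i] = view[i] * factor


def scale_hoisted(buf, factor, repeats):
    """Acquire the view once, outside the loop, and reuse it."""
    with memoryview(buf) as view:
        for _ in range(repeats):
            for i in range(len(view)):
                view[i] = view[i] * factor


a = array("f", [1.0, 2.0, 3.0])
b = array("f", [1.0, 2.0, 3.0])
scale_rebinding(a, 2.0, 3)
scale_hoisted(b, 2.0, 3)
```

Both functions leave the same values in the underlying array; only the number of view acquisitions differs. In Cython the equivalent change would be moving the `cdef float[:] new_view = existing_numpy_array` assignment above the loop.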
>
>>>
>>> > 2. memoryviews are more general than the ndarray buffer access but
>>> > they are not a drop-in replacement, because one cannot return a
>>> > memoryview object as a full ndarray without explicit conversion with
>>> > np.asarray. so having both fast access from the C side and a handle
>>> > on the array on the python side requires two local variables: one an
>>> > ndarray and one a memoryview into it. (previously the ndarray with
>>> > buffer access did both of these things)
>>>
>>> Yes, that's correct. That's because you can now slice the memoryviews,
>>> which does not invoke anything on the original buffer object, so when
>>> converting to an object it may be out of sync with the original, which
>>> means you'd have to convert it explicitly.
>>
>> that makes sense.
>>
>>>
>>> We could allow the user to register a conversion function to do this
>>> automatically - only invoked if the slice was re-sliced - (and cache
>>> the results), but it would mean that conversion back from the object
>>> to a memoryview slice would have to validate the buffer again, which
>>> would be more expensive. Maybe that could be mitigated by
>>> special-casing numpy arrays and some other tricks.
>>
>> so for the time being, it seems that the most efficient way of handling
>> this is that cdef functions or any fast C-side manipulation uses only
>> memoryviews, and allocation and communication with python then uses the
>> underlying ndarrays.
>>
>
> Yes, it is best to minimize conversion to and from numpy (which is
> quite expensive either way).
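To illustrate why a re-sliced view cannot simply be handed back as "the original object", here is a small sketch with Python's built-in memoryview, used here only as an analogy for Cython memoryview slices. The names data, view and sub are hypothetical, and tolist() stands in for the explicit np.asarray conversion mentioned in the thread, since NumPy is not assumed to be available.

```python
from array import array

# The "Python-side handle": the object that owns the data.
data = array("f", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# The "C-side handle": a view into the same memory.
view = memoryview(data)

# Re-slicing only touches the view; nothing is invoked on `data`.
sub = view[2:5]

# Writes through the slice land in the shared memory.
sub[0] = 20.0

# The slice still refers back to the whole original object, so returning
# `sub.obj` would silently drop the slicing. Getting the sliced contents
# as a regular object needs an explicit conversion step.
sub_values = sub.tolist()
```

This is the same reason for keeping two variables in the Cython case: the ndarray for allocation and for talking to Python, and the memoryview for fast element access.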
>
>>> > 3. one slow-down that i was not able to avoid is this:
>>> >
>>> >  143:         for i in range(x.shape[0]):
>>> >  144:             self.out[i] *= dt * self.M[i]
>>> >
>>> >  where all of x, self.out and self.M are memoryviews. in the for-loop,
>>> > cython checks for un-initialized memoryviews like so (output from
>>> > cython -a)
>>> >
>>> >     if (unlikely(!__pyx_v_self->M.memview)) {
>>> >         PyErr_SetString(PyExc_AttributeError, "Memoryview is not initialized");
>>> >         {__pyx_filename = __pyx_f[0]; __pyx_lineno = 144; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
>>> >     }
>>> >     __pyx_t_4 = __pyx_v_i;
>>> >
>>> >
>>> > is there a way to tell cython that these views are in fact initialized
>>> > (that's done in __init__ of the class) ?
>>>
>>> You can avoid this by annotating your function with
>>> @cython.initializedcheck(False), or by using a module-global directive
>>> at the top of your file '#cython: initializedcheck=False' (the same
>>> goes for boundscheck and wraparound).
>>
>> ah! helpful! i did not see this on the annotation wiki page.
>> (there is no official documentation on annotations it seems)
>>
>
> Indeed, this should be documented. Documentation for other directives
> can be found here:
> http://docs.cython.org/src/reference/compilation.html?highlight=boundscheck#compiler-directives
> I'll add documentation for this directive too. Currently it only
> works for memoryviews, but maybe it should also work for objects?
>
>>>
>>> These things should be pulled out of loops whenever possible, just
>>> like bounds checking and wrap-around code. Currently
>>> that is very hard, as even a decref of an object could mean invocation
>>> of a destructor which rebinds or deletes the memoryview (highly
>>> unlikely but possible). To enable optimizations for 99.9% of the use
>>> cases, I think we should be restrictive and allow only explicit
>>> rebinding of memoryview slices in loops themselves, but not in called
>>> functions or destructors. In other words, if a memoryview slice is not
>>> rebound directly in the loop, the compiler is free to create a
>>> temporary outside of the loop and use that everywhere in the loop.
>>
>>
>> or a memoryview context manager which makes the memoryview non-rebindable?
>> "with old_array as cdef float_t[:] new_view:
>>     loop ...
>> "
>> (just fantasizing, probably nonsense). anyway, thanks!
>> nils
>
> We could support final fields and variables, but it would be kind of a
> pain to declare that everywhere.

Maybe final would not be too bad, as you'd only need it for globals
(who uses them anyway) and attributes, but not for local variables.
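
The destructor hazard discussed above can be made concrete with a small pure-Python sketch. It assumes CPython's reference counting (so __del__ runs as soon as the last reference goes away); the classes Holder and Rebinder and both sum functions are hypothetical names for illustration only, and the built-in memoryview again stands in for a Cython memoryview attribute.

```python
from array import array


class Holder:
    """Object with a view attribute, like self.M in the thread."""

    def __init__(self, values):
        self.view = memoryview(array("f", values))


class Rebinder:
    """Its destructor rebinds the holder's view to new data."""

    def __init__(self, holder, replacement):
        self.holder = holder
        self.replacement = replacement

    def __del__(self):
        self.holder.view = memoryview(array("f", self.replacement))


def sum_reloading(holder, trigger_at):
    """Reads holder.view on every iteration (current behaviour)."""
    temp = Rebinder(holder, [100.0] * len(holder.view))
    total = 0.0
    for i in range(len(holder.view)):
        if i == trigger_at:
            temp = None  # dropping the last reference runs __del__ on CPython
        total += holder.view[i]
    return total


def sum_hoisted(holder, trigger_at):
    """Uses a temporary taken before the loop (the proposed optimization)."""
    temp = Rebinder(holder, [100.0] * len(holder.view))
    view = holder.view  # hoisted: later rebinding is not seen in the loop
    total = 0.0
    for i in range(len(view)):
        if i == trigger_at:
            temp = None
        total += view[i]
    return total


reloaded = sum_reloading(Holder([1.0, 2.0, 3.0, 4.0]), 2)
hoisted = sum_hoisted(Holder([1.0, 2.0, 3.0, 4.0]), 2)
```

The two functions return different totals once the destructor fires mid-loop, which is why hoisting is only safe if such implicit rebinding is ruled out, for instance by the restriction proposed above or by a final declaration on the attribute.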