[Cython] gsoc: array expressions
robertwb at gmail.com
Tue May 22 08:11:55 CEST 2012
On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 05/20/2012 04:03 PM, mark florisson wrote:
>> For my gsoc we already have some simple initial ideas, i.e.
>> elementwise vector expressions (a + b with a and b arrays with
>> arbitrary rank), I don't think these need any discussion. However,
>> there are a lot of things that haven't been formally discussed on the
>> mailing list, so here goes.
>> Frédéric, I am CCing you since you expressed interest on the numpy
>> mailing list, and I think your insights as a Theano developer can be
>> very helpful in this discussion.
>> User Interface
>> Besides simple array expressions for dense arrays I would like a
>> mechanism for "custom ufuncs", although to a different extent to what
>> Numpy or Numba provide. There are several ways in which we could want
>> them, e.g. as typed functions (cdef, or external C) functions, as
>> lambas or Python functions in the same module, or as general objects
>> (e.g. functions Cython doesn't know about).
>> To achieve maximum efficiency it will likely be good to allow sharing
>> these functions in .pxd files. We have 'cdef inline' functions, but I
>> would prefer annotated def functions where the parameters are
>> specialized on demand, e.g.
>> def add(a, b): # elemental functions can have any number of arguments
>> and operate on any compatible dtype
>> return a + b
>> When calling cdef functions or elemental functions with memoryview
>> arguments, the arguments perform a (broadcasted) elementwise
>> operation. Alternatively, we can have a parallel.elementwise function
>> which maps the function elementwise, which would also work for object
>> callables. I prefer the former, since I think it will read much
>> Secondly, we can have a reduce function (and maybe a scan function),
>> that reduce (respectively scan) in a specified axis or number of axes.
>> parallel.reduce(add, a, b, axis=(0, 2))
>> where the default for axis is "all axes". As for the default value,
>> this could be perhaps optionally provided to the elemental decorator.
>> Otherwise, the reducer will have to get the default values from each
>> dimension that is reduced in, and then skip those values when
>> reducing. (Of course, the reducer function must be associate and
>> commutative). Also, a lambda could be passed in instead of an
> Only associative, right?
> Sounds good to me.
>> elementwise or typed cdef function.
>> Finally, we would have a parallel.nditer/ndenumerate/nditerate
>> function, which would iterate over N memoryviews, and provide a
>> sensible memory access pattern (like numpy.nditer). I'm not sure if it
>> should provide only the indices, or also the values. e.g. an inplace
>> elementwise add would read as follows:
>> for i, j, k in parallel.nditerate(A, B):
>> A[i, j, k] += B[i, j, k]
> I think this sounds good; I guess don't see a particular reason for
> "ndenumerate", I think code like the above is clearer.
I'm assuming the index computations would not be re-done in this case
(i.e. there's more magic going on here than looks like at first
glance)? Otherwise there is an advantage to ndenumerate.
More information about the cython-devel