nditer: possible to manually handle dimensions with different lengths?
Using nditer, is it possible to manually handle dimensions with different lengths? For example, let's say I had an array A[5, 100] and I wanted to sample every 10th element along the second axis, so I would end up with an array B[5,10]. Is it possible to do this with nditer, handling the iteration over the second axis manually of course (probably in cython)?

I want something like this (modified from http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#putting-the-inn... ):

    @cython.boundscheck(False)
    def sum_squares_cy(arr):
        cdef np.ndarray[double] x
        cdef np.ndarray[double] y
        cdef int size
        cdef double value
        cdef int j

        axeslist = list(arr.shape)
        axeslist[1] = -1

        out = zeros((arr.shape[0], 10))
        it = np.nditer([arr, out],
                       flags=['reduce_ok', 'external_loop', 'buffered', 'delay_bufalloc'],
                       op_flags=[['readonly'], ['readwrite', 'no_broadcast']],
                       op_axes=[None, axeslist],
                       op_dtypes=['float64', 'float64'])
        it.operands[1][...] = 0
        it.reset()
        for xarr, yarr in it:
            x = xarr
            y = yarr
            size = x.shape[0]
            j = 0
            for i in range(size):
                #some magic here involving indexing into x[i] and y[j]
        return it.operands[1]

Does this make sense? Is it possible to do?
On Fri, Sep 30, 2011 at 8:03 AM, John Salvatier <jsalvati@u.washington.edu> wrote:
I'm not sure I understand precisely what you're asking. Maybe you could reshape A to have shape [5, 10, 10], so that one of those 10's can match up with the 10 in B, perhaps with the op_axes? -Mark
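The reshape suggestion can be sketched roughly as follows. This is only an illustration under assumptions: the names `A`, `A3`, `B_sampled` and `B_sum` are made up here, and the `op_axes` reduction follows the reduction pattern from the nditer documentation rather than anything posted in this thread.

```python
import numpy as np

# Hypothetical input: 5 rows, 100 columns.
A = np.arange(5 * 100, dtype=np.float64).reshape(5, 100)

# Split the long axis into 10 blocks of 10 elements each.
A3 = A.reshape(5, 10, 10)

# "Sample every 10th element": take the first element of each block.
B_sampled = A3[:, :, 0]

# Reduce over the last axis with nditer; -1 in op_axes marks the axis
# that the output operand does not have (it is broadcast/reduced).
B_sum = np.zeros((5, 10))
it = np.nditer([A3, B_sum],
               flags=['reduce_ok'],
               op_flags=[['readonly'], ['readwrite']],
               op_axes=[None, [0, 1, -1]])
with it:
    for x, y in it:
        y[...] += x
```

Here `B_sampled` matches `A[:, ::10]`, and `B_sum` matches `A3.sum(axis=2)`. The reshape only works when the long axis divides evenly into blocks.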
I apologize, I picked a poor example of what I want to do. Your suggestion would work for the example I provided, but not for a more complex example. My actual task is something like a "group by" operation along a particular axis (with a known number of groups).

Let me try again: what I would like to be able to do is to specify some of the iterator dimensions to be handled manually by me. For example, let's say I have some kind of a 2d smoothing algorithm. If I start with an array of shape [a,b,c,d] and I'd like to do the 2d smoothing over the 2nd and 3rd dimensions, I'd like to be able to tell nditer to do normal broadcasting and iteration over the 1st and 4th dimensions but leave iteration over the 2nd and 3rd dimensions to me and my algorithm. Each iteration of nditer would give me a 2d array to which I apply my algorithm. This way I could write more arbitrary functions that operate on arrays and support broadcasting.

Is that clearer?

On Fri, Sep 30, 2011 at 5:04 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
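For comparison, the behaviour described above (loop over the 1st and 4th dimensions, receive a 2d array over the 2nd and 3rd) can be approximated without nditer at all. The sketch below is an assumption-laden illustration: `apply_over_2d` and its parameters are hypothetical names, it does no broadcasting between several operands, and it assumes the 2d function returns an array of the same shape as its input.

```python
import numpy as np

def apply_over_2d(arr, func, axes=(1, 2)):
    """Apply func to every 2d slice spanned by `axes` (hypothetical helper)."""
    # Move the two "manual" axes to the end so the leading axes can be
    # looped over generically.
    moved = np.moveaxis(arr, axes, (-2, -1))
    out = np.empty(moved.shape, dtype=float)
    # np.ndindex visits every index combination of the leading axes.
    for idx in np.ndindex(moved.shape[:-2]):
        out[idx] = func(moved[idx])
    # Put the two axes back where they came from.
    return np.moveaxis(out, (-2, -1), axes)

# Usage: subtract the mean of each 2d slice (a stand-in for "smoothing").
a = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)
result = apply_over_2d(a, lambda m: m - m.mean())
```

The Python-level loop makes this slower than an iterator-based version for many small slices, so it is mainly useful as a reference implementation.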
On Sat, Oct 1, 2011 at 1:45 PM, John Salvatier <jsalvati@u.washington.edu> wrote:
Maybe this will work for you:

    In [15]: a = np.arange(2*3*4*5).reshape(2,3,4,5)

    In [16]: it0, it1 = np.nested_iters(a, [[0,3], [1,2]], flags=['multi_index'])

    In [17]: for x in it0:
       ....:     print it1.itviews[0]
       ....:
    [[ 0  5 10 15]
     [20 25 30 35]
     [40 45 50 55]]
    [[ 1  6 11 16]
     [21 26 31 36]
     [41 46 51 56]]
    [[ 2  7 12 17]
     [22 27 32 37]
     [42 47 52 57]]
    [[ 3  8 13 18]
     [23 28 33 38]
     [43 48 53 58]]
    [[ 4  9 14 19]
     [24 29 34 39]
     [44 49 54 59]]
    [[ 60  65  70  75]
     [ 80  85  90  95]
     [100 105 110 115]]
    [[ 61  66  71  76]
     [ 81  86  91  96]
     [101 106 111 116]]
    [[ 62  67  72  77]
     [ 82  87  92  97]
     [102 107 112 117]]
    [[ 63  68  73  78]
     [ 83  88  93  98]
     [103 108 113 118]]
    [[ 64  69  74  79]
     [ 84  89  94  99]
     [104 109 114 119]]

Cheers,
Mark
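The session above uses the Python 2 `print` statement. A Python 3 version of the same idea might look like the sketch below; collecting copies into a list named `views` is an addition made here for illustration, since each `itviews` result refers to the iterator's current position.

```python
import numpy as np

a = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Outer iterator walks axes 0 and 3; inner iterator covers axes 1 and 2.
it0, it1 = np.nested_iters(a, [[0, 3], [1, 2]], flags=['multi_index'])

views = []
for _ in it0:
    # itviews[0] is the 2d view for the current outer position; copy it
    # so it stays valid after the outer iterator advances.
    views.append(it1.itviews[0].copy())

print(len(views), views[0].shape)
```

Each collected view corresponds to `a[i, :, :, l]` for one pair of outer indices `(i, l)`, visited with `l` varying fastest.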
Thanks Mark! I think that's exactly what I'm looking for. We even had a previous discussion about this (oops!) (http://mail.scipy.org/pipermail/numpy-discussion/2011-January/054421.html).

I didn't find any documentation; I will try to add some once I understand better how it works.

John

On Sat, Oct 1, 2011 at 2:53 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
Some observations and questions about nested_iters. nested_iters seems to require that all input arrays have the same number of dimensions (so you will have to pad some input shapes with 1s). Is there a way to specify how the axes are matched up, like there is for nditer?

When I try to run the following program,

    @cython.boundscheck(False)
    def vars(vals, group, axis ):
        cdef np.ndarray[double, ndim = 2] values
        cdef np.ndarray[long long, ndim = 2] groups
        cdef np.ndarray[double, ndim = 2] outs
        cdef int size
        cdef double value
        cdef int i, j
        cdef long long cgroup
        cdef double min
        cdef double max
        cdef double open

        oshape = list(vals.shape)
        bins = len(np.unique(group))
        oshape = oshape+[bins]
        oshape[axis] = 1
        out = np.empty(tuple(oshape))

        axes = range(vals.ndim)
        axes.remove(axis)

        gshape = [1] * len(oshape)
        gshape[axis] = len(group)
        group.shape = gshape

        vals = vals[...,np.newaxis]

        it0, it1 = np.nested_iters([vals,group, out],
                                   [axes, [axis,len(oshape) -1]],
                                   op_dtypes=['float64', 'int64', 'float64'],
                                   flags = ['multi_index', 'buffered'])
        size = vals.shape[axis]
        for x in it0:
            values, groups, outs = it1.itviews

            j = -1
            for i in range(size):
                if cgroup != groups[i,0]:
                    if j != -1:
                        outs[0,j] = garmanklass(open, values[i,0], min, max)
                    cgroup = groups[i,0]
                    min = inf
                    max = -inf
                    open = values[i,0]
                    j += 1

                min = fmin(min, values[i,0])
                max = fmax(max, values[i,0])

            outs[0,j+1] = garmanklass(open, values[size -1], min, max)
        return out

I get an error

    File "comp.pyx", line 58, in varscale.comp.vars (varscale\comp.c:1565)
        values, groups, outs = it1.itviews
    ValueError: cannot provide an iterator view when buffering is enabled

which I am not sure how to deal with. Any advice?

What I am trying to do here is a "grouped" calculation (the groups are specified by the group argument) on the values along the given axis. I try to use nested_iters to iterate over the specified axis and a new axis (whose length is the number of groups) separately so I can do my calculation.
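The error can be reproduced on a much smaller case. The sketch below uses a plain `np.nditer` rather than `nested_iters`, which is an assumption made for brevity; the array `a` and the variable names are made up for illustration. It also shows that requesting a dtype conversion through the `'copy'` operand flag, without `'buffered'`, still allows `itviews`.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # integer input

# With buffering enabled, asking for itviews raises ValueError.
it_buf = np.nditer(a, flags=['buffered'])
try:
    it_buf.itviews
    buffered_raised = False
except ValueError:
    buffered_raised = True

# Without buffering, a dtype conversion can instead be requested with
# the 'copy' operand flag, and itviews is then available.
it_copy = np.nditer(a,
                    op_flags=[['readonly', 'copy']],
                    op_dtypes=['float64'])
views = it_copy.itviews

print(buffered_raised, it_copy.dtypes[0], len(views))
```

The trade-off is that `'copy'` may materialise a full converted copy of the operand up front, whereas buffering converts in chunks.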
On Mon, Oct 3, 2011 at 9:03 AM, John Salvatier <jsalvati@u.washington.edu> wrote:
I ended up fixing my problem by removing the 'buffered' flag and adding the 'copy' flag to each of the input arrays.

I think that nested_iters might be improved by an operand axes specification for each layer of nesting like nditer uses, though I suppose that 3 layers of nesting might be confusing for users.

I get an "array too big" error on "values, groups, outs = it1.itviews" when the shape of the iterator is larger than ~(4728, 125285), even if each of the arrays should only have actual size along one dimension.

Code:

    @cython.boundscheck(False)
    def groupwise_accumulate(vals, group, axis ):
        cdef np.ndarray[double, ndim = 2] values
        cdef np.ndarray[long, ndim = 2] groups
        cdef np.ndarray[double, ndim = 2] outs
        cdef int size
        cdef long g
        cdef int i, j

        #copy so that swapping the axis doesn't mess up the original arrays
        vals = vals.copy()
        group = group.copy()

        #add a dimension to match up with the new dimension and swap the given axis to the end
        vals.shape = vals.shape + (1,)
        vaxes = range(vals.ndim)
        vaxes.append(axis)
        vaxes.remove(axis)
        vals = np.transpose(vals, vaxes)
        vals = vals.copy()

        #the output should have the same shape as the values except along the
        #last two axes (which are the given axis and the new axis)
        oshape = list(vals.shape)
        bins = len(np.unique(group))
        oshape[-1] = 1
        oshape[-2] = bins
        out = np.empty(tuple(oshape))

        #line up grouping with the given axis
        gshape = [1] * (len(oshape) - 1) + [vals.shape[-1]]
        group.shape = gshape

        #nested iterator should go along the last two axes
        axes = range(vals.ndim)
        axes0 = axes[:-2]
        axes1 = axes[-2:]
        it0, it1 = np.nested_iters([vals,group, out],
                                   [axes0, axes1],
                                   op_dtypes=['float64', 'int32', 'float64'],
                                   op_flags = [['readonly', 'copy'], ['readonly','copy'], ['readwrite']],
                                   flags = ['multi_index', 'reduce_ok' ])
        size = vals.shape[-1]
        for x in it0:
            values, groups, outs = it1.itviews

            i = 0
            j = 0
            while i < size:
                g = groups[0,i]
                #accumulation initialization

                while i < size and groups[0,i] == g:
                    #groupwise accumulation
                    i += 1

                outs[j,0] = calculation()
                j += 1

        #swap back the new axis to the original location of the given axis
        out.shape = out.shape[:-1]
        oaxes = range(vals.ndim -1)
        oaxes.insert(axis, out.ndim-1)
        oaxes = oaxes[:-1] #remove the now reduced original given axis
        out = np.transpose(out, oaxes)

        return out

On Mon, Oct 3, 2011 at 2:03 PM, John Salvatier <jsalvati@u.washington.edu> wrote:
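For simple per-group reductions, a different technique than the nested iterator above may be enough: `ufunc.reduceat` along the chosen axis. The sketch below is not the code from this thread; `grouped_max` is a hypothetical helper, and it assumes the group labels are already sorted so that each group forms one contiguous run, as the `while` loops in the posted code also appear to assume.

```python
import numpy as np

def grouped_max(vals, group, axis=-1):
    """Maximum of vals within each contiguous run of equal group labels."""
    group = np.asarray(group)
    # Start index of every run: position 0 plus every position where the
    # label differs from the previous one.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(group)) + 1))
    # reduceat reduces vals[starts[k]:starts[k+1]] along the given axis;
    # the last run extends to the end of the axis.
    return np.maximum.reduceat(vals, starts, axis=axis)

# Usage: two groups along the last axis.
vals = np.array([[1, 5, 2, 7, 3],
                 [9, 0, 4, 4, 8]])
group = np.array([0, 0, 1, 1, 1])
result = grouped_max(vals, group, axis=1)
```

Calculations that need several running quantities per group at once (such as open, min and max together) do not map onto a single ufunc, which is where a hand-written inner loop like the one posted remains useful.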
participants (2)
- John Salvatier
- Mark Wiebe