I ended up fixing my problem by removing the 'buffered' flag and adding the 'copy' op-flag to each of the read-only input arrays.
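
For illustration, here is that flag change in isolation on a toy pair of arrays (the shapes and names are made up; only the flags matter): the 'copy' op-flag lets the requested dtype cast be satisfied by a temporary copy instead of by buffering, so it1.itviews works again once 'buffered' is gone.

import numpy as np

a = np.arange(12, dtype='int32').reshape(3, 4)
out = np.empty((3, 4), dtype='float64')

# with flags=['multi_index', 'buffered'] the itviews line below raises
# "ValueError: cannot provide an iterator view when buffering is enabled"
it0, it1 = np.nested_iters(
    [a, out], [[0], [1]],
    op_dtypes=['float64', 'float64'],
    op_flags=[['readonly', 'copy'], ['readwrite']],
    flags=['multi_index'])

for _ in it0:
    row_in, row_out = it1.itviews      # 1-d views along axis 1
    row_out[...] = row_in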

I think nested_iters might be improved by an operand-axes specification for each layer of nesting, like nditer's op_axes, though I suppose that three layers of nesting might be confusing for users.

I get an "array too big error" on "values, groups, outs = it1.itviews" when the shape of the iterator is larger than ~(4728, 125285) even if each of the arrays should only have actual size along one dimension.

Code:

@cython.boundscheck(False)
def groupwise_accumulate(vals, group, axis):
    cdef np.ndarray[double, ndim=2] values
    cdef np.ndarray[long, ndim=2] groups
    cdef np.ndarray[double, ndim=2] outs
    cdef int size
    cdef long g
    cdef int i, j
    # copy so that swapping the axis doesn't mess up the original arrays
    vals = vals.copy()
    group = group.copy()
    # add a dimension to match up with the new group axis and swap the given axis to the end
    vals.shape = vals.shape + (1,)
    vaxes = range(vals.ndim)
    vaxes.append(axis)
    vaxes.remove(axis)
    vals = np.transpose(vals, vaxes)
    vals = vals.copy()

    # the output should have the same shape as the values except along the
    # last two axes (which are the given axis and the new axis)
    oshape = list(vals.shape)
    bins = len(np.unique(group))
    oshape[-1] = 1
    oshape[-2] = bins
    out = np.empty(tuple(oshape))
    # line up the grouping with the given axis
    gshape = [1] * (len(oshape) - 1) + [vals.shape[-1]]
    group.shape = gshape
    # the nested iterator should go along the last two axes
    axes = range(vals.ndim)
    axes0 = axes[:-2]
    axes1 = axes[-2:]
    it0, it1 = np.nested_iters([vals, group, out],
                               [axes0, axes1],
                               op_dtypes=['float64', 'int32', 'float64'],
                               op_flags=[['readonly', 'copy'], ['readonly', 'copy'], ['readwrite']],
                               flags=['multi_index', 'reduce_ok'])
    size = vals.shape[-1]
    for x in it0:
        values, groups, outs = it1.itviews

        i = 0
        j = 0
        while i < size:
            g = groups[0, i]
            # accumulation initialization
            while i < size and groups[0, i] == g:
                # groupwise accumulation
                i += 1
            outs[j, 0] = calculation()
            j += 1
    # swap the new axis back to the original location of the given axis
    out.shape = out.shape[:-1]
    oaxes = range(vals.ndim - 1)
    oaxes.insert(axis, out.ndim - 1)
    oaxes = oaxes[:-1]  # remove the now-reduced original given axis
    out = np.transpose(out, oaxes)
    return out
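
For comparison, a pure-NumPy sketch of the same grouped reduction along an axis, with a plain sum standing in for calculation() (which is not shown above); something like this is handy as a reference to test the Cython version against.

import numpy as np

def groupwise_sum_reference(vals, group, axis):
    # sums each run of identical, consecutive group labels along `axis`
    v = np.asarray(vals, dtype='float64')
    g = np.asarray(group).ravel()
    v = np.swapaxes(v, axis, -1)                  # put the grouped axis last
    starts = np.flatnonzero(np.r_[True, g[1:] != g[:-1]])
    res = np.add.reduceat(v, starts, axis=-1)     # one slot per group
    return np.swapaxes(res, axis, -1)             # put the group axis where `axis` was

vals = np.arange(12.0).reshape(2, 6)
group = np.array([0, 0, 1, 1, 1, 2])
print(groupwise_sum_reference(vals, group, axis=1))
# [[  1.   9.   5.]
#  [ 13.  27.  11.]]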

On Mon, Oct 3, 2011 at 2:03 PM, John Salvatier <jsalvati@u.washington.edu> wrote:
Some observations and questions about nested_iters. nested_iters seems to require that all input arrays have the same number of dimensions (so you have to pad some input shapes with 1s). Is there a way to specify how the axes are matched up, the way op_axes does for nditer?
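
For example, the padding-with-1s step might look like this (toy shapes):

import numpy as np

a = np.arange(12.0).reshape(3, 4)       # 2-d operand
b = np.arange(4.0)                      # 1-d operand

# give b a length-1 axis so both operands have the same number of dimensions
b2 = b[np.newaxis, :]                   # shape (1, 4), broadcasts against (3, 4)

it0, it1 = np.nested_iters([a, b2], [[0], [1]], flags=['multi_index'])
for _ in it0:
    row_a, row_b = it1.itviews          # row_b repeats the same 4 values for every row of a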

When I try to run the following program,

@cython.boundscheck(False)
def vars(vals, group, axis):
    cdef np.ndarray[double, ndim=2] values
    cdef np.ndarray[long long, ndim=2] groups
    cdef np.ndarray[double, ndim=2] outs
    cdef int size
    cdef double value
    cdef int i, j
    cdef long long cgroup
    cdef double min
    cdef double max
    cdef double open
    oshape = list(vals.shape)
    bins = len(np.unique(group))
    oshape = oshape + [bins]
    oshape[axis] = 1
    out = np.empty(tuple(oshape))
    axes = range(vals.ndim)
    axes.remove(axis)
    gshape = [1] * len(oshape)
    gshape[axis] = len(group)
    group.shape = gshape
    vals = vals[..., np.newaxis]
    it0, it1 = np.nested_iters([vals, group, out],
                               [axes, [axis, len(oshape) - 1]],
                               op_dtypes=['float64', 'int64', 'float64'],
                               flags=['multi_index', 'buffered'])
    size = vals.shape[axis]
    for x in it0:
        values, groups, outs = it1.itviews

        j = -1
        for i in range(size):
            if cgroup != groups[i, 0]:
                if j != -1:
                    outs[0, j] = garmanklass(open, values[i, 0], min, max)
                cgroup = groups[i, 0]
                min = inf
                max = -inf
                open = values[i, 0]
                j += 1

            min = fmin(min, values[i, 0])
            max = fmax(max, values[i, 0])

        outs[0, j + 1] = garmanklass(open, values[size - 1, 0], min, max)
    return out

I get an error

  File "comp.pyx", line 58, in varscale.comp.vars (varscale\comp.c:1565)
    values, groups, outs = it1.itviews
ValueError: cannot provide an iterator view when buffering is enabled

I am not sure how to deal with this. Any advice?

What I am trying to do here is a "grouped" calculation (with the groups given by the group argument) on the values along the given axis. I am trying to use nested_iters to iterate over the specified axis and a new axis (whose length is the number of groups) separately, so I can do my calculation.
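
Concretely, the layout being described looks something like this toy example (contiguous group labels assumed, and a plain per-group sum standing in for the real calculation): the outer iterator walks the untouched axes, while the inner views pair the grouped axis with the new length-`bins` axis.

import numpy as np

vals = np.arange(10.0).reshape(2, 5)            # group along the last axis
group = np.array([0, 0, 1, 1, 1])               # 2 contiguous groups
bins = len(np.unique(group))

v = vals[:, np.newaxis, :]                      # (2, 1, 5): new group axis, then the data axis
g = group.reshape(1, 1, 5)                      # lines up with the data axis
out = np.zeros((2, bins, 1))                    # one output slot per group

it0, it1 = np.nested_iters(
    [v, g, out], [[0], [1, 2]],
    op_flags=[['readonly'], ['readonly'], ['readwrite']],
    flags=['multi_index', 'reduce_ok'])

for _ in it0:
    values, groups, outs = it1.itviews
    data = values[0]                            # the 5 data values at this outer position
    labels = groups[0]
    for j in range(bins):
        outs[j, 0] = data[labels == j].sum()

print(out[:, :, 0])
# [[  1.   9.]
#  [ 11.  24.]]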

On Mon, Oct 3, 2011 at 9:03 AM, John Salvatier <jsalvati@u.washington.edu> wrote:
Thanks, Mark! I think that's exactly what I'm looking for. We even had a previous discussion about this (oops!) (http://mail.scipy.org/pipermail/numpy-discussion/2011-January/054421.html).

I didn't find any documentation; I will try to add some once I understand how it works better.

John


On Sat, Oct 1, 2011 at 2:53 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
On Sat, Oct 1, 2011 at 1:45 PM, John Salvatier <jsalvati@u.washington.edu> wrote:
I apologize; I picked a poor example of what I want to do. Your suggestion would work for the example I provided, but not for a more complex one. My actual task is something like a "group by" operation along a particular axis (with a known number of groups).

Let me try again: what I would like is to be able to specify that some of the iterator dimensions are handled manually by me. For example, let's say I have some kind of 2d smoothing algorithm. If I start with an array of shape [a,b,c,d] and I'd like to do the 2d smoothing over the 2nd and 3rd dimensions, I'd like to be able to tell nditer to do normal broadcasting and iteration over the 1st and 4th dimensions, but leave iteration over the 2nd and 3rd dimensions to me and my algorithm. Each iteration of nditer would then give me a 2d array to which I apply my algorithm. This way I could write more general functions that operate on arrays and support broadcasting.

Is that clearer?
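
As a rough sketch of that pattern, using nested_iters (which Mark points to below) and a toy np.roll-based average standing in for the real 2d smoothing algorithm:

import numpy as np

def smooth2d(p):
    # toy 2-d "smoother": average the plane with its four shifted copies
    return (p + np.roll(p, 1, 0) + np.roll(p, -1, 0)
              + np.roll(p, 1, 1) + np.roll(p, -1, 1)) / 5.0

a = np.random.rand(2, 3, 4, 5)                     # shape [a, b, c, d]
out = np.empty_like(a)

# normal iteration over axes 0 and 3; the 2nd and 3rd dimensions are left
# to the algorithm, which sees them as a 2-d view each time around
it0, it1 = np.nested_iters([a, out], [[0, 3], [1, 2]],
                           op_flags=[['readonly'], ['readwrite']])
for _ in it0:
    plane_in, plane_out = it1.itviews              # each is a (3, 4) 2-d view
    plane_out[...] = smooth2d(plane_in)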

Maybe this will work for you:

In [15]: a = np.arange(2*3*4*5).reshape(2,3,4,5)

In [16]: it0, it1 = np.nested_iters(a, [[0,3], [1,2]], flags=['multi_index'])

In [17]: for x in it0:
   ....:     print it1.itviews[0]
   ....:     
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]
[[ 1  6 11 16]
 [21 26 31 36]
 [41 46 51 56]]
[[ 2  7 12 17]
 [22 27 32 37]
 [42 47 52 57]]
[[ 3  8 13 18]
 [23 28 33 38]
 [43 48 53 58]]
[[ 4  9 14 19]
 [24 29 34 39]
 [44 49 54 59]]
[[ 60  65  70  75]
 [ 80  85  90  95]
 [100 105 110 115]]
[[ 61  66  71  76]
 [ 81  86  91  96]
 [101 106 111 116]]
[[ 62  67  72  77]
 [ 82  87  92  97]
 [102 107 112 117]]
[[ 63  68  73  78]
 [ 83  88  93  98]
 [103 108 113 118]]
[[ 64  69  74  79]
 [ 84  89  94  99]
 [104 109 114 119]]

Cheers,
Mark


On Fri, Sep 30, 2011 at 5:04 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
On Fri, Sep 30, 2011 at 8:03 AM, John Salvatier <jsalvati@u.washington.edu> wrote:
Using nditer, is it possible to manually handle dimensions with different lengths?

For example, let's say I had an array A[5, 100] and I wanted to sample every 10 elements along the second axis, so I would end up with an array B[5, 10]. Is it possible to do this with nditer, handling the iteration over the second axis manually, of course (probably in Cython)?


@cython.boundscheck(False)
def sum_squares_cy(arr):
    cdef np.ndarray[double] x
    cdef np.ndarray[double] y
    cdef int size
    cdef double value
    cdef int j

    axeslist = list(arr.shape)
    axeslist[1] = -1

    out = np.zeros((arr.shape[0], 10))
    it = np.nditer([arr, out], flags=['reduce_ok', 'external_loop',
                                      'buffered', 'delay_bufalloc'],
                op_flags=[['readonly'], ['readwrite', 'no_broadcast']],
                op_axes=[None, axeslist],
                op_dtypes=['float64', 'float64'])
    it.operands[1][...] = 0
    it.reset()
    for xarr, yarr in it:
        x = xarr
        y = yarr
        size = x.shape[0]
        j = 0
        for i in range(size):
            # some magic here involving indexing into x[i] and y[j]
            pass
    return it.operands[1]

Does this make sense? Is it possible to do?

I'm not sure I understand precisely what you're asking. Maybe you could reshape A to have shape [5, 10, 10], so that one of those 10s can match up with the 10 in B, perhaps with op_axes?
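
For instance, if the per-block "sample" were simply a mean over each group of 10 columns, the reshape route might look like this; the middle length-10 axis is then available to reduce directly, or to loop over manually in Cython:

import numpy as np

A = np.random.rand(5, 100)
A3 = A.reshape(5, 10, 10)        # expose the blocks of 10 as their own axis
B = A3.mean(axis=2)              # shape (5, 10), one value per block of 10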

-Mark

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion