Re: [Numpy-discussion] Counting array elements

25 Oct 2004

      On 25 Oct 2004, at 19:32, Russell E Owen wrote:
...
At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
...
On 25 Oct 2004, at 18:51, Gary Strangman wrote:
...
...
I'm not sure how feasible it is, but I'd much rather an efficient, 
non-copying, 1-D view of a noncontiguous array (from an enhanced 
version of flat or ravel or whatever) than a bunch of extra 
methods. The former allows all of the standard methods to just work 
efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
etc]. Making special whole array methods for everything just leads 
to method explosion.
I completely agree with this ... an efficient flat/ravel would seem 
to solve many of the issues being raised. Forgive the potentially 
naive question here, but is there any reason such an efficient, 
enhanced view can't be implemented for the .flat method?
I believe it is not possible without copying data. The strides 
between elements of a noncontiguous array are not always the same, so 
you cannot efficiently view it as a 1D array.
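
To make the stride argument concrete, here is a minimal sketch. It assumes present-day NumPy rather than the numarray of this thread, and the array names are made up for illustration. It shows a slice whose byte offsets, visited in flat order, are not evenly spaced, so flattening it has to copy.

```python
import numpy as np

# Assumption: modern NumPy, not the 2004 numarray API discussed in the thread.
a = np.arange(12, dtype=np.int64).reshape(3, 4)  # contiguous, strides (32, 8)
b = a[:, :3]                                     # noncontiguous view, same strides

# Byte offset of each element of b, visited in row-major (flat) order.
offsets = [i * b.strides[0] + j * b.strides[1]
           for i in range(b.shape[0]) for j in range(b.shape[1])]
steps = sorted(set(np.diff(offsets).tolist()))   # distinct gaps between neighbours

contiguous_ravel_is_view = np.shares_memory(a, a.ravel())
noncontiguous_ravel_is_view = np.shares_memory(b, b.ravel())
```

More than one step size appears in `steps`, so no single stride can describe `b` as a 1-D array, and `b.ravel()` does not share memory with `b`.
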
How about providing an iterator that counts through all the elements 
of an array (e.g. arr.itervalues())? So long as C extensions could 
efficiently make use of such an iterator, I think it'd do the job.
It would still be slower, because you would need a function call at 
each element that returns a value. Not a problem if you do a lot of 
work at each element, but if you are just adding values you want a 
custom-written C function. You can do it at the C level with macros or 
the like (I do that in nd_image), but that would not help at the python 
level.
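
As a rough illustration of that cost, the sketch below (again assuming present-day NumPy; `py_sum` is a hypothetical helper, not part of any library) sums a noncontiguous array once through a Python-level element iterator and once through the compiled reduction. Both give the same result, but the first pays for one interpreter step per element.

```python
import numpy as np

def py_sum(arr):
    """Sum by visiting every element from Python, one step per element."""
    total = 0
    for value in arr.flat:   # flat iterates without first copying the data
        total += value
    return total

a = np.arange(12, dtype=np.int64).reshape(3, 4)
b = a[:, :3]                 # noncontiguous view

slow = py_sum(b)             # element-by-element in the interpreter
fast = b.sum()               # a single call into compiled code
```
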
...
One could also imagine:
- arr.iteritems(), which returned (index, value) for each item
- a mask argument: a boolean array the same shape as the data array; 
True means elide the corresponding value from the data array
- general support for indexing
Essentially you are suggesting to expose iterators at the python level 
that iterate over an array in some predefined way. That is possible, 
but I doubt it will be efficient.
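
For comparison, present-day NumPy (an assumption here, rather than the numarray API of this thread) offers something close to the proposed iteritems() in `np.ndenumerate`, and a boolean mask can be applied by plain indexing. A minimal sketch, following the convention above that True means the value is elided:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# (index, value) pairs, similar in spirit to the proposed arr.iteritems().
items = [(idx, int(val)) for idx, val in np.ndenumerate(arr)]

# Mask with the same shape as arr; True means "leave this value out".
mask = np.array([[False, True], [False, False]])
kept = arr[~mask]            # 1-D array of the values that were not elided
kept_sum = int(kept.sum())
```
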

At the C level, however, it might be worth thinking about as a way to 
make writing functions in C easier. I proposed to do it the other way 
around in an earlier mail: providing a set of generic functions that 
take a python or a C function to be applied at each element. I most 
likely will implement something in that direction, but I should also 
give your idea some thought.
...
More generally, I agree that sum should work the same as a function 
and a method, and that an extra axis argument could be a good thing 
(it is so common elsewhere, e.g. size). I'd be tempted to break 
backwards compatibility to fix this, since numarray is still new and 
the current situation is very confusing.
I would absolutely vote for such a change, simply because we would like 
a range of such functions, e.g. minimum, maximum, and so on. Even if we 
have to leave sum() as it is, I think we should have the alternatives; 
we would just have to come up with an alternative name for sum(). In 
fact, I would consider volunteering to implement these functions.
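
What such a consistent interface could look like is sketched below using present-day NumPy, which is an assumption and not the numarray behaviour under discussion. There the function and method forms agree, and an optional axis argument selects the dimension to reduce.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

total_func = np.sum(a)           # function form, reduces over all elements
total_meth = a.sum()             # method form, same result
col_sums = a.sum(axis=0)         # reduce along the first axis
row_max = a.max(axis=1)          # the same axis argument works for max
row_min = np.min(a, axis=1)      # and for min, in function form
```
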

Peter