[Numpy-discussion] Counting array elements

Peter Verveer verveer at embl-heidelberg.de
Mon Oct 25 11:04:01 EDT 2004


On 25 Oct 2004, at 19:32, Russell E Owen wrote:

> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>
>>>
>>>>  I'm not sure how feasible it is, but I'd much rather an efficient, 
>>>> non-copying, 1-D view of a noncontiguous array (from an enhanced 
>>>> version of flat or ravel or whatever) than a bunch of extra 
>>>> methods. The former allows all of the standard methods to just work 
>>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>>>> etc]. Making special whole array methods for everything just leads 
>>>> to method explosion.
>>>
>>>  I completely agree with this ... an efficient flat/ravel would seem 
>>> to solve many of the issues being raised. Forgive the potentially 
>>> naive question here, but is there any reason such an efficient, 
>>> enhanced view can't be implemented for the .flat method?
>>
>> I believe it is not possible without copying data. The strides 
>> between elements of a noncontiguous array are not always the same, so 
>> you cannot efficiently view it as a 1D array.
>
> How about providing an iterator that counts through all the elements 
> of an array (e.g. arr.itervalues()). So long as C extensions could 
> efficiently make use of such an iterator, I think it'd do the job.

It would still be slower, because you would need a function call at 
each element to return a value. That is not a problem if you do a lot 
of work at each element, but if you are just adding values you want a 
custom-written C function. You can do it at the C level with macros or 
the like (I do that in nd_image), but that would not help at the Python 
level.
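
A minimal sketch of the two points above, assuming modern NumPy as a stand-in for numarray (the names base, view and python_level_sum are illustrative only). It shows that stepping through a noncontiguous view in row-major order needs more than one stride, that ravel() therefore has to copy, and that a Python-level loop over the elements gives the same total as a single C-level reduction, only with one interpreter-level step per element.

```python
import numpy as np

# A 3x4 array; dropping the last column gives a noncontiguous view.
base = np.arange(12, dtype=np.int64).reshape(3, 4)
view = base[:, :3]

# Byte offset of every element of the view, visited in row-major order.
offsets = [i * view.strides[0] + j * view.strides[1]
           for i in range(view.shape[0])
           for j in range(view.shape[1])]
# Distinct step sizes between consecutive elements: more than one value
# means no single stride can describe the view as a 1-D array.
steps = sorted(set(b - a for a, b in zip(offsets, offsets[1:])))

# ravel() on this view has to copy, so the result does not share memory.
flat = view.ravel()
copied = not np.shares_memory(flat, base)


def python_level_sum(arr):
    """Sum by visiting each element from Python (one step per element)."""
    total = 0
    for value in arr.flat:
        total += value
    return int(total)


loop_total = python_level_sum(view)   # per-element work in the interpreter
c_total = int(view.sum())             # one call, loop runs in compiled code
```

Both totals agree; the difference is only in where the loop runs, which is the efficiency concern raised above.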

> One could also imagine:
> - arr.iteritems(), which returned (index, value) for each item
> - a mask argument: a boolean array the same shape as the data array; 
> True means elide the corresponding value from the data array
> - general support for indexing

Essentially you are suggesting that we expose iterators at the Python 
level that iterate over an array in some predefined way. That is 
possible, but I doubt it will be efficient.
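
To make the proposal concrete, here is a rough pure-Python sketch of what such iterators could look like, written against NumPy rather than numarray. The names iteritems and itervalues come from the suggestion quoted above and are hypothetical: they are not existing array methods, and the mask semantics (True means skip the element) follow the quoted description.

```python
import numpy as np


def iteritems(arr, mask=None):
    """Yield (index, value) pairs in row-major order.

    Hypothetical helper: elements whose mask entry is True are skipped.
    """
    for index in np.ndindex(arr.shape):
        if mask is not None and mask[index]:
            continue
        yield index, arr[index]


def itervalues(arr, mask=None):
    """Yield only the values, using the same masking rule as iteritems."""
    for _, value in iteritems(arr, mask):
        yield value


data = np.array([[1, 2], [3, 4]])
mask = np.array([[False, True], [False, False]])  # elide the element 2

items = [(idx, int(val)) for idx, val in iteritems(data, mask)]
total = sum(int(v) for v in itervalues(data, mask))
```

Every element passes through the Python interpreter here, so this illustrates the interface, not a fast implementation.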

At the C level, however, it might be worth thinking about as a way to 
make writing functions in C easier. I proposed doing it the other way 
around in an earlier mail: providing a set of generic functions that 
take a Python or a C function to be applied at each element. I will 
most likely implement something in that direction, but I should also 
give your idea some thought.
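
The "other way around" design can be sketched as follows: the library owns the loop over the elements, and the caller only supplies the function applied at each one. This is an illustration in Python on top of NumPy, not the interface that was actually proposed or implemented; generic_reduce is a hypothetical name.

```python
import numpy as np


def generic_reduce(arr, func, initial):
    """Fold func over every element of arr in row-major order.

    Hypothetical driver: the traversal lives here, and func(acc, value)
    is the per-element callback supplied by the caller.
    """
    acc = initial
    for value in arr.flat:
        acc = func(acc, value.item())
    return acc


# Works on a noncontiguous view without the caller handling strides.
view = np.arange(12).reshape(3, 4)[:, :3]

total = generic_reduce(view, lambda acc, x: acc + x, 0)
largest = generic_reduce(view, max, view.flat[0].item())
```

With a Python callback each element still costs a function call; the design pays off mainly when the callback is a C function.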

> More generally, I agree that sum should work the same as a function 
> and a method, and that an extra axis argument could be a good thing 
> (it is so common elsewhere, e.g. size). I'd be tempted to break 
> backwards compatibility to fix this, since numarray is still new and 
> the current situation is very confusing.

I would absolutely vote for such a change, simply because we would like 
a range of such functions, e.g. minimum, maximum, and so on. Even if we 
have to leave sum() as it is, I think we should have the alternatives; 
we would just have to come up with an alternative name for sum(). In 
fact, I would consider volunteering to implement these functions.
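
For illustration, this is the kind of consistent behaviour being asked for, shown with present-day NumPy and not with the numarray of this thread (an assumption of the example): the function and method forms agree, and an optional axis argument selects between a whole-array and a per-axis reduction.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Whole-array reduction: no axis given.
whole = int(a.sum())

# Per-axis reductions; the function and method forms give the same result.
col_sums_method = a.sum(axis=0)
col_sums_function = np.sum(a, axis=0)
row_sums = np.sum(a, axis=1)

# The same convention carries over to minimum and maximum.
overall_max = int(a.max())
row_mins = a.min(axis=1)
```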

Peter