[Numpy-discussion] Counting array elements
Peter Verveer
verveer at embl-heidelberg.de
Mon Oct 25 11:04:01 EDT 2004
On 25 Oct 2004, at 19:32, Russell E Owen wrote:
> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>
>>>
>>>> I'm not sure how feasible it is, but I'd much rather an efficient,
>>>> non-copying, 1-D view of a noncontiguous array (from an enhanced
>>>> version of flat or ravel or whatever) than a bunch of extra
>>>> methods. The former allows all of the standard methods to just work
>>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min,
>>>> etc]. Making special whole array methods for everything just leads
>>>> to method explosion.
>>>
>>> I completely agree with this ... an efficient flat/ravel would seem
>>> to solve many of the issues being raised. Forgive the potentially
>>> naive question here, but is there any reason such an efficient,
>>> enhanced view can't be implemented for the .flat method?
>>
>> I believe it is not possible without copying data. The strides
>> between elements of a noncontiguous array are not always the same, so
>> you cannot efficiently view it as a 1D array.
>
> How about providing an iterator that counts through all the elements
> of an array (e.g. arr.itervalues()). So long as C extensions could
> efficiently make use of such an iterator, I think it'd do the job.
It would still be slower, because you would need a function call at
each element to return a value. That is not a problem if you do a lot of
work at each element, but if you are just adding values you want a
custom-written C function. You can do it at the C level with macros or
the like (I do that in nd_image), but that would not help at the Python
level.
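
To make the cost difference concrete, here is a minimal sketch. It assumes
current NumPy rather than the numarray discussed in this thread, and the
array shapes and variable names are made up for illustration. It builds a
noncontiguous view, shows that flattening it requires a copy, and sums it
both with a Python-level iterator and with a compiled loop.

```python
import numpy as np

# A 4x6 array and a strided (noncontiguous) view of it.
base = np.arange(24, dtype=np.int64).reshape(4, 6)
view = base[::2, ::3]  # rows 0 and 2, columns 0 and 3

# The view shares memory with base but is not contiguous: its elements
# sit at byte offsets 0, 24, 96, 120, so no single stride describes them.
is_view = np.shares_memory(base, view)
is_contiguous = view.flags["C_CONTIGUOUS"]

# ravel() therefore has to copy the data to produce a 1-D array.
flat_copy = np.ravel(view)
ravel_copied = not np.shares_memory(view, flat_copy)

# Python-level iteration: one interpreter step per element.
total_iter = 0
for value in view.flat:
    total_iter += int(value)

# Compiled loop: the strides are walked in C.
total_c = int(view.sum())
```

Both totals agree; the difference is only in where the per-element loop
runs, which is the overhead described above.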
> One could also imagine:
> - arr.iteritems(), which returned (index, value) for each item
> - a mask argument: a boolean array the same shape as the data array;
> True means elide the corresponding value from the data array
> - general support for indexing
Essentially you are suggesting that we expose iterators at the Python level
that iterate over an array in some predefined way. That is possible,
but I doubt it will be efficient.

At the C level, however, it might be worth thinking about as a way to
make writing functions in C easier. I proposed doing it the other way around
in an earlier mail: providing a set of generic functions that take a
Python or a C function to be applied at each element. I will most likely
implement something in that direction, but I should also give your idea
some thought.
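
A rough sketch of what such a generic function could look like at the
Python level follows. It assumes current NumPy, not numarray; the helper
name apply_at_each_element and its signature are hypothetical and do not
correspond to anything in numarray or nd_image. It combines the two ideas
above: the caller supplies a function, and the helper visits each
(index, value) pair, optionally skipping masked elements.

```python
import numpy as np

def apply_at_each_element(arr, func, mask=None):
    """Hypothetical helper: call func(index, value) for every element.

    If mask is given it must be a boolean array with arr's shape;
    True means the corresponding element is skipped.
    Results are returned as a list in C (row-major) order.
    """
    arr = np.asarray(arr)
    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        if mask.shape != arr.shape:
            raise ValueError("mask must have the same shape as arr")
    results = []
    # np.ndenumerate yields (index tuple, value) pairs, and works on
    # noncontiguous arrays without flattening them first.
    for index, value in np.ndenumerate(arr):
        if mask is not None and mask[index]:
            continue
        results.append(func(index, value))
    return results

data = np.array([[1, 2, 3], [4, 5, 6]])
skip = np.array([[False, True, False], [False, False, True]])
kept = apply_at_each_element(data, lambda idx, v: int(v), mask=skip)
total = sum(kept)
```

Because the callback runs once per element in the interpreter, this form
carries exactly the per-element overhead mentioned earlier; a C callback
would avoid it.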
> More generally, I agree that sum should work the same as a function
> and a method, and that an extra axis argument could be a good thing
> (it is so common elsewhere, e.g. size). I'd be tempted to break
> backwards compatibility to fix this, since numarray is still new and
> the current situation is very confusing.
I would absolutely vote for such a change, simply because we would like
a range of such functions, e.g. minimum, maximum, and so on. Even if we
have to leave sum() as it is, I think we should have the alternatives;
we would just have to come up with an alternative name for sum(). In
fact, I would consider volunteering to implement these functions.
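
For illustration, this is roughly the behaviour being asked for: the same
reduction available as a function and as a method, with an optional axis
argument. The sketch uses current NumPy, which is an assumption on my
part; it does not show how numarray behaved at the time of this thread.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Whole-array reduction: function and method give the same result.
total_func = int(np.sum(a))
total_meth = int(a.sum())

# Reduction along one axis, again as function or method.
col_sums = np.sum(a, axis=0)   # one value per column
row_sums = a.sum(axis=1)       # one value per row

# The same pattern applies to minimum and maximum.
col_min = a.min(axis=0)
row_max = np.max(a, axis=1)
```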
Peter