[Numpy-discussion] Counting array elements
Chris Barker
Chris.Barker at noaa.gov
Thu Nov 4 14:58:05 EST 2004
Todd Miller wrote:
>> What I was suggesting is that
>>there should be an API for accessing the elements of an array that
>>doesn't rely on the standard strides approach. I guess I'm expressing my
>>disappointment that PyArrays don't follow one of the axioms of Object
>>Oriented Programming: Encapsulation. I should be able to get element
>>(i,j) of an array without knowing the data structures used to store the
>>data.
>
> (I think) numarray has what you're talking about: the "element-wise
> API". It's documented in the manual but AFAIK is fairly slow and
> probably not widely used.
>
Well, the "fairly slow" part is the issue, along with the "not widely
used" part.
>>If we had that, then there could be a 1-d "flat" array that
>>supported discontiguous arrays in a different way than the strides
>>approach, while sharing the same data block as the parent N-d array.
>
>
> The numarray "element-wise" API makes use of strides internally in order
> to access array elements; it does, however, hide what it's doing.
I'm no C wiz, but since they are macros, it looks to me like they depend
very much on the PyArrayObject that is passed in storing its data with
strides, etc., so they couldn't be used with an object that has a
different storage scheme.
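
To illustrate the point about macros, here is a minimal sketch in C. It
assumes a simplified, hypothetical ToyArray2d struct (not the real
PyArrayObject layout) and a hypothetical TOY_GET_DOUBLE macro (not
numarray's actual element-wise API). The macro hides the pointer
arithmetic from the caller, but it is still hard-wired to the data and
strides fields, so it cannot work on an object that stores its data
some other way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified array object; NOT the real PyArrayObject. */
typedef struct {
    char *data;            /* start of the shared data block */
    ptrdiff_t dims[2];     /* number of elements per dimension */
    ptrdiff_t strides[2];  /* step in BYTES per dimension */
} ToyArray2d;

/* Hides the arithmetic, but only compiles for something that has
   data and strides fields laid out like this. */
#define TOY_GET_DOUBLE(a, i, j) \
    (*(double *)((a)->data + (i) * (a)->strides[0] + (j) * (a)->strides[1]))

/* Bounds-checked wrapper around the macro. */
double toy_get(const ToyArray2d *a, ptrdiff_t i, ptrdiff_t j)
{
    assert(i >= 0 && i < a->dims[0] && j >= 0 && j < a->dims[1]);
    return TOY_GET_DOUBLE(a, i, j);
}

/* Builds a discontiguous 3x2 view (every other column) of a 3x4 block
   and reads view element (2, 1), which is block[2][2]. */
double demo_every_other_column(void)
{
    static double block[3][4];
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 4; j++)
            block[i][j] = 10.0 * i + j;

    ToyArray2d view = {
        (char *)block,
        {3, 2},
        {(ptrdiff_t)(4 * sizeof(double)), (ptrdiff_t)(2 * sizeof(double))}
    };
    return toy_get(&view, 2, 1);
}
```

The view shares the parent's data block; only the dims and strides
differ.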
> I
> don't understand the approach you're suggesting here though. Can you
> elaborate?
What I'm getting at is classic OO polymorphism: An Array class that has
a GetElement1d(i) method that returns the element. This class could then
be replaced with another class that uses a completely different internal
storage mechanism, but still has a GetElement1d(i) method. I know we're
working with C, rather than C++ here, but I think this kind of thing
could be faked with enough typecasting. On the other hand, I don't know
what the heck I'm talking about. I'm no C wiz.
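
As a rough sketch of how that could be faked in C: each storage scheme
embeds a common header holding a function pointer, and GetElement1d
dispatches through it. All the names here (Array, StridedArray,
RangeArray, sum_all) are hypothetical and are not part of Numeric or
numarray; only GetElement1d comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common interface: each storage scheme supplies get1d. */
typedef struct Array Array;
struct Array {
    size_t size;
    double (*get1d)(const Array *self, size_t i);
};

/* Scheme 1: strided storage over a shared data block. */
typedef struct {
    Array base;        /* first member, so &s.base works as an Array* */
    const char *data;
    ptrdiff_t stride;  /* step in bytes */
} StridedArray;

double strided_get1d(const Array *self, size_t i)
{
    const StridedArray *a = (const StridedArray *)self;
    assert(i < self->size);
    return *(const double *)(a->data + (ptrdiff_t)i * a->stride);
}

/* Scheme 2: no data block at all; values are computed on demand. */
typedef struct {
    Array base;
    double start, step;
} RangeArray;

double range_get1d(const Array *self, size_t i)
{
    const RangeArray *a = (const RangeArray *)self;
    assert(i < self->size);
    return a->start + a->step * (double)i;
}

/* Callers use only this; they never see the storage scheme. */
double GetElement1d(const Array *a, size_t i)
{
    return a->get1d(a, i);
}

/* Example client code written purely against the interface. */
double sum_all(const Array *a)
{
    double total = 0.0;
    for (size_t i = 0; i < a->size; i++)
        total += GetElement1d(a, i);
    return total;
}
```

Note that every element access goes through an indirect function call,
which is probably where the performance cost discussed below would
come from.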
Your comment about performance above is key, however. If this approach
has worse performance than doing pointer arithmetic by hand with
Array->strides et al, then it wouldn't get used universally, and we'd be
back where we started. I know even less C++ than C, but I think perhaps
the only way to get this with adequate performance would be to do a lot
of C++ template magic, like Blitz++.
In the early days of numarray development, there was discussion about
using Blitz++ (or other nifty C++ template based arrays) as the basis
for numarray. I think it all really boiled down to this: the template
magic required was not well supported by enough compilers, so it
couldn't be used. I think that's a shame: while I haven't used C++ much,
templates and iterators and all look very appealing, and much better
than all the hassles of pointer arithmetic and static typing in C.
>>Anyway, I'm just dreaming, I suppose, we're pretty committed to the
>>current approach!
>
> Good ideas have a way of getting adopted, so dream on...
>
Well, yes, but the core API of numarray is pretty well established by now.
>>Very cool! I'm still using Numeric, but I think next time I need to
>>write my own Ufunc extension, this may be what converts me!
By the way, the two reasons I still use Numeric, other than inertia, are:
1) numarray's slower small-array performance: I use arrays a lot for the
convenience, rather than just when I have large arrays and need the
performance.
2) Much slower performance when passing arrays into wxPython, due to
wxPython using the generic sequence interface, which is apparently much
slower with numarray than Numeric. Has this changed?
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov