[Numpy-discussion] Counting array elements

Thu Nov 4 14:58:05 EST 2004

Todd Miller wrote:
>>What I was suggesting is that
>>there should be an API for accessing the elements of an array that 
>>doesn't rely on the standard strides approach. I guess I'm expressing my 
>>disappointment that PyArrays don't follow one of the axioms of Object 
>>Oriented Programming: Encapsulation. I should be able to get element 
>>(i,j) of an array without knowing the data structures used to store the 
>>data. 
> 
> (I think) numarray has what you're talking about:  the "element-wise
> API".  It's documented in the manual but AFAIK is fairly slow and
> probably not widely used.
> 

Well, the "fairly slow" is the issue, along with the "not widely used".

>>If we had that, then there could be a 1-d "flat" array that 
>>supported discontiguous arrays in a different way than the strides 
>>approach, while sharing the same data block as the parent N-d array.
> 
> 
> The numarray "element-wise" API makes use of strides internally in order
> to access array elements;  it does, however, hide what it's doing.

I'm no C wiz, but since they are macros, it looks to me like they still 
depend on the PyArrayObject that is passed in storing its data with 
strides, etc., so they couldn't be used with an object that has a 
different storage scheme.
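
To make the dependence on strides concrete, here is a minimal sketch of 
what hand-written strided access looks like. The struct and its field 
names are made up for illustration; they are an assumption, not the real 
PyArrayObject layout or the numarray element-wise API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a strided 2-d array of doubles.
   This is NOT the real PyArrayObject; the names are illustrative only. */
typedef struct {
    char *data;            /* start of the shared data block */
    ptrdiff_t strides[2];  /* byte step per index along each axis */
    size_t shape[2];       /* number of elements along each axis */
} Strided2d;

/* Fetch element (i, j) with explicit pointer arithmetic. Any code written
   like this only works for storage that can be described by strides. */
static double strided2d_get(const Strided2d *a, size_t i, size_t j)
{
    assert(i < a->shape[0] && j < a->shape[1]);
    return *(const double *)(a->data
                             + (ptrdiff_t)i * a->strides[0]
                             + (ptrdiff_t)j * a->strides[1]);
}
```

A transposed view of the same data block only needs the two strides 
swapped, which is the appeal of the scheme, but a macro or function 
written this way has the storage layout baked in.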

> I don't understand the approach you're suggesting here though.  Can you
> elaborate?

What I'm getting at is classic OO polymorphism: An Array class that has 
a GetElement1d(i) method that returns the element. This class could then 
be replaced with another class that uses a completely different internal 
storage mechanism, but still has a GetElement1d(i) method. I know we're 
working with C, rather than C++ here, but I think this kind of thing 
could be faked with enough typecasting. On the other hand, I don't know 
what the heck I'm talking about. I'm no C wiz.
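
As a rough sketch of how this could be faked in C, one common idiom is a 
struct of function pointers placed at the start of each concrete array 
type. Everything below (Array1d, StridedArray, IndexedArray, and the 
helper names) is hypothetical and is not part of Numeric or numarray; 
only the GetElement1d name comes from the idea above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common interface: callers only ever see this header. */
typedef struct Array1d Array1d;
struct Array1d {
    size_t length;
    double (*get_element_1d)(const Array1d *self, size_t i);
};

/* Storage scheme 1: one stride (counted in elements) over a data block. */
typedef struct {
    Array1d base;        /* must be the first member so the casts work */
    const double *data;
    ptrdiff_t stride;
} StridedArray;

static double strided1d_get(const Array1d *self, size_t i)
{
    const StridedArray *a = (const StridedArray *)self;
    return a->data[(ptrdiff_t)i * a->stride];
}

/* Storage scheme 2: an index table into a shared data block, which can
   describe element orderings that a single stride cannot. */
typedef struct {
    Array1d base;
    const double *data;
    const size_t *index;
} IndexedArray;

static double indexed1d_get(const Array1d *self, size_t i)
{
    const IndexedArray *a = (const IndexedArray *)self;
    return a->data[a->index[i]];
}

/* The "method": dispatches through the function pointer, so the caller
   never touches strides or index tables directly. */
static double GetElement1d(const Array1d *a, size_t i)
{
    assert(i < a->length);
    return a->get_element_1d(a, i);
}

/* Example client code that works with either storage scheme. */
static double sum_all(const Array1d *a)
{
    double total = 0.0;
    for (size_t i = 0; i < a->length; i++)
        total += GetElement1d(a, i);
    return total;
}
```

Note that every element access here goes through an indirect function 
call, which is exactly the kind of overhead that hand-written pointer 
arithmetic avoids.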

Your comment about performance above is key, however. If this approach 
has worse performance than doing pointer arithmetic by hand with 
Array->strides et al, then it wouldn't get used universally, and we'd be 
back where we started. I know even less C++ than C, but I think perhaps 
the only way to get this with adequate performance would be to do a lot 
of C++ template magic, like Blitz++.

In the early days of numarray development, there was discussion about 
using Blitz++ (or other nifty C++ template based arrays) as the basis 
for numarray. I think it all really boiled down to the fact that the 
template magic required was not well supported by enough compilers, so 
it couldn't be used. I think that's a shame: while I haven't used C++ 
much, templates and iterators look very appealing, and much better than 
all the hassles of pointer arithmetic and static typing in C.

>>Anyway, I'm just dreaming, I suppose, we're pretty committed to the 
>>current approach!
> 
> Good ideas have a way of getting adopted, so dream on...
> 

Well, yes, but the core API of numarray is pretty well established by now.

>>Very cool! I'm still using Numeric, but I think next time I need to 
>>write my own Ufunc extension, this may be what converts me!

By the way, the two reasons I still use Numeric, other than inertia, are:

1) slower small array performance: I use arrays a lot for the 
convenience, rather than just when I have large arrays and need the 
performance.

2) Much slower performance when passing arrays into wxPython, due to 
wxPython using the generic sequence interface, which is apparently much 
slower with numarray than Numeric. Has this changed?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov