[Numpy-discussion] Counting array elements

Peter Verveer verveer at embl-heidelberg.de
Mon Oct 25 10:42:03 EDT 2004


> Stephen Walton wrote:
>>> - I'd like to write C/C++ code that would work on multiple array
>>> types.
>> I can't help much here, other than to say that C and C++ are pretty
>> low-level languages, not well suited for this level of abstraction.
>
> Well, this is certainly true for C, but not so much for C++. I'm not
> an expert, but C++ templates could be very handy here. When the
> numarray project was just getting started, there was some discussion
> about using a template-based array package as the base, perhaps
> Blitz++. I still think this was a great idea, but I think the biggest
> issue at the time was that templates were still not consistently well
> supported by the wide variety of compilers that numarray should work
> with. Personally I think that anything supported by gcc should be
> fine, as anyone can use gcc on virtually any platform, if they want.

I think having the option of using C++ would be cool. But as soon as we 
would 'require' it, I would not develop for numarray anymore. C++ is a 
big pain in my opinion, although I do agree that a well written 
templating system like Blitz++ is nice if you actually use C++.

> Anyway, it's too late to re-write numarray, but maybe a numarray <--> 
> blitz++ conversion package would make it easy to write numarray 
> extensions with blitz++. Perhaps even integrate it with Boost.Python. 
> Another option would be to write a template-based wrapper around the 
> existing Numarray objects.

Yes, it would be nice to have the option. There is no reason why there
could not be a C++ API which would include the use of templates layered 
on top of the current C API for those people that would like to use it.

> By the way, my other issue with extensions is the difficulty of 
> writing extensions that support discontinuous arrays, in addition to 
> multiple data types. It seems someone smarter than me could use C++ 
> classes to solve this one as well.

I had to deal with that problem too in nd_image. It is doable, albeit
ugly if you depend on plain C. Probably C++ could do it differently and
more nicely, and Blitz++ possibly does. Again, not for me.
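
To make the bookkeeping concrete, here is a minimal sketch of the
index arithmetic that extension code has to do for a discontinuous
array. It is written in Python for brevity and is not taken from
nd_image; the function name strided_sum is hypothetical, and strides
are assumed to be counted in elements rather than bytes.

```python
from itertools import product

def strided_sum(buffer, shape, strides, offset=0):
    """Sum the elements of a strided N-d view onto a flat buffer.

    Hypothetical helper for illustration: strides are given in
    elements (an assumption; C code would normally use bytes).
    """
    total = 0
    # Visit every N-d index and map it to a position in the flat buffer.
    for index in product(*(range(n) for n in shape)):
        position = offset + sum(i * s for i, s in zip(index, strides))
        total += buffer[position]
    return total

# A 3x4 row-major block stored flat.
buffer = list(range(12))

# Every second column of that block: shape (3, 2), strides (4, 2).
# The selected elements are not adjacent in memory.
result = strided_sum(buffer, (3, 2), (4, 2))
```

The same loop handles the contiguous case by passing strides (4, 1),
which is roughly why code that supports discontinuous arrays ends up
carrying shape and stride information everywhere.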

> Peter Verveer wrote:
>
>> But I do agree that it is not a good idea to introduce another set of
>> names. In my opinion functions that calculate a statistic like sum
>> should return the total in the first place, rather than over a single
>> axis.
>
> Absolutely not! I'm far more likely to want it over a single axis,
> it's the core of "vectorizing" your code. If the data all mean the
> same thing, why aren't you storing it in a 1-d array?

I agree that it is important, I am just saying that both are very
common operations. Why not support operations over an axis by an
optional argument? You will often have to specify which axis you want
anyway.
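
As a rough illustration of that interface, the sketch below uses
NumPy's sum, which is an assumption here rather than the numarray API
under discussion. The wrapper name total is hypothetical; the point is
only that a single function can return the grand total by default and
reduce over one axis when asked.

```python
import numpy as np

def total(a, axis=None):
    """Hypothetical wrapper: sum everything unless an axis is given."""
    return np.sum(a, axis=axis)

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

whole = total(a)                 # sum over the whole array
per_column = total(a, axis=0)    # reduce along the first axis
per_row = total(a, axis=1)       # reduce along the second axis
```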

> That being said, it should be easy to do various reductions over all
> axes, which I think .flat() does nicely. I thought .flat() never made
> a copy: am I wrong?

Unfortunately, flattening an array is not always possible without
copying, due to the fact that arrays may not be contiguous in memory.
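
A small sketch of the effect, assuming NumPy's ravel and shares_memory
rather than numarray's own flat (whose behavior may differ): a
contiguous array can be flattened as a view, while a transposed one
has to be copied to get its elements into a single run of memory.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T                      # transposed view: same memory, other strides

flat_a = a.ravel()           # contiguous input, so a view is possible
flat_t = t.ravel()           # non-contiguous input, so the data is copied

a_shares = np.shares_memory(a, flat_a)
t_shares = np.shares_memory(t, flat_t)
```

Here a_shares comes out true and t_shares false, even though both
calls look identical from the caller's side.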

> Tim Hochberg wrote:
>> I'm not sure how feasible it is, but I'd much rather an efficient,
>> non-copying, 1-D view of a noncontiguous array (from an enhanced
>> version of flat or ravel or whatever) than a bunch of extra methods.
>> The former allows all of the standard methods to just work
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min,
>> etc]. Making special whole array methods for everything just leads to
>> method explosion.
>
> Hear! Hear! I thought that was exactly what .flat() was for. Shows
> what I know!

I think, however, that it is not feasible to do that efficiently. It
seems to me that a set of functions that work on the whole array is
necessary to do things like sum, minimum and so on. I would also prefer
that they are not methods. Introducing a whole array of sum_all()-like
functions is also not great.

Cheers, Peter




