[Numpy-discussion] Counting array elements
Peter Verveer
verveer at embl-heidelberg.de
Mon Oct 25 10:42:03 EDT 2004
> Stephen Walton wrote:
>>> - I'd like to write C/C++ code that would work on multiple array
>>> types.
>> I can't help much here, other than to say that C and C++ are pretty
>> low-level languages, not well suited for this level of abstraction.
>
> Well, this is certainly true for C, but not so much for C++. I'm no
> expert, but C++ templates could be very handy here. When the numarray
> project was just getting started, there was some discussion about
> using a template-based array package as the base, perhaps Blitz++. I
> still think this was a great idea, but I think the biggest issue at the
> time was that templates were still not consistently well supported by
> the wide variety of compilers that numarray should work with.
> Personally I think that anything supported by gcc should be fine, as
> anyone can use gcc on virtually any platform, if they want.
I think having the option of using C++ would be cool. But as soon as we
would 'require' it, I would not develop for numarray anymore. C++ is a
big pain in my opinion, although I do agree that a well written
templating system like Blitz++ is nice if you actually use C++.
> Anyway, it's too late to re-write numarray, but maybe a numarray <-->
> blitz++ conversion package would make it easy to write numarray
> extensions with blitz++. Perhaps even integrate it with Boost.Python.
> Another option would be to write a template-based wrapper around the
> existing Numarray objects.
Yes, it would be nice to have the option. There is no reason why there
could not be a C++ API which would include the use of templates layered
on top of the current C API for those people that would like to use it.
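
To make the layering idea concrete, here is a minimal sketch of a templated C++ front end over a C-style array descriptor. The CArray struct, the ElemType tag, and both function names are hypothetical stand-ins invented for this illustration; they are not numarray's actual C API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C-level descriptor (NOT numarray's real struct).
enum ElemType { ELEM_INT32, ELEM_FLOAT64 };

struct CArray {
    void*       data;  // untyped buffer, as a C API would expose it
    std::size_t size;  // number of elements
    ElemType    type;  // run-time element type tag
};

// Written once; the compiler instantiates it for each element type.
template <typename T>
double sum_typed(const CArray& a) {
    const T* p = static_cast<const T*>(a.data);
    double total = 0.0;
    for (std::size_t i = 0; i < a.size; ++i) {
        total += static_cast<double>(p[i]);
    }
    return total;
}

// Run-time dispatch from the type tag to the matching instantiation.
inline double sum_dispatch(const CArray& a) {
    switch (a.type) {
        case ELEM_INT32:   return sum_typed<std::int32_t>(a);
        case ELEM_FLOAT64: return sum_typed<double>(a);
    }
    assert(false && "unknown element type");
    return 0.0;
}
```

The C layer stays untouched in this arrangement: only the thin dispatch function needs to know every type tag, while the loop itself is written a single time.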
> By the way, my other issue with extensions is the difficulty of
> writing extensions that support discontinuous arrays, in addition to
> multiple data types. It seems someone smarter than me could use C++
> classes to solve this one as well.
I had to deal with that problem too in nd_image. It is doable, albeit
ugly if you depend on plain C. Probably C++ could do it differently and
more nicely; Blitz++ possibly does. Again, not for me.
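
As an illustration of how a C++ class might hide both the element type and the memory layout, here is a minimal sketch of a strided 2-D view. View2D and sum_all are hypothetical names made up for this example (this is neither nd_image nor Blitz++ code), and strides are assumed to be counted in elements rather than bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 2-D view over someone else's buffer; strides are in elements.
template <typename T>
struct View2D {
    const T*       data;
    std::size_t    rows, cols;
    std::ptrdiff_t row_stride, col_stride;

    // Element access goes through the strides, so the caller never needs
    // to know whether the underlying memory is contiguous.
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        std::ptrdiff_t off = static_cast<std::ptrdiff_t>(i) * row_stride
                           + static_cast<std::ptrdiff_t>(j) * col_stride;
        return data[off];
    }
};

// One algorithm for every element type and every layout.
template <typename T>
T sum_all(const View2D<T>& v) {
    T total = T();
    for (std::size_t i = 0; i < v.rows; ++i) {
        for (std::size_t j = 0; j < v.cols; ++j) {
            total += v(i, j);
        }
    }
    return total;
}
```

For example, a view with rows=3, cols=2, row_stride=4, col_stride=2 over a 3x4 row-major buffer selects every second column without copying anything.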
> Peter Verveer wrote:
>
>> But I do agree that it is not a good idea to introduce another set of
>> names. In my opinion functions that calculate a statistic like sum
>> should return the total in the first place, rather than over a single
>> axis.
>
> Absolutely not! I'm far more likely to want it over a single axis,
> it's the core of "vectorizing" your code. If the data all mean the
> same thing, why aren't you storing it in a 1-d array?
I agree that it is important, I am just saying that both are very
common operations. Why not support operations over an axis through an
optional argument? You will often have to specify which axis you want
anyway.
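
A minimal sketch of that interface, assuming a row-major 2-D array stored in a std::vector: the total is the default, and an axis can be requested explicitly. The function name sum_over and the ALL_AXES sentinel are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int ALL_AXES = -1;  // hypothetical sentinel: reduce over everything

// Sum a rows x cols row-major matrix.
//   axis == ALL_AXES : one element, the grand total (the default)
//   axis == 0        : cols elements, one sum per column
//   axis == 1        : rows elements, one sum per row
inline std::vector<double> sum_over(const std::vector<double>& a,
                                    std::size_t rows, std::size_t cols,
                                    int axis = ALL_AXES) {
    assert(a.size() == rows * cols);
    assert(axis == ALL_AXES || axis == 0 || axis == 1);
    std::size_t n = (axis == ALL_AXES) ? 1 : (axis == 0 ? cols : rows);
    std::vector<double> out(n, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Pick the output slot this element contributes to.
            std::size_t k = (axis == ALL_AXES) ? 0 : (axis == 0 ? j : i);
            out[k] += a[i * cols + j];
        }
    }
    return out;
}
```

With this shape, sum_over(a, rows, cols) gives the total and sum_over(a, rows, cols, 0) gives per-column sums, so no second family of function names is needed.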
> That being said, it should be easy to do various reductions over all
> axes, which I think .flat() does nicely. I thought .flat() never made
> a copy: am I wrong?
Unfortunately, flattening an array is not always possible without
copying, due to the fact that arrays may not be contiguous in memory.
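
The contiguity issue can be sketched as follows, assuming a 2-D view described by element strides. Strided2D, is_contiguous, and flatten_copy are hypothetical names for this illustration and do not correspond to numarray functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D view over a row-major buffer; strides are in elements.
struct Strided2D {
    const double* data;
    std::size_t   rows, cols;
    std::size_t   row_stride, col_stride;
};

// True only when the elements sit next to each other in row-major order,
// so that data[0 .. rows*cols) already IS the flattened array.
inline bool is_contiguous(const Strided2D& v) {
    return v.col_stride == 1 && v.row_stride == v.cols;
}

// Gathers the elements into a new contiguous buffer. This always copies;
// for a non-contiguous view there is no plain pointer that could serve
// as the flat array instead.
inline std::vector<double> flatten_copy(const Strided2D& v) {
    std::vector<double> out;
    out.reserve(v.rows * v.cols);
    for (std::size_t i = 0; i < v.rows; ++i) {
        for (std::size_t j = 0; j < v.cols; ++j) {
            out.push_back(v.data[i * v.row_stride + j * v.col_stride]);
        }
    }
    return out;
}
```

A view of every second column of a 3x4 buffer (row_stride=4, col_stride=2) fails the contiguity test, which is the case where a flat result has to be a copy.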
> Tim Hochberg wrote:
>> I'm not sure how feasible it is, but I'd much rather an efficient,
>> non-copying, 1-D view of a noncontiguous array (from an enhanced
>> version of flat or ravel or whatever) than a bunch of extra methods.
>> The former allows all of the standard methods to just work
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min,
>> etc]. Making special whole array methods for everything just leads to
>> method explosion.
>
> Hear! Hear! I thought that was exactly what .flat() was for. Shows
> what I know!
I think, however, that it is not feasible to do that efficiently. It
seems to me that a set of functions is necessary to do things like sum,
minimum and so on, that work on the whole array. I would also prefer
that they are not methods. Introducing a whole array of sum_all()-like
functions is also not great.
Cheers, Peter