Re: Re-implementation of Python Numerical arrays (Numeric) available for download

Perry Greenfield wrote:
This sounds great! The code-generating machinery sounds very promising, and examples are, of course, key. I found digging through the NumPy source to figure out how to do things very treacherous. Making Ufuncs easy to write will encourage a lot more C Ufuncs to be written, which should help performance.
The wheels I'm talking about are multi-dimensional array objects...
I know; I just think using an existing set of C++ classes for multidimensional arrays of multiple types would make sense, although I imagine it is too late now!
If the issue is why we are redoing Numeric:
Actually, I think I had a pretty good idea why you were working on this.
I'm particularly excited about 1) and 4).
I used poor wording. When I wrote "datatypes", I meant data types in a much higher-order sense; perhaps structures or classes would be a better term. What I mean is that it should be easy to use and manipulate the same multidimensional arrays from both Python and C/C++. In the current Numeric, most folks generate a contiguous array, and then just use the array->data pointer to get what is essentially a C array. That's fine if you are using it in a traditional C way, with fixed dimensions, one datatype, etc. What I'm imagining is having an object in C or C++ that could be easily used as a multidimensional array. I'm thinking C++ would probably be necessary, and probably templates as well, which is why blitz++ looked promising. Of course, blitz++ only compiles with a few up-to-date compilers, so you'd never get it into the standard library that way! This could also lead the way to being able to compile NumPy code....<end fantasy>
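To make that concrete, here is a minimal sketch of the current pattern (spelled with today's NumPy ascontiguousarray purely for illustration; the C fragment in the comments is schematic):

    import numpy as np

    a = np.arange(12.0).reshape(3, 4)   # a 2-D array on the Python side

    # Guarantee one flat, contiguous buffer, so the raw data pointer
    # (Numeric's array->data) can be treated as a plain C array:
    b = np.ascontiguousarray(a)

    # On the C side the extension then does, schematically,
    #     double *data = (double *)array->data;
    # and has to carry shape and strides around by hand -- a flat,
    # single-typed C array, not a real multidimensional object.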
I think it is pretty easy to install, since it uses distutils.
I agree, but from the newsgroup, it is clear that a lot of folks are very reluctant to use something that is not part of the standard library.
Being as fast as the current Numeric would be "good enough" for me. It would be a shame to go backwards in performance!
(IDL does much better than that for example).
My personal benchmark is MATLAB, which I imagine is similar to IDL in performance.
Well, sure, I'm not expecting that.
100, maybe, but that will be very hard. 1000 should be possible with some work.
I suppose MATLAB has it easier, as all arrays are doubles and (until recently, anyway) all variables were arrays, and all arrays were 2-d. NumPy is a lot more flexible than that. Is it the type and size checking that takes the time?
You are probably right about that.
I do that when possible, but it's not always possible.
One of the things I work with a lot is coordinates of points and polygons. Sets of points I can handle easily as an NX2 array, but polygons don't work so well, as each polygon has a different number of points, so I use a list of arrays, which I have to loop over. Each polygon can have from about 10 to thousands of points (mostly 10-20, however). One way I have dealt with this is to store a polygon set as one large array of all the points, plus another array with the indices of the start and end of each polygon (see the sketch below). That way I can transform the coordinates of all the polygons in one operation. It works OK, but sometimes it is more useful to have them in a sequence.
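Here is a minimal sketch of that scheme (again in today's NumPy spelling; the coordinates and the scale factor are made up for illustration):

    import numpy as np

    # All the polygons' vertices concatenated into one NX2 array:
    points = np.array([[0., 0.], [1., 0.], [1., 1.],             # polygon 0
                       [2., 0.], [3., 0.], [3., 2.], [2., 2.]])  # polygon 1

    # Where each polygon starts; polygon i is points[starts[i]:starts[i+1]]:
    starts = [0, 3, 7]

    # One vectorized operation transforms every polygon at once:
    points_m = points * 1852.0   # made-up scale, e.g. nautical miles -> meters

    # When a sequence is more useful, slice the flat array back apart:
    polygons = [points_m[starts[i]:starts[i + 1]]
                for i in range(len(starts) - 1)]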
I know large datasets were one of your driving factors, but I really don't want to make performance on smaller datasets secondary. I hope I'll get a chance to play with it soon....

-Chris

--
Christopher Barker, Ph.D.
ChrisHBarker@home.net
http://members.home.net/barkerlohmann

Oil Spill Modeling
Water Resources Engineering
Coastal and Fluvial Hydrodynamics