oneoffset arrays
Does anyone have a simple method of using oneoffset arrays? Specifically, can I define an array "a" so that a[1] refers to the first element? Without oneoffset indexing, it seems to me that Python is minimally useful for numerical computations. Many, perhaps the majority, of numerical algorithms are oneindexed. Matlab for example is onebased for this reason. In fact it seems strange to me that a "highlevel" language like Python should use zerooffset lists.
On Tue, 28 Aug 2001, Eric Nodwell wrote:
Does anyone have a simple method of using oneoffset arrays? Specifically, can I define an array "a" so that a[1] refers to the first element?
Inherit from the UserArray and redefine slicing to your hearts content.
Without oneoffset indexing, it seems to me that Python is minimally useful for numerical computations. Many, perhaps the majority, of numerical algorithms are oneindexed. Matlab for example is onebased for this reason. In fact it seems strange to me that a "highlevel" language like Python should use zerooffset lists.
Wow, that is a pretty condescendingsounding statement, though I'm sure you didn't mean it that way. Python is indeed being used for quite serious numerical computations. I have been using Python for quite some time for Numerical work and find it's zerobased indexing combined with the leavelastoneout slicing notation to be much more convenient. It is different from MATLAB but that hardly makes it less useful as I suspect you'd agree after trying it out for awhile. What is your application domain that so requires 1based indexing. Travis E. Oliphant, Ph.D. Assistant Professor Brigham Young University Provo, UT 84602 (801) 3783108 oliphant.travis@ieee.org
Hi all, I wrote last week looking for hints for writing a new clip function, that would be faster than the existing one. I got some suggestions, and have written the function. It works, but I'd love to get a little code review from the more experienced folks here. I have a few specific questions: A) Are there any tricks to making a function work with multiple types? I currently have it working with only double arrays, but it would be great for it ot be universal (and it could then be added to the main NumPy distribution, I suppose) I seem tohave to typecast the data to do the comparison I want, so I can't figure out how to make it general. For example, I have this line: if (*(double *)(Array>data + i*elementsize) > *(double *)(Max>data + i*elementsize)) { *(double *)(Array>data + i*elementsize) = *(double *)(Max>data + i*elementsize) ; } How could I do that without the explicit typecasts? I'm pretty sure I could make the rest of the routine general. B) I was trying to figure out how to loop through all the elements of a noncontiguous array. I got two hints: "Rob W. W. Hooft" wrote:
Never done this in C for numpy yet, but I normally program this along the following lines:
n=[2,3,2,3] x=[0,0,0,0] while 1: print x i=len(n)1 while i>=0 and x[i]+1==n[i]: x[i]=0 i=i1 if i<0: break x[i]=x[i]+1
It is always going to be slower than the straight loop for contiguous arrays.
This is what I did, and it is a LOT slower (about a factor of ten). See the enclosed C code. There has got to be a way to optimize this!
So if you want to do it properly, you'd have to check whether the array is contiguous, and if it is, use the fast way.
Since I had already written code for the contiguous case, it was easy to special case it. Pete Shinners wrote:
it's easiest to use recursion than nested loops. then you can support any number of dimensions. that should point you in the right direction :]
When I read this, I thought: "well, duh!", but then I couldn't figure out how to do it. Would there be any advantage over the above loops anyway? Is there either a more elgant or faster way to loop through all the elements of these three arrays? C) Where is the DL_EXPORT macro defined? I havn't been able to find it with grep yet. Does it work on all NumPy supported platforms? D) Is there an exising function or Macro for checking if two arrays are the same shape? I wrote one, but I imagine it exists. F) I needed to work with input that could be any Python number, or an Array of the right size. What I did was call: PyArray_FromObject on the min and max inputs, so that I would get a PyArray, then I could check if it was a ranko array, or the right size. I havn't been able to decide if this is a nifty trcik, or a kludge. Any other suggestions? E) Have I got reference counting right? I tried to Py_DECREF all my temp arrays, but I'm not sure I did it right. Sorry to enclose the code if you have no interest in it, but it's not really that big. Chris  Christopher Barker, Ph.D. ChrisHBarker@home.net    http://members.home.net/barkerlohmann @@ @@ @@ @@@ @@@ @@@ Oil Spill Modeling  @  @  @ Water Resources Engineering    Coastal and Fluvial Hydrodynamics   #include "Python.h" #include "arrayobject.h" // These are little macros I use to access array values for specific rank arrays #define ARRAYVAL0(aType,a) ( *(aType *)(a>data)) #define ARRAYVAL1(aType,a,i) ( *(aType *)(a>data + (i)*a>strides[0])) #define ARRAYVAL2(aType,a,i,j) ( *(aType *)(a>data + (i)*a>strides[0] + (j)*a>strides[1])) #define ARRAYVAL3(aType,a,i,j,k) ( *(aType *)(a>data + (i)*a>strides[0] + (j)*a>strides[1] + (k)*a>strides[2])) int AreSameShape(PyArrayObject *Array1, PyArrayObject *Array2){ int i; if (Array1>nd != Array2>nd){ return 0; } for (i = 0; i < Array1>nd; i++){ if (Array1>dimensions[i] != Array2>dimensions[i]){ return 0; } } return 1; } int ComputeOffset(PyArrayObject *Array, int *indexes){ int i, offset = 0; for (i = 0; i < Array>nd; i++) { offset += indexes[i]*Array>strides[i]; } return offset; } void ReplaceValue(PyArrayObject *Array, PyArrayObject *Min, PyArrayObject *Max, int *indexes){ // This function does the actual replacement of the values. // it allows Min and Max to be one element, or the same shape as Array // the offsets are computed separately, so as long as they are the same shape // it should work even if some are contiguous, and some not, for example double min, max; int offset; offset = ComputeOffset(Array,indexes); if (PyArray_Size((PyObject *)Min) == 1){ min = ARRAYVAL0(double,Min); } else { min = *(double *)(Min>data + ComputeOffset(Min,indexes)); } if (PyArray_Size((PyObject *)Max) == 1){ max = ARRAYVAL0(double,Max); } else { max = *(double *)(Max>data + ComputeOffset(Max,indexes)); } if (*(double *)(Array>data + offset) > max){ *(double *)(Array>data + offset) = max ; } else if (*(double *)(Array>data + offset) < min){ *(double *)(Array>data + offset) = min; } return; } static char doc_fastclip[] = "fastclip(a,min,max): a is clipped so that there are no values \n" "less than min, or greater than max.\n\n" "It only works on PyArrays of type Float.\n" "min and max can be either scalar or the same size as a.\n" "If a, min, and max are all contiguous (or scalar) then it is a LOT faster\n"; static PyObject * NumericExtras_fastclip(PyObject *self, PyObject *args) { int NumArray, elementsize, i; int *indexes; PyObject *InMin; PyObject *InMax; PyArrayObject *Array; PyArrayObject *Min; PyArrayObject *Max; if (!PyArg_ParseTuple(args, "O!OO", &PyArray_Type, &Array, &InMin, &InMax)) { return NULL; } // check types of input if (Array>descr>type_num != PyArray_DOUBLE){ PyErr_SetString (PyExc_ValueError, "a must be a NumPy array of type Float"); return NULL; } // convert min and max to double arrays: // if they don't convert, the input values are not valid // also check if they are the right size Min = (PyArrayObject *) PyArray_FromObject(InMin, PyArray_DOUBLE, 0, 0); if (Min == NULL){ PyErr_SetString (PyExc_ValueError, "min must be an object that can be converted to an array of Floats"); return NULL; } if (!((PyArray_Size((PyObject *)Min) == 1)  AreSameShape(Min, Array))){ PyErr_SetString (PyExc_ValueError, "min must be either a scalar or the same size as a"); Py_DECREF(Min); return NULL; } Max = (PyArrayObject *) PyArray_FromObject(InMax, PyArray_DOUBLE, 0, 0); if (Max == NULL){ PyErr_SetString (PyExc_ValueError, "max must be an object that can be converted to an array of Floats"); Py_DECREF(Min); return NULL; } if (!((PyArray_Size((PyObject *)Max) == 1)  AreSameShape(Max, Array))){ PyErr_SetString (PyExc_ValueError, "max must be either a scalar or the same size as a"); Py_DECREF(Min); Py_DECREF(Max); return NULL; } // After all that input checking, switch between the contiguous and noncontiguous cases. if (PyArray_ISCONTIGUOUS(Array) && PyArray_ISCONTIGUOUS(Min) && PyArray_ISCONTIGUOUS(Max)){ //we have the contiguous case // This seems pretty kludgy to have the four cases, but it was the first // thing that came to mind that I was sure would work, and would accomidate // either array or scalar arguments for min and max. // it's also faster than checking if they are scalar inside the loop NumArray = PyArray_Size((PyObject *)Array); elementsize = sizeof(double); if (PyArray_Size((PyObject *)Min) == 1 && PyArray_Size((PyObject *)Max) == 1){ // both limits are scalar for (i = 0; i < NumArray; i++) { // loop over each element if (*(double *)(Array>data + i*elementsize) > ARRAYVAL0(double,Max)){ *(double *)(Array>data + i*elementsize) = ARRAYVAL0(double,Max); } else if (*(double *)(Array>data + i*elementsize) < ARRAYVAL0(double,Min)){ *(double *)(Array>data + i*elementsize) = ARRAYVAL0(double,Min); } } } else if (PyArray_Size((PyObject *)Min) == 1 && PyArray_Size((PyObject *)Max) == NumArray){ // Min is scalar for (i = 0; i < NumArray; i++) { // loop over each element if (*(double *)(Array>data + i*elementsize) > *(double *)(Max>data + i*elementsize)){ *(double *)(Array>data + i*elementsize) = *(double *)(Max>data + i*elementsize) ; } else if (*(double *)(Array>data + i*elementsize) < ARRAYVAL0(double,Min)){ *(double *)(Array>data + i*elementsize) = ARRAYVAL0(double,Min); } } } else if (PyArray_Size((PyObject *)Max) == 1 && PyArray_Size((PyObject *)Min) == NumArray){ // Max is scalar for (i = 0; i < NumArray; i++) { // loop over each element if (*(double *)(Array>data + i*elementsize) > ARRAYVAL0(double,Max)){ *(double *)(Array>data + i*elementsize) = ARRAYVAL0(double,Max); } else if (*(double *)(Array>data + i*elementsize) < *(double *)(Min>data + i*elementsize)){ *(double *)(Array>data + i*elementsize) = *(double *)(Min>data + i*elementsize) ; } } } else if (PyArray_Size((PyObject *)Min) == NumArray && PyArray_Size((PyObject *)Max) == NumArray){ // Neither is scalar for (i = 0; i < NumArray; i++) { // loop over each element if (*(double *)(Array>data + i*elementsize) > *(double *)(Max>data + i*elementsize)){ *(double *)(Array>data + i*elementsize) = *(double *)(Max>data + i*elementsize) ; } else if (*(double *)(Array>data + i*elementsize) < *(double *)(Min>data + i*elementsize)){ *(double *)(Array>data + i*elementsize) = *(double *)(Min>data + i*elementsize) ; } } } else { // The size of Min and/or Max don't match Array: we should never get here, do to previous checks. PyErr_SetString (PyExc_ValueError, "min and max must be either scalar or the same shape as a"); Py_DECREF(Min); Py_DECREF(Max); return NULL; } } else { // this now the noncontiguous case: it is much slower, but presumably could be optimised : suggestions gratefully excepted! indexes = calloc(Array>nd,sizeof(int)); while (1){ ReplaceValue(Array, Min, Max, indexes); i = (Array>nd)  1; while ((i >= 0) && (indexes[i]+1 == Array>dimensions[i])){ indexes[i] = 0; i = 1; } if (i<0) { break; } indexes[i] = indexes[i]+1; } free(indexes); } Py_DECREF(Min); Py_DECREF(Max); return Py_None; } static PyMethodDef NumericExtrasMethods[] = { {"fastclip", NumericExtras_fastclip, METH_VARARGS, doc_fastclip}, {NULL, NULL} /* Sentinel */ }; DL_EXPORT (void) initNumericExtras(){ (void) Py_InitModule("NumericExtras", NumericExtrasMethods); import_array() }
On Wed, 29 Aug 2001, Chris Barker wrote:
A) Are there any tricks to making a function work with multiple types? I currently have it working with only double arrays, but it would be great for it ot be universal (and it could then be added to the main NumPy distribution, I suppose) I seem tohave to typecast the data to do the comparison I want, so I can't figure out how to make it general. For example, I have this line:
if (*(double *)(Array>data + i*elementsize) > *(double *)(Max>data + i*elementsize)) { *(double *)(Array>data + i*elementsize) = *(double *)(Max>data + i*elementsize) ; }
How could I do that without the explicit typecasts? I'm pretty sure I could make the rest of the routine general. Some time ago, I used a union like: typedef union { short *sdata; int *idata; long *ldata; float *fdata; double *ddata; /* And so on */ } MyNumPyUnion;
Then, you can assign (and CAST) the data field of the NumPy array according to the typecode of the array, like in: MyNumPyUnion theunion; theunion.fdata=(float*) thenumpyarray>data; /* If it is a Float32 */ Finally, you use sdata, idata, ldata, fdsata and so on to iterate dereferencing accordingly, depending on the typecode of the array. It is a nightmare to write this code, but I can not imagine of any other approach. May be other developers can help you with a better approach. Hope this helps. Jon Saenz.  Tfno: +34 946012470 Depto. Fisica Aplicada II  Fax: +34 944648500 Facultad de Ciencias. \\ Universidad del Pais Vasco \\ Apdo. 644 \\ 48080  Bilbao \\ SPAIN
participants (4)

Chris Barker

Eric Nodwell

Jon Saenz

Travis Oliphant