On Wed, 2004-06-30 at 15:57, Tim Hochberg wrote:
I spent some time seeing what I could do in the way of speeding up wxPoint_LIST_helper by tweaking the numarray code. My first suspect was _universalIndexing by way of _ndarray_item. However, due to some new-style machinations, _ndarray_item was never getting called. Instead, _ndarray_subscript was being called. So, I added a special case to _ndarray_subscript. This sped things up by 50% or so (I don't recall exactly). The code for that is at the end of this message; it's not guaranteed to be 100% correct; it's all experimental.
After futzing around some more I figured out a way to trick Python into using _ndarray_item. I added "type->tp_as_sequence->sq_item = _ndarray_item;" to _ndarray_new.
I'm puzzled why you had to do this. You're using Python-2.3.x, right? There's conditionally compiled code which should be doing this statically. (At least I thought so.)
I then optimized _ndarray_item (code at end). This halved the execution time of my arbitrary benchmark. This trick may have horrible, unforeseen consequences so use at your own risk.
Right now the sq_item hack strikes me as somewhere between completely unnecessary and too scary for me! Maybe if python-dev blessed it. This optimization looks good to me.
Finally I commented out the __del__ method in numarraycore. This resulted in an additional speedup of 64% for a total speedup of 240%. Still not close to 10x, but a large improvement. This is obviously not viable for real use, but it's enough of a speedup that I'll try to see if there's any way to move the shadow stuff back to tp_dealloc.
FYI, the issue with tp_dealloc may have to do with which mode Python is compiled in, --with-pydebug, or not. One approach which seems like it ought to work (just thought of this!) is to add an extra reference in C to the NumArray instance __dict__ (from NumArray.__init__ and stashed via a new attribute in the PyArrayObject struct) and then DECREF it as the last part of the tp_dealloc.
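The extra-reference idea can be pictured with a toy reference-counting model. This is only a sketch in plain C, not the Python C API and not numarray code: ToyDict, ToyArray, dict_extra and the toy_* functions are all made-up names, and whether the real teardown order behaves this way is exactly what would need testing.

```c
/* Toy stand-ins for illustration only; none of this is the Python C API. */
typedef struct {
    int refcnt;   /* number of owners */
    int freed;    /* set to 1 when refcnt reaches zero (stands in for free) */
} ToyDict;

typedef struct {
    ToyDict *dict;          /* ordinary reference, like the instance __dict__ */
    ToyDict *dict_extra;    /* hypothetical extra reference stashed at init */
    int cleanup_saw_dict;   /* 1 if cleanup ran while the dict was still alive */
} ToyArray;

static void toy_decref(ToyDict *d)
{
    if (--d->refcnt == 0)
        d->freed = 1;
}

/* Assumed analogue of NumArray.__init__: take one extra reference. */
void toy_array_init(ToyArray *a, ToyDict *d)
{
    d->refcnt = 1;
    d->freed = 0;
    a->dict = d;
    a->dict_extra = d;
    d->refcnt++;
    a->cleanup_saw_dict = 0;
}

/* Assumed analogue of tp_dealloc. */
void toy_array_dealloc(ToyArray *a)
{
    toy_decref(a->dict);                         /* normal teardown drops its reference */
    a->cleanup_saw_dict = !a->dict_extra->freed; /* shadow cleanup can still use the dict */
    toy_decref(a->dict_extra);                   /* last step: release the extra reference */
}
```

In this model the dict outlives the ordinary teardown because of the second reference, and it is released only by the final decref.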
In summary:
Version              Time    Rel Speedup   Abs Speedup
Stock                0.398   ----          ----
_ndarray_item mod    0.192   107%          107%
del __del__          0.117   64%           240%
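The percentages in the summary appear to follow from the timings as (old time / new time - 1) * 100. A small sketch of that arithmetic, where speedup_percent is a made-up helper name and not anything from numarray:

```c
/* Percentage speedup when a run drops from t_before to t_after
 * (same time unit for both). Hypothetical helper for illustration. */
double speedup_percent(double t_before, double t_after)
{
    return (t_before / t_after - 1.0) * 100.0;
}
```

With the timings above, speedup_percent(0.398, 0.192) is about 107, speedup_percent(0.192, 0.117) is about 64, and speedup_percent(0.398, 0.117) is about 240, which matches the relative and absolute columns.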
There were a couple of other things I tried that resulted in additional small speedups, but the tactics I used were too horrible to reproduce here. The main one of interest is that all of the calls to NA_updateDataPtr seem to burn some time. However, I don't have any idea what one could do about that.
Francesc Alted had the same comment about NA_updateDataPtr a while ago. I tried to optimize it then but didn't get anywhere. NA_updateDataPtr() should be called at most once per extension function (more is unnecessary but not harmful) but needs to be called at least once as a consequence of the way the buffer protocol doesn't give locked pointers.
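The "once per extension function" rule can be sketched with a toy model: the cached data pointer is refreshed a single time on entry, then reused for the whole loop. ToyBuffer, ToyArray, toy_update_data_ptr and toy_sum are made-up names for illustration; the real NA_updateDataPtr and PyArrayObject are more involved.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; not numarray types or functions. */
typedef struct {
    double *ptr;   /* current location of the storage; the owner may move it */
    size_t  n;     /* number of elements */
} ToyBuffer;

typedef struct {
    ToyBuffer *buffer;    /* object that owns the storage */
    double    *data;      /* cached pointer, possibly stale between calls */
    int        refreshes; /* counts refreshes, for demonstration */
} ToyArray;

/* Re-read the storage address, since the buffer does not hand out a locked pointer. */
void toy_update_data_ptr(ToyArray *a)
{
    a->data = a->buffer->ptr;
    a->refreshes++;
}

/* An "extension function": refresh once at entry, then use the cached pointer. */
double toy_sum(ToyArray *a)
{
    size_t i;
    double total = 0.0;

    toy_update_data_ptr(a);
    for (i = 0; i < a->buffer->n; i++)
        total += a->data[i];
    return total;
}
```

If the storage moves between two calls to toy_sum, the second call still reads the right data because it refreshes on entry; refreshing again inside the loop would add cost without changing the result.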
That's all for now.
-tim
Well, be picking out your beer.

Todd
static PyObject* _ndarray_subscript(PyArrayObject* self, PyObject* key)
{
    PyObject *result;
#ifdef TAH
    if (PyInt_CheckExact(key)) {
        long ikey = PyInt_AsLong(key);
        long offset;
        if (NA_getByteOffset(self, 1, &ikey, &offset) < 0)
            return NULL;
        if (!NA_updateDataPtr(self))
            return NULL;
        return _simpleIndexingCore(self, offset, 1, Py_None);
    }
#endif
#if _PYTHON_CALLBACKS
    result = PyObject_CallMethod(
        (PyObject *) self, "_universalIndexing", "(OO)", key, Py_None);
#else
    result = _universalIndexing(self, key, Py_None);
#endif
    return result;
}

static PyObject *
_ndarray_item(PyArrayObject *self, int i)
{
#ifdef TAH
    long offset;
    if (NA_getByteOffset(self, 1, &i, &offset) < 0)
        return NULL;
    if (!NA_updateDataPtr(self))
        return NULL;
    return _simpleIndexingCore(self, offset, 1, Py_None);
#else
    PyObject *result;
    PyObject *key = PyInt_FromLong(i);
    if (!key)
        return NULL;
    result = _universalIndexing(self, key, Py_None);
    Py_DECREF(key);
    return result;
#endif
}