[Numpy-discussion] New functions.

Wed Jun 1 11:17:55 EDT 2011

On Tue, May 31, 2011 at 8:41 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> On Tue, May 31, 2011 at 8:50 PM, Bruce Southey <bsouthey at gmail.com> wrote:

>> How about including all or some of Keith's Bottleneck package?
>> He has tried to include some of the discussed functions and tried to
>> make them very fast.
>
> I don't think they are sufficiently general as they are limited to 2
> dimensions. However, I think the moving filters should go into scipy, either
> in ndimage or maybe signal. Some of the others we can still speed up
> significantly, for instance nanmedian, by using the new functionality in
> numpy, i.e., numpy sort has worked with nans for a while now. It looks like
> call overhead dominates the nanmax times for small arrays and this might
> improve if the ufunc machinery is cleaned up a bit more. I don't know how
> far Mark got with that.
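
To illustrate the point about sort and nans: numpy's sort places NaNs
at the end of the result, so a nanmedian can be taken from the leading
non-NaN part of the sorted array. A minimal 1d sketch, assuming float
input (nanmedian_1d is a hypothetical name, not an existing numpy,
scipy, or Bottleneck function):

```python
import numpy as np

def nanmedian_1d(a):
    """Median of a 1d array, ignoring NaNs (illustrative sketch only)."""
    a = np.asarray(a, dtype=np.float64)
    s = np.sort(a)                         # NaNs are sorted to the end
    n = s.size - int(np.isnan(s).sum())    # count of non-NaN values
    if n == 0:
        return np.nan                      # all-NaN (or empty) input
    mid = n // 2
    if n % 2:
        return s[mid]                      # odd count: middle element
    return 0.5 * (s[mid - 1] + s[mid])     # even count: mean of middle two
```

This does a full sort, so it only shows the idea; it says nothing about
how fast a tuned implementation could be.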

Currently Bottleneck accelerates 1d, 2d, and 3d input. Anything else
falls back to a slower, non-cython version of the function. The same
goes for dtypes: only int32, int64, float32, and float64 are accelerated.

It should not be difficult to extend to higher nd and more dtypes
since everything is generated from a template. The problem is that there
would be a LOT of cython auto-generated C code since there is a
separate function for each combination of ndim, dtype, and axis.
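
To give a rough sense of how the count grows, here is a small sketch
that expands a template over every ndim, dtype, axis combination. The
template string and the naming scheme are made up for illustration and
are not Bottleneck's actual template:

```python
from itertools import product

NDIMS = (1, 2, 3)
DTYPES = ("int32", "int64", "float32", "float64")

# Hypothetical one-line stand-in for a generated function body.
TEMPLATE = "def {func}_{ndim}d_{dtype}_axis{axis}(a): ..."

def specializations(func, ndims=NDIMS, dtypes=DTYPES):
    """Return one generated signature per (ndim, dtype, axis) combination."""
    return [
        TEMPLATE.format(func=func, ndim=ndim, dtype=dtype, axis=axis)
        for ndim, dtype in product(ndims, dtypes)
        for axis in range(ndim)            # an array with ndim dims has ndim axes
    ]

current = specializations("median")
extended = specializations("median", ndims=(1, 2, 3, 4, 5))
print(len(current), len(extended))
```

Under these assumptions each accelerated function needs 24
specializations today, and going up to 5d would raise that to 60 per
function before adding any new dtypes.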

Each of the ndim, dtype, axis functions currently has its own copy of
the algorithm (such as median). Pulling that out and reusing it should
save a lot of trees by reducing the auto-generated C code size.
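
The refactoring idea can be sketched in plain Python: write the
algorithm once as a 1d kernel and let a thin wrapper handle ndim and
axis. This only shows the structure, not the cython implementation;
median_kernel and median_along_axis are hypothetical names:

```python
import numpy as np

def median_kernel(v):
    """The algorithm itself, written once for a 1d slice."""
    s = np.sort(v)
    n = s.size
    mid = n // 2
    if n % 2:
        return s[mid]
    return 0.5 * (s[mid - 1] + s[mid])

def median_along_axis(a, axis):
    """Thin wrapper: handles ndim and axis, then reuses the shared kernel."""
    return np.apply_along_axis(median_kernel, axis, np.asarray(a))
```

In the generated code the wrappers would still be specialized per ndim,
dtype, and axis, but each would call the shared kernel instead of
carrying its own copy.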

I recently added a partsort and argpartsort.
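
For readers unfamiliar with partial sorting, here is a small sketch of
the idea. It uses np.partition and np.argpartition, which NumPy gained
in a later release, as stand-ins; it does not show Bottleneck's own
function signatures, and it assumes partsort means a partial sort:

```python
import numpy as np

a = np.array([7, 1, 9, 3, 5, 2])
k = 3

# After partitioning at index k-1, the k smallest values occupy the
# first k slots (in no particular order) and a[k-1] is in sorted position.
part = np.partition(a, k - 1)
smallest = part[:k]

# argpartition returns the indices that would produce that arrangement.
idx = np.argpartition(a, k - 1)[:k]
```

A partial sort avoids fully ordering the array, which is enough for
tasks such as a median or the k smallest values.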