On Mon, Mar 9, 2009 at 5:37 PM, Darren Dale <dsdale24@gmail.com> wrote:

On Mon, Mar 9, 2009 at 9:50 AM, Darren Dale <dsdale24@gmail.com> wrote:
I spent some time over the weekend fixing a few bugs in numpy that were exposed when attempting to use ufuncs with ndarray subclasses. It got me thinking that, with relatively little work, numpy's functions could be made to be more general. For example, the numpy.ma module redefines many of the standard ufuncs in order to do some preprocessing before the builtin ufunc is called. Likewise, in the units/quantities package I have been working on, I would like to perform a dimensional analysis to make sure an operation is allowed before I call a ufunc that might change data in place.

Imagine an ndarray subclass with methods like __gfunc_pre__ and __gfunc_post__. __gfunc_pre__ could accept the context that is currently provided to __array_wrap__ (the inputs and the function called), perform whatever preprocessing is desired, and maybe return a dictionary containing metadata. Numpy functions could then be wrapped with a decorator that 1) calls __gfunc_pre__ and obtain any metadata that is returned 2) calls the wrapped functions, and then 3) calls __gfunc_post__, which might be very similar to __array_wrap__ except that it would also accept the metadata created by __gfunc_pre__.

In cases where the routines to be called by __gfunc_pre__ and _post__ depend on what function is called, the the subclass could implement routines and store them in a dictionary-like object that is keyed using the function called. I have been exploring this approach with Quantities and it seems to work well. For example:

    def __gfunc_pre__(self, gfunc, *args):
        try:
            return gfunc_pre_registry[gfunc](*args)
        except KeyError:
            return {}

I think such an approach for generalizing numpy's functions could be implemented without being disruptive to the existing __array_wrap__ framework. The decorator would attempt to identify an input or output array to use to call __gfunc_pre__ and _post__. If it finds them, it uses them. If it doesnt find them, no harm done, the existing __array_wrap__ mechanisms are still in place if the wrapped function is a ufunc.

One other nice feature: the metadata that is returned by __gfunc_pre__ could contain an optional flag that the decorator attempts to pass to the wrapped function so that __gfunc_pre__ and _post are not called for any decorated internal functions. That way the subclass could specify that __gfunc_pre__ and _post should be called only for the outer-most function.

Comments?

I'm attaching a proof of concept script, maybe it will better illustrate what I am talking about.