
On Sat, Oct 18, 2014 at 6:46 PM, Nathaniel Smith <njs@pobox.com> wrote:
One thing we'll have to watch out for is that for reduction operations (which are basically gufuncs with (n)->() signatures), we already allow axis=(0,1) to mean "reshape axes 0 and 1 together into one big axis, and then use that as the gufunc core axis". I don't know if we'll ever want to support this functionality for gufuncs in general, but we shouldn't rule it out with the syntax.
This is a great point. In fact, I think supporting this sort of functionality for gufuncs would be quite valuable, since there are plenty of reduction operations that can't fit into the model provided by ufunc.reduce. An excellent example is np.median, which currently can only act on either one axis or an entire flattened array. If the syntax (m?,n),(n,p?)->(m?,p?) is accepted, then I think the natural extension to reduction operators that can act on one or more axes would be (n+)->() (borrowing regex syntax, where + means "one or more"). Actually, using an axis keyword seems like the only elegant way to disambiguate cases like this.
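To make the "reshape several axes into one big core axis" idea concrete, here is a minimal sketch of how a multi-axis reduction could be emulated today for any function that reduces over a single axis. The helper name reduce_over_axes is hypothetical (it is not a NumPy function), and the sketch assumes func accepts an axis keyword the way np.median does.

```python
import numpy as np

def reduce_over_axes(func, a, axes):
    """Apply a single-axis reduction `func` over several axes at once.

    Hypothetical helper for illustration only: it emulates an (n+)->()
    style reduction by merging the requested axes into one core axis.
    """
    a = np.asarray(a)
    axes = tuple(ax % a.ndim for ax in axes)  # normalize negative axes
    if len(set(axes)) != len(axes):
        raise ValueError("repeated axis in axes")
    keep = [ax for ax in range(a.ndim) if ax not in axes]
    # Move the reduced axes to the end, keeping the loop axes in order.
    moved = a.transpose(keep + list(axes))
    # Merge the trailing (reduced) axes into a single core axis.
    merged = moved.reshape([a.shape[ax] for ax in keep] + [-1])
    return func(merged, axis=-1)

a = np.arange(24).reshape(2, 3, 4)
result = reduce_over_axes(np.median, a, (0, 1))  # shape (4,)
```

The transpose plus reshape generally forces a copy, which is one reason built-in support in the gufunc machinery might be preferable to a wrapper like this.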
One option would be to add a new argument axes=... for gufunc core specification, and say that axis=foo is an alias for axes=[[foo]].
Indeed, this is exactly what I was thinking. The "canonical form" for the axis argument would be doubly nested tuples, but if an integer or unnested tuple is encountered, additional nesting should be added until reaching canonical form, e.g., axis=0 -> axis=(0,) -> axis=((0,),). The only particularly tricky case will be scenarios like my second one, axis=(0, 1) for (n),(m)->() or (n,m)->(). To deal with cases like this, the parsing will need to take the gufunc signature into consideration, and start by asking whether or not the tuple is of the right size to match each function argument separately. To make it clear that this proposal covers all the bases, I would be happy to write some prototype code (and test cases) to demonstrate such a transformation to canonical form, including all these edge cases.

Cheers,
Stephan
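
A minimal sketch of the transformation to canonical form described above, under some loudly stated assumptions: the function name canonicalize_axis and the core_ndims argument (the number of core dimensions of each input, e.g. [1, 1] for (n),(m)->() and [2] for (n,m)->()) are hypothetical and not part of NumPy. The sketch also does not cover the "merge several axes into one core axis" case or the proposed ? and + signature modifiers.

```python
def canonicalize_axis(axis, core_ndims):
    """Return `axis` as a tuple of tuples, one inner tuple per input.

    core_ndims: number of core dimensions of each input argument
    (an assumption of this sketch, standing in for a parsed signature).
    """
    nargs = len(core_ndims)
    if isinstance(axis, int):
        axis = (axis,)  # axis=0 -> axis=(0,)
    axis = tuple(axis)
    if all(isinstance(a, int) for a in axis):
        # Flat tuple: first ask whether it matches one axis per argument.
        if len(axis) == nargs and all(n == 1 for n in core_ndims):
            axis = tuple((a,) for a in axis)
        elif nargs == 1:
            # Otherwise treat it as all core axes of the single argument.
            axis = (axis,)
        else:
            raise ValueError("ambiguous flat axis for this signature")
    else:
        # Already (partly) nested: wrap any bare ints.
        axis = tuple((a,) if isinstance(a, int) else tuple(a) for a in axis)
    # Validate the result against the signature.
    if len(axis) != nargs:
        raise ValueError("expected one axis entry per input argument")
    for entry, n in zip(axis, core_ndims):
        if len(entry) != n:
            raise ValueError("wrong number of core axes for an argument")
    return axis

canonicalize_axis(0, [1])          # (n)->():      ((0,),)
canonicalize_axis((0, 1), [1, 1])  # (n),(m)->():  ((0,), (1,))
canonicalize_axis((0, 1), [2])     # (n,m)->():    ((0, 1),)
```

With this rule, axis=(0, 1) is only ever ambiguous between signatures, never within one, because the signature decides whether the tuple is split across arguments or handed whole to a single argument.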