On Wed, Aug 6, 2014 at 8:32 AM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
On Wed, Aug 6, 2014 at 5:31 AM, Nathaniel Smith <njs@pobox.com> wrote:
I think the other obvious strategy to consider, is defining a 'dot'
gufunc, with semantics identical to @. (This would be useful for
backcompat as well: adding/dropping compatibility with older python
versions would be as simple as mechanically replacing a @ b with
newdot(a, b) or vice-versa.) This would require one new feature in the
gufunc machinery: support for "optional core axes", to get the right
semantics for 1d arrays.

Can you elaborate on what those optional core axes would look like? If I am understanding you correctly, this is what now is solved by having more than one gufunc defined, and choosing which one to use based on the input's shapes in a thin Python wrapper. There are several examples in the linalg module you are certainly well aware of.

Say you could define the matmul signature as "(i)j,j(k)->(ik)", with dimensions in parenthesis being "optional." Say we modified the gufunc machinery to detect which optional core axes are present and which not. It seems to me that you would then still need to write 4 traditional gufuncs (ij,jk->ik, j,jk->k, ij,j->i, j,j->) and dispatch to one of them. I haven't thought it through, but are there really a set of universal dispatch rules that will apply to any optional core axes problem? Would we not be losing flexibility in doing so?

When I looked into gufuncs several months ago, what I missed was a way of defining signatures like n,m->n*(n-1), which would come in very handy if computing all pairwise distances. You can work around this by making the signature n,m->p and always calling the gufunc from a Python wrapper that passes in an out parameter of the right shape. But if someone gets a hold of the gufunc handle and calls it directly without an out parameter, the p defaults to 1 and you are probably in for a big crash. So it would be nice if you could provide a pointer to a function to produce the output shape based on the inputs'.

On my wish list for gufunc signatures there is also frozen dimensions, e.g. a gufunc to compute greater circle distances on a sphere can be defined as m,m->, but m has to be 2, and since you don't typically want to be raising errors in the kernel, a Python wrapper is once more necessary. And again an unwrapped call to the gufunc is potentially catastrophic.

Sorry for hijacking the thread, but I wouldn't mind spending some time working on expanding this functionality to include the optional axes and my wish-list, if the whole thing makes sense.

Should also mention that we don't have the ability to operate on stacked vectors because they can't be identified by dimension info. One workaround is to add dummy dimensions where needed, another is to add two flags, row and col, and set them appropriately. Two flags are needed for backward compatibility, i.e., both false is a traditional array. Note that adding dummy dimensions can lead to '[[...]]' scalars. Working with stacked vectors isn't part of the '@' PEP.

Chuck