[Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs

Allan Haldane allanhaldane at gmail.com
Thu May 31 09:35:09 EDT 2018


On 05/31/2018 12:10 AM, Nathaniel Smith wrote:
> On Wed, May 30, 2018 at 11:14 AM, Marten van Kerkwijk
> <m.h.vankerkwijk at gmail.com> wrote:
>> Hi All,
>>
>> Following on a PR combining the ability to provide fixed and flexible
>> dimensions [1] (useful for, e.g., 3-vector input with a signature like
>> `(3),(3)->(3)`, and for `matmul`, resp.; based on earlier PRs by Jaime
>> [2] and Matt (Picus) [3]), I've now made a PR with a further
>> enhancement, which allows one to indicate that a core dimension can
>> be broadcast [4].
>>
>> A particular use case is `all_equal`, a new function suggested in a
>> stalled PR by Matt (Harrigan) [5], which compares two arrays
>> axis-by-axis, but short-circuits if a non-equality is found (unlike
>> what is the case if one does `(a==b).all(axis)`). One thing that would
>> be obviously useful for a routine like `all_equal` is to be able to
>> provide an array as one argument and a constant as another, i.e., to
>> have the core dimensions broadcast if needed, just like they are in
>> `(a==b).all(axis)`. This is currently not possible: with its signature
>> of `(n),(n)->()`, the two arrays have to have the same trailing size.
>>
>> My PR provides the ability to indicate in the signature that a core
>> dimension can be broadcast, by using a suffix of "|1". Thus, the
>> signature of `all_equal` would become:
>>
>> ```
>> (n|1),(n|1)->()
>> ```
>>
>> Comments most welcome (yes, even on the notation - though I think it
>> is fairly self-explanatory)!
> 
> I'm currently -0.5 on both fixed dimensions and this broadcasting
> dimension idea. My reasoning is:
> 
> - The use cases seem fairly esoteric. For fixed dimensions, I guess
> the motivating example is cross-product (are there any others?). But
> would it be so bad for a cross-product gufunc to raise an error if it
> receives the wrong number of dimensions? For this broadcasting case...
> well, obviously we've survived this long without all_equal :-). And
> there's something funny about all_equal, since it's really smushing
> together two conceptually separate gufuncs for efficiency. Should we
> also have all_less_than, sum_square, ...? If this is a big problem,
> then wouldn't it be better to solve it in a general way, like dask or
> Numba or numexpr do? To be clear, I'm not saying these features are
> necessarily *bad* ideas, in isolation -- just that the benefits aren't
> very convincing, and there are trade-offs, like:

I have wished for a very long time that numpy had these short-circuiting 
gufuncs. I specifically remember fruitlessly searching for how to do 
this as far back as 2007.

While "on average" short-circuiting only gives a speedup of 2x, in many 
situations you can arrange your algorithm so short circuiting will 
happen early, eg usually in the first 10 elements of a 10^6 element 
array, giving enormous speedups.
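
To make the intended semantics concrete, here is a rough pure-Python 
sketch of the core loop of such an all_equal gufunc with the proposed 
(n|1),(n|1)->() signature. This is purely illustrative; the real gufunc 
would be an inner loop written in C:

```
import numpy as np

def all_equal_core(a, b):
    # a, b: 1-D arrays; a length-1 core dimension is broadcast against
    # the other operand, per the proposed (n|1),(n|1)->() signature.
    n = max(len(a), len(b))
    for i in range(n):
        ai = a[0] if len(a) == 1 else a[i]
        bi = b[0] if len(b) == 1 else b[i]
        if ai != bi:
            return False   # short-circuit at the first mismatch
    return True

# Comparing a large array against a constant stops at the first element,
# whereas (a == b).all() always touches all 10^6 elements:
all_equal_core(np.zeros(10**6), np.array([1.0]))   # False after one compare
```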

Also, I do not imagine these as free-floating ufuncs; I think we can 
arrange them logically in a gufunc ecosystem. There would be some 
"core ufuncs", with "associated gufuncs" accessible as attributes. For 
instance, any_less_than would be accessible as less.any.

binary "comparison" ufuncs would have attributes

less.any
less.all
less.first  # returns first matching index
less.count  # counts matches without intermediate bool array
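
For concreteness, these proposed attributes (which do not exist in 
current NumPy) would compute the same results as today's spellings 
below, but without building the intermediate bool array and with 
short-circuiting where possible:

```
import numpy as np

a = np.random.rand(1000, 100)
b = np.random.rand(1000, 100)

(a < b).any(axis=-1)       # what less.any would compute
(a < b).all(axis=-1)       # what less.all would compute
np.argmax(a < b, axis=-1)  # roughly less.first; note argmax returns 0
                           # even when nothing matches, so the real gufunc
                           # would need different "no match" semantics
(a < b).sum(axis=-1)       # what less.count would compute
```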

This builds on the existing attributes; for instance, ufuncs
already have:

add.reduce
add.accumulate
add.reduceat
add.outer
add.at
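
For reference, here is what those existing attributes do for np.add:

```
import numpy as np

a = np.arange(1, 6)              # array([1, 2, 3, 4, 5])
np.add.reduce(a)                 # 15, same as a.sum()
np.add.accumulate(a)             # array([ 1,  3,  6, 10, 15])
np.add.reduceat(a, [0, 2])       # array([ 3, 12]): sums of a[0:2] and a[2:]
np.add.outer([1, 2], [10, 20])   # array([[11, 21], [12, 22]])
np.add.at(a, [0, 0], 1)          # unbuffered in-place add: a[0] += 1, twice
```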

It is unfortunate that all ufuncs currently have these attributes even 
when they are unimplemented/inappropriate (e.g., np.sin.reduce). I would 
like to remove the inappropriate ones, so that each core ufunc only 
exposes the appropriate "associated gufuncs" as attributes.

Incidentally, once we make reduce/accumulate/... into "associated 
gufuncs", I propose completely removing the "method" argument of 
__array_ufunc__, since it would no longer be needed and it adds a lot
of complexity which implementors of __array_ufunc__ are forced to
account for.
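
As an illustration of that complexity, even a minimal __array_ufunc__ 
today has to branch on the method argument ('__call__', 'reduce', 
'reduceat', 'accumulate', 'outer', 'at'), whether or not it supports 
anything beyond plain calls. A sketch (out= handling omitted):

```
import numpy as np

class Wrapped:
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Every implementation must decide what to do with each
        # possible value of `method`, even if it only cares about
        # plain calls.
        if method != '__call__':
            return NotImplemented
        args = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*args, **kwargs))
```

If reduce/accumulate/... were separate "associated gufuncs", the 
method branch above would simply disappear.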

Cheers,
Allan

> 
> - When it comes to the core ufunc machinery, we have a limited
> complexity budget. I'm nervous that if we add too many bells and
> whistles, we'll end up writing ourselves into a corner where we have
> trouble maintaining it, where it becomes difficult to predict how
> different features interact, it becomes increasingly difficult for
> third-parties to handle all the different features in their
> __array_ufunc__ methods...
> 
> - And, we have a lot of other demands on the core ufunc machinery,
> that might be better places to spend our limited complexity budget.
> For example, can we come up with an extension to make np.sort a
> gufunc? That seems like a much higher priority than figuring out how
> to make all_equal a gufunc. What about refactoring the ufunc machinery
> to support user-defined dtypes? That'll need some serious work, and
> again, it's probably higher priority than supporting cross-product or
> all_equal directly (or at least it seems that way to me).
> 
> Maybe there are more compelling use cases that I'm missing, but as it
> is, I feel like trying to add too many features to the current ufunc
> machinery is pretty risky for future maintainability, and we shouldn't
> do it without really solid use cases.
> 
> -n
> 
