[Numpy-discussion] Forcing gufunc to error with size zero input

Sun Sep 29 00:20:03 EDT 2019

On 9/28/19, Eric Wieser <wieser.eric+numpy at gmail.com> wrote:
> Can you just raise an exception in the gufuncs inner loop? Or is there no
> mechanism to do that today?

Maybe?  I don't know what the idiomatic way is to handle errors
detected in an inner loop.  And pushing this particular error
detection into the inner loop doesn't feel right.

>
> I don't think you were proposing that core dimensions should _never_ be
> allowed to be 0,

No, I'm not suggesting that.  There are many cases where a length 0
core dimension is fine.

I'm interested in the case where there is not a meaningful definition
of the operation on the empty set.  The mean is an example.  Currently
`np.mean([])` generates two warnings (one useful, the other cryptic
and apparently incidental), and returns nan.  Returning nan is one way
to handle such a case; another is to raise an error like `np.amax([])`
does.  I'd like to raise an error in the example that I'm working on
('peaktopeak' at https://github.com/WarrenWeckesser/npuff).  The
function is a gufunc, not a reduction of a binary operation, so the
'identity' argument of PyUFunc_FromFuncAndDataAndSignature has no
effect.
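
For reference, the two behaviours can be compared directly.  This is
only a small sketch: the helper names `mean_of_empty` and
`amax_of_empty` are made up for illustration, and the exact warning
messages may differ between NumPy versions, so the code only records
the warning categories.

```python
import warnings

import numpy as np


def mean_of_empty():
    """Return (result, warning categories) for np.mean of an empty input."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of letting filters hide repeats.
        warnings.simplefilter("always")
        result = np.mean([])
    return result, [w.category for w in caught]


def amax_of_empty():
    """Return the ValueError raised by np.amax for an empty input."""
    try:
        np.amax([])
    except ValueError as exc:
        return exc
    return None


result, categories = mean_of_empty()
print(result, [c.__name__ for c in categories])
print(repr(amax_of_empty()))
```

The first call quietly produces a value that has to be checked
afterwards; the second stops the computation at the point of the
empty input, which is the behaviour I want for 'peaktopeak'.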

> but if you were I disagree. I spent a fair amount of work
> enabling that for linalg because it provided some convenient base cases.
>
> We could go down the route of augmenting the gufuncs signature syntax to
> support requiring non-empty dimensions, like we did for optional ones -
> although IMO we should consider switching from a string minilanguage to a
> structured object specification if we plan to go too much further with
> extending it.

After only a quick glance at that code: one option is to add a '+'
after each core dimension name in the signature that must have a
length of at least 1.  So the signature for functions like `mean` (if
you were to reimplement it as a gufunc, and wanted an error instead of
nan), `amax`, `ptp`, etc., would be '(i+)->()'.
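
To make the intended meaning of the '+' concrete, here is a rough
pure-Python sketch of how such a signature could be interpreted.
Nothing here exists in NumPy: the '+' syntax is only the proposal
above, and `parse_plus_signature` and `check_core_dims` are
hypothetical helpers.  The sketch also ignores the existing '?'
modifier for optional dimensions.

```python
import re


def parse_plus_signature(signature):
    """Split a hypothetical '+'-annotated signature.

    Returns the plain gufunc signature and the set of core dimension
    names that were marked with '+' (i.e. must have length >= 1).
    """
    nonempty = set(re.findall(r"(\w+)\+", signature))
    return signature.replace("+", ""), nonempty


def check_core_dims(signature, *shapes):
    """Raise ValueError if a '+'-marked core dimension has length 0.

    `shapes` holds one shape tuple per input operand.  Returns the plain
    signature when every check passes.
    """
    plain, nonempty = parse_plus_signature(signature)
    inputs = plain.split("->")[0]
    # One parenthesized group of core dimension names per input operand.
    groups = re.findall(r"\(([^)]*)\)", inputs)
    for group, shape in zip(groups, shapes):
        names = [n.strip() for n in group.split(",") if n.strip()]
        if len(shape) < len(names):
            raise ValueError("operand has too few dimensions for %r" % group)
        # Core dimensions are the trailing axes of each operand.
        core = shape[len(shape) - len(names):]
        for name, length in zip(names, core):
            if name in nonempty and length == 0:
                raise ValueError(
                    "core dimension %r must have length at least 1" % name
                )
    return plain


print(parse_plus_signature("(i+)->()"))
print(check_core_dims("(i+)->()", (0, 5)))  # loop dimension 0 is still fine
```

Note that only the core dimension is constrained: an input of shape
(0, 5) passes, while an input of shape (5, 0) would be rejected.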

However, the only meaningful use cases of this enhancement that I've
come up with are these simple reductions.  So I don't know if making
such a change to the signature is worthwhile.  On the other hand,
there are many examples of useful 1-d reductions that aren't the
reduction of an associative binary operation.  It might be worthwhile
to have a new convenience function just for the case '(i)->()', maybe
something like PyUFunc_OneDReduction_FromFuncAndData (ugh, that's
ugly, but I think you get the idea), and that function can have an
argument to specify that the length must be at least 1.
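
As a Python-level illustration of the idea (not the C API function
itself, which does not exist), something like the following sketch
captures the behaviour I have in mind.  `one_d_reduction` and its
`nonempty` argument are hypothetical names, and `np.vectorize` with a
signature is used here only as a slow stand-in for a real compiled
gufunc.

```python
import numpy as np


def one_d_reduction(func, nonempty=False):
    """Wrap a 1-d -> scalar function so it behaves like a '(i)->()' gufunc.

    If `nonempty` is true, an input whose core (last) dimension has
    length 0 raises ValueError before `func` is ever called.
    """
    vectorized = np.vectorize(func, signature="(i)->()")

    def wrapper(x):
        x = np.asarray(x)
        if x.ndim == 0:
            raise ValueError("input must have at least 1 dimension")
        if nonempty and x.shape[-1] == 0:
            raise ValueError("core dimension must have length at least 1")
        return vectorized(x)

    return wrapper


# A peak-to-peak reduction that refuses empty input.
peaktopeak = one_d_reduction(lambda v: v.max() - v.min(), nonempty=True)

print(peaktopeak([1, 5, 3]))
print(peaktopeak([[1, 2, 3], [10, 20, 40]]))
```

The check happens once, outside the loop, so the function that does
the actual work never has to consider the empty case.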

I'll see if that is feasible, but I won't be surprised to learn that
there are good reasons for *not* doing that.

Warren

>
> On Sat, Sep 28, 2019, 17:47 Warren Weckesser <warren.weckesser at gmail.com>
> wrote:
>
>> I'm experimenting with gufuncs, and I just created a simple one with
>> signature '(i)->()'.  Is there a way to configure the gufunc itself so
>> that an empty array results in an error?  Or would I have to create a
>> Python wrapper around the gufunc that does the error checking?
>> Currently, when passed an empty array, the ufunc loop is called with
>> the core dimension associated with i set to 0.  It would be nice if
>> the code didn't get that far, and the ufunc machinery "knew" that this
>> gufunc didn't accept a core dimension that is 0.  I'd like to
>> automatically get an error, something like the error produced by
>> `np.max([])`.
>>
>> Warren