Re: [Numpy-discussion] Forcing gufunc to error with size zero input

29 Sep 2019

On 9/28/19, Eric Wieser wrote:
...
Can you just raise an exception in the gufunc's inner loop? Or is there no
mechanism to do that today?
Maybe?  I don't know the idiomatic way to handle errors detected in
an inner loop.  And pushing this particular error detection into the
inner loop doesn't feel right.
...
I don't think you were proposing that core dimensions should _never_ be
allowed to be 0,
No, I'm not suggesting that.  There are many cases where a length 0
core dimension is fine.

I'm interested in the case where there is not a meaningful definition
of the operation on the empty set.  The mean is an example.  Currently
`np.mean([])` generates two warnings (one useful, the other cryptic
and apparently incidental), and returns nan.  Returning nan is one way
to handle such a case; another is to raise an error like `np.amax([])`
does.  I'd like to raise an error in the example that I'm working on
('peaktopeak' at https://github.com/WarrenWeckesser/npuff).  The
function is a gufunc, not a reduction of a binary operation, so the
'identity' argument of PyUFunc_FromFuncAndDataAndSignature has no
effect.
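
For reference, a small sketch of the two behaviours described above.
It only uses `np.mean`, `np.amax` and the standard `warnings` module;
the exact warning and error text may differ between NumPy versions, so
the messages are printed rather than assumed.

```python
import warnings

import numpy as np

# Collect the warnings emitted by np.mean([]) instead of letting them
# go to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mean_result = np.mean([])

print("np.mean([]) returned:", mean_result)
for w in caught:
    print("  warning:", w.category.__name__, "-", w.message)

# np.amax([]) raises instead of returning a placeholder value.
amax_error = None
try:
    np.amax([])
except ValueError as exc:
    amax_error = exc
    print("np.amax([]) raised ValueError:", exc)
```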
...
but if you were I disagree. I spent a fair amount of work
enabling that for linalg because it provided some convenient base cases.
We could go down the route of augmenting the gufunc signature syntax to
support requiring non-empty dimensions, like we did for optional ones -
although IMO we should consider switching from a string minilanguage to a
structured object specification if we plan to go too much further with
extending it.
After only a quick glance at that code: one option is to add a '+'
after each core dimension name in the signature that must have a
length of at least 1.  So the signature for functions like `mean` (if
you were to reimplement it as a gufunc, and wanted an error instead of
nan), `amax`, `ptp`, etc., would be '(i+)->()'.
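
To make the idea concrete, here is a rough pure-Python sketch of the
check that such a '+' marker might imply.  This is not existing NumPy
behaviour: the '+' syntax is only the proposal above, and
`check_nonempty_core_dims` is a hypothetical helper name invented for
this illustration.

```python
import re

import numpy as np


def check_nonempty_core_dims(signature, *arrays):
    """Hypothetical check: raise if a core dimension marked with '+'
    has length 0 in the corresponding input array."""
    inputs = signature.split("->")[0]
    # One parenthesized group per input operand, e.g. "(i+,j)".
    groups = re.findall(r"\(([^)]*)\)", inputs)
    if len(groups) != len(arrays):
        raise TypeError("signature expects %d input(s), got %d"
                        % (len(groups), len(arrays)))
    for group, arr in zip(groups, arrays):
        names = [n.strip() for n in group.split(",") if n.strip()]
        arr = np.asarray(arr)
        if arr.ndim < len(names):
            raise ValueError("input has too few dimensions for (%s)" % group)
        # The core dimensions are the trailing len(names) axes.
        core_shape = arr.shape[arr.ndim - len(names):]
        for name, size in zip(names, core_shape):
            if name.endswith("+") and size == 0:
                raise ValueError("core dimension %r must have length "
                                 "at least 1" % name.rstrip("+"))


check_nonempty_core_dims("(i+)->()", [1.0, 2.0, 3.0])  # passes silently
check_nonempty_core_dims("(i)->()", [])                # no '+', so allowed
```

With this sketch, `check_nonempty_core_dims("(i+)->()", [])` raises a
ValueError, while a zero-length loop (non-core) dimension is left alone.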

However, the only meaningful use cases of this enhancement that I've
come up with are these simple reductions.  So I don't know if making
such a change to the signature is worthwhile.  On the other hand,
there are many examples of useful 1-d reductions that aren't the
reduction of an associative binary operation.  It might be worthwhile
to have a new convenience function just for the case '(i)->()', maybe
something like PyUFunc_OneDReduction_FromFuncAndData (ugh, that's
ugly, but I think you get the idea), and that function can have an
argument to specify that the length must be at least 1.

I'll see if that is feasible, but I won't be surprised to learn that
there are good reasons for *not* doing that.

Warren
...
On Sat, Sep 28, 2019, 17:47 Warren Weckesser wrote:
...
I'm experimenting with gufuncs, and I just created a simple one with
signature '(i)->()'.  Is there a way to configure the gufunc itself so
that an empty array results in an error?  Or would I have to create a
Python wrapper around the gufunc that does the error checking?
Currently, when passed an empty array, the ufunc loop is called with
the core dimension associated with i set to 0.  It would be nice if
the code didn't get that far, and the ufunc machinery "knew" that this
gufunc didn't accept a core dimension that is 0.  I'd like to
automatically get an error, something like the error produced by
`np.max([])`.
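
The Python-wrapper option mentioned above might look roughly like the
sketch below.  The names are assumptions made for illustration:
`_peaktopeak_core` is a pure-NumPy stand-in for the compiled
'(i)->()' gufunc, and `peaktopeak` is the public wrapper that rejects
an empty core dimension before the inner loop would be reached.

```python
import numpy as np


def _peaktopeak_core(x):
    # Stand-in for the compiled gufunc with signature '(i)->()'.
    return np.max(x, axis=-1) - np.min(x, axis=-1)


def peaktopeak(x):
    """Wrapper that validates the core dimension, then calls the core
    function."""
    x = np.asarray(x)
    if x.ndim == 0:
        raise ValueError("peaktopeak requires an input with at least "
                         "one dimension")
    if x.shape[-1] == 0:
        # The last axis is the core dimension 'i'.
        raise ValueError("peaktopeak: core dimension 'i' must have "
                         "length at least 1")
    return _peaktopeak_core(x)


print(peaktopeak([1, 5, 3]))
print(peaktopeak([[1, 2], [10, 4]]))
```

The drawback, as noted, is that the check lives outside the gufunc, so
anyone calling the underlying gufunc directly bypasses it.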
Warren