[Numpy-discussion] Forcing gufunc to error with size zero input

Warren Weckesser warren.weckesser at gmail.com
Sun Sep 29 00:40:50 EDT 2019


On 9/29/19, Warren Weckesser <warren.weckesser at gmail.com> wrote:
> On 9/28/19, Eric Wieser <wieser.eric+numpy at gmail.com> wrote:
>> Can you just raise an exception in the gufunc's inner loop? Or is there no
>> mechanism to do that today?
>
> Maybe?  I don't know what the idiomatic way is to handle errors
> detected in an inner loop.  And pushing this particular error
> detection into the inner loop doesn't feel right.
>
>
>>
>> I don't think you were proposing that core dimensions should _never_ be
>> allowed to be 0,
>
>
> No, I'm not suggesting that.  There are many cases where a length 0
> core dimension is fine.
>
> I'm interested in the case where there is not a meaningful definition
> of the operation on the empty set.  The mean is an example.  Currently
> `np.mean([])` generates two warnings (one useful, the other cryptic
> and apparently incidental), and returns nan.  Returning nan is one way
> to handle such a case; another is to raise an error like `np.amax([])`
> does.  I'd like to raise an error in the example that I'm working on
> ('peaktopeak' at https://github.com/WarrenWeckesser/npuff).  The
> function is a gufunc, not a reduction of a binary operation, so the
> 'identity' argument of PyUFunc_FromFuncAndDataAndSignature has no
> effect.
>
>> but if you were, I disagree. I spent a fair amount of work
>> enabling that for linalg because it provided some convenient base cases.
>>
>> We could go down the route of augmenting the gufunc signature syntax to
>> support requiring non-empty dimensions, like we did for optional ones -
>> although IMO we should consider switching from a string minilanguage to a
>> structured object specification if we plan to go too much further with
>> extending it.
>
> After only a quick glance at that code: one option is to add a '+'
> after the name of each core dimension in the signature that must
> have a length of at least 1.  So the signature for functions like
> `mean` (if you were to reimplement it as a gufunc, and wanted an
> error instead of nan), `amax`, `ptp`, etc., would be '(i+)->()'.
>
> However, the only meaningful use cases of this enhancement that I've
> come up with are these simple reductions.


Of course, just minutes after sending the email, I realized I *do*
know of other signatures that could benefit from a check on the core
dimension size.  An implementation of Pearson's correlation
coefficient as a gufunc would have signature (i),(i)->(), and the core
dimension i must be at least *2* for the calculation to be well
defined.  Other correlations would also likely require a nonzero core
dimension.
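
To make the "at least 2" requirement concrete, here is a minimal
Python-level sketch of the check.  It is not code from npuff or NumPy:
the name `pearson_r` and the error messages are invented for illustration,
and a real gufunc would do this in C with signature (i),(i)->().

```python
import numpy as np


def pearson_r(x, y):
    """Hypothetical sketch: Pearson's r over the last axis, '(i),(i)->()'."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 0 or y.ndim == 0:
        raise ValueError("pearson_r: inputs must have at least 1 dimension")
    if x.shape[-1] != y.shape[-1]:
        raise ValueError("pearson_r: core dimension i must match in x and y")
    if x.shape[-1] < 2:
        # With fewer than 2 points the deviations from the mean are all
        # zero (or there are none), so r would be 0/0.
        raise ValueError("pearson_r: core dimension i must have length "
                         "at least 2")
    # Deviations from the mean along the core dimension.
    xm = x - x.mean(axis=-1, keepdims=True)
    ym = y - y.mean(axis=-1, keepdims=True)
    num = (xm * ym).sum(axis=-1)
    den = np.sqrt((xm ** 2).sum(axis=-1) * (ym ** 2).sum(axis=-1))
    return num / den
```

Loop dimensions broadcast as usual, so `pearson_r(a, b)` with `a` of shape
(2, 3) and `b` of shape (3,) returns an array of shape (2,).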

Warren



>  So I don't know if making
> such a change to the signature is worthwhile.  On the other hand,
> there are many examples of useful 1-d reductions that aren't the
> reduction of an associative binary operation.  It might be worthwhile
> to have a new convenience function just for the case '(i)->()', maybe
> something like PyUFunc_OneDReduction_FromFuncAndData (ugh, that's
> ugly, but I think you get the idea), and that function can have an
> argument to specify that the length must be at least 1.
>
> I'll see if that is feasible, but I won't be surprised to learn that
> there are good reasons for *not* doing that.
>
> Warren
>
>
>
>>
>> On Sat, Sep 28, 2019, 17:47 Warren Weckesser <warren.weckesser at gmail.com>
>> wrote:
>>
>>> I'm experimenting with gufuncs, and I just created a simple one with
>>> signature '(i)->()'.  Is there a way to configure the gufunc itself so
>>> that an empty array results in an error?  Or would I have to create a
>>> Python wrapper around the gufunc that does the error checking?
>>> Currently, when passed an empty array, the ufunc loop is called with
>>> the core dimension associated with i set to 0.  It would be nice if
>>> the code didn't get that far, and the ufunc machinery "knew" that this
>>> gufunc didn't accept a core dimension that is 0.  I'd like to
>>> automatically get an error, something like the error produced by
>>> `np.max([])`.
>>>
>>> Warren
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>
>

