[Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

Wed Aug 29 04:37:47 EDT 2018

On Fri, Aug 24, 2018 at 4:00 PM, Stephan Hoyer <shoyer at gmail.com> wrote:
> On Fri, Aug 24, 2018 at 3:14 PM Nathaniel Smith <njs at pobox.com> wrote:
>>
>> Yeah, the reason warnings are normally recommended is because
>> normally, you want to make it easy to silence. But this is the rare
>> case where I didn't want to make it easy to silence, so I didn't
>> suggest using a warning :-).
>>
>> Calling warnings.warn (or the C equivalent) is also very expensive,
>> even if the warning ultimately isn't displayed. I guess we could do
>> our own tracking of whether we've displayed the warning yet, and only
>> even attempt to issue it once, but that partially defeats the purpose
>> of using warnings in the first place.
>
>
> I thought the suggestion was to issue a warning when
> np.enable_experimental_array_function() is called. I agree that it's a
> non-starter to issue it every time an __array_function__ method is called --
> warnings are way too slow for that.

If our protection against uninformed usage is a Big Obnoxious
Warning(tm), then I was imagining that we could simplify by dropping
enable_experimental_array_function entirely. Doesn't make a big
difference either way though.

> People can redirect stderr, so we're really not stopping anyone from
> silencing things by doing it in a non-standard way. We're just making it
> annoying and non-standard. Developers could even run Python in a subprocess
> and filter out all the warnings -- there's really nothing we can do to stop
> determined abusers of this feature.
>
> I get that you want to make this annoying and non-standard, but this is too
> extreme for me. Do you seriously imagine that we'll consider ourselves
> beholden in the future to users who didn't take us at our word?

Let's break that question down into two parts:

1. if we do find ourselves in a situation where changing this would
break lots of users, will we consider ourselves beholden to them?
2. is it plausible that we'll find ourselves in that situation?

For the first question, I think the answer is... yes? We constantly
bend over backwards to try to avoid breaking users. Our deprecation
policy explicitly says that it doesn't matter what we say in the docs,
the only thing that matters is whether a change breaks users. And to
make things more complicated, it's easy to imagine scenarios where the
people being broken aren't the ones who had a chance to read the docs
– e.g. if a major package starts relying on __array_function__, then
it's all *their* users who we'd be breaking, even though they had
nothing to do with it. If any of
{tensorflow/astropy/dask/sparse/sklearn/...} did start relying on
__array_function__ for normal functionality, then *of course* that
would come up in future discussions about changing __array_function__,
and *of course* it would make us reluctant to do that. As it should,
because breaking users is bad. We should try to avoid ending up in
situations where that's what we have to do, even if we have a NEP
that we can point to as justification.

But... maybe it's fine anyway, because this situation will never come
up? Obviously I hope that our downstreams are all conscientious, and
friendly, and take good care of their users, and would never create a
situation like that. I'm sure XArray won't :-). But... people are
busy, and distracted, and have pressures to get something shipped, and
corners get cut. Companies *care* about what's right, but they mostly
only *do* the minimum they have to. (Ask anyone who's tried to get
funding for OSS...) Academics *care* about what's right, but they just
don't have time to care much. So yeah... if there's a quick way to
shut up the warning and make things work (or seem to work,
temporarily), versus doing things right by talking to us, then I do
think people might take the quick hack.

The official TensorFlow wheels flat out lie about being manylinux
compatible, and the TensorFlow team has never talked to anyone about
how to fix this; they just upload them to PyPI and leave others to
deal with the fallout [1]. That may well be what's best for their
users, I don't know. But stuff like this is normal, it happens all the
time, and if someone does it with __array_function__ then we have no
leverage.

-n

[1] https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-401703703

-- 
Nathaniel J. Smith -- https://vorpus.org