Hi Nathaniel, 

I appreciate the clarification.  Thank you for that.  For what it's worth, I think that you may overestimate my involvement in the writing of that NEP.  I sat down with Stephan during a Numpy dev meeting and we hacked something together.  Afterwards several other people poured their thoughts into the process.  I'd like to think that my perspective helped to inform this NEP, but it was far from the driving force.  If anyone had the strongest hand in the writing process it would probably be Stephan, who I find generally has a more conservative and careful perspective than I do.

That being said, I do think that Numpy would be wise to move quickly here.  I think that the growing fragmentation that we see in array computing in Numpy (Tensorflow, Torch, Dask, Sparse, CuPy) is largely due to Numpy moving slowly in the past.  There is, I think, a systemic problem slowly erupting now that the community needs to respond to quickly, if it is possible to do so safely.  I believe that Numpy should absolutely be willing to try something experimental, and then say "nope, that was a bad idea" and retract it if it doesn't work out well.  I think that figuring out all of __array_concatenate__, __array_stack__, __array_foo__, etc. for each of the many cases will take too long for us to respond in an effective timeframe.  I believe that we simply don't move quickly enough for this piece-by-piece careful handling of the API to result in Numpy's API becoming a meaningful standard in the broader community in the near future.

That being said, I think that we should engage in this piece-by-piece discussion, and as we figure out each piece we should slowly encroach on __array_function__ and remove functionality from it, much as __array_ufunc__ is not included in it in the current NEP.  Ideally we should get to exactly where you want to get to.  I perceive the __array_function__ protocol as a sort of necessary stop-gap.

All that being said, this is just my personal stance.  I suspect that each of the authors of the NEP and others who engaged in its careful review have a different perspective, which should probably carry more weight than my own.


On Mon, Aug 13, 2018 at 4:29 PM Nathaniel Smith <njs@pobox.com> wrote:
On Mon, Aug 13, 2018 at 2:44 AM, Nathaniel Smith <njs@pobox.com> wrote:
> So this is like... an extreme version of technical debt. You're making
> a deal with the devil for wealth and fame, and then eventually the
> bill becomes due. It's hard for me to say categorically that this is a
> bad idea – empirically, it can be very successful! But there are real
> trade-offs. And it makes me a bit nervous that Matt is the one
> proposing this, because I'm pretty sure if you asked him he'd say he's
> absolutely focused on how to get something working ASAP and has no
> plans to maintain numpy in the future.

Rereading this today I realized that it could come across like I have
an issue with Matt specifically. I apologize to anyone who got that
impression (esp. Matt!) -- that definitely wasn't my intent. Matt is
awesome. I should stop writing these things at 2 am.

What I should have said is:

We have an unusual decision to make here, where there are two
plausible approaches that both have significant upsides and downsides,
and whose effects are going to be distributed in a complicated way
across different parts of our community over time. So the big
challenge is to figure out how to take all that into account and weigh
the needs of different stakeholders against each other.

One major argument for the __array_function__ approach is that it has
an actual NEP, which happened because we have a contributor who took
the lead on making it happen, and who's deeply involved in some of the
target projects like dask and sparse, so can make sure that the
proposal will work well for them. That's a huge advantage! But... it
also makes me a *little* nervous, because when you have really
talented and productive contributors like this it's easy to get swept
up in their perspective. So I want to double-check that we're also
thinking about the stakeholders who can't be as active in the
discussion, like "numpy maintainers from the future".

(And I mostly mean this as a "we should keep this in mind" kind of
thing – like I said in my original post, I think moving forward
implementing __array_function__ is a great idea; I just want to be
cautious about getting more experience before committing.)


Nathaniel J. Smith -- https://vorpus.org
NumPy-Discussion mailing list