[Numpy-discussion] NEP-18 comment

Frederic Bastien fbastien at nvidia.com
Thu Mar 7 09:23:10 EST 2019

I like your idea, Sebastian. This way it is enabled only when needed, and it is invisible to the user at the same time.

Stefan, does this address the potential problem you raised well enough?

It still raises the problem that if you import a library and do not use it, small-array operations would slow down anyway. I do not see a way around that except to tell library developers to call that method not on import, but when the library actually gets used. But I think this can be tricky to do for some libraries.

I prefer a slowdown when you try to use an advanced feature to a slowdown by default, and your proposal does that.
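
To illustrate the "call it when it gets used, not on import" idea, here is a minimal sketch. Everything in it is hypothetical: enable_dispatch() is only a stand-in for whatever hook NumPy might expose, and LazyArray stands in for a library's array type. It is not an existing NumPy API.

```python
# Stand-in for the proposed NumPy hook (hypothetical, not a real NumPy function).
_dispatch_enabled = False


def enable_dispatch():
    """Pretend to switch on override dispatching; idempotent."""
    global _dispatch_enabled
    _dispatch_enabled = True


class LazyArray:
    """Hypothetical array type of a third-party library."""

    def __init__(self, data):
        # Enable dispatching on first real use instead of at import time,
        # so merely importing the library costs small-array users nothing.
        enable_dispatch()
        self.data = list(data)


# Defining (importing) the library has not enabled anything yet.
enabled_after_import = _dispatch_enabled

# Creating the first array flips the switch.
arr = LazyArray([1, 2, 3])
enabled_after_use = _dispatch_enabled
```

The tricky part mentioned above is that a real library may have many entry points, and each one would need to make this call.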


-----Original Message-----
From: NumPy-Discussion <numpy-discussion-bounces+fbastien=nvidia.com at python.org> On Behalf Of Sebastian Berg
Sent: Thursday, March 7, 2019 8:58 AM
To: numpy-discussion at python.org
Subject: Re: [Numpy-discussion] NEP-18 comment

On Wed, 2019-03-06 at 12:41 -0800, Stephan Hoyer wrote:
> On Wed, Mar 6, 2019 at 10:10 AM Frederic Bastien <fbastien at nvidia.com
> > wrote:
> > Hi,
> >  
> > I was told recently about the NEP-18. I like it, but I have a 
> > comment.
> >  
> > At first, it is enabled in a release by setting an environment 
> > variable.
> > Then in the following release, it is enabled by default.
> >  
> > Is it possible to allow the second release to disable it via an 
> > environment variable? This would let people who would be 
> > inconvenienced by this change turn it off.
> >  
> > Who could be negatively impacted by this? Anyone relying on fast small-array operations.
> >  
> > I recall much effort many years ago to speed up small arrays, 
> > including at least one GSoC. I didn’t do timing, but I’m pretty sure 
> > the change needed for NEP-18 raises the default overhead for all 
> > operations that it impacts.
> > I also recall people in the mailing list asking how to speed up the 
> > small array case.
> > So giving people a way to opt out would make sure that people who 
> > care about the small-array case won’t pay that extra overhead. 
> > They won’t see a slowdown by updating 
> > NumPy.
> >  
> > Probably less than 1% of users will enable the new functionality in 
> > the current release, as it is not enabled by default. Making it easy 
> > to try is great, but doesn’t guarantee that it will be tested in all 
> > corner cases. So we can’t be sure we would hear about regressions 
> > caused by this change before the second release.
> >  
> > What do you think about this small change?
> > 
> Hi Frederic,
> Thanks for raising these concerns (and thanks for your work on 
> Theano!).
> I agree, this seems like an easy and worthwhile change. The NEP-18 
> implementation has been rewritten in C, so I expect that the typical 
> overhead will be quite minimal (about 1 us per function call), but I 
> agree that there may be some important edge cases that we missed.

If this is really an issue/worthwhile, how much overhead would be left if you skip the dispatching check and call the original implementation directly? It seems to me that may be quite a bit less than 1us (but maybe still too much)?

The reason I ask is that if we can get a large part of the way with a runtime (instead of import time) switch, such as:

np.enable_array_function_dispatch()  # likely more hidden than that.

then that seems much more useful to me, because dask and friends can just call it and users do not need to know about it. Otherwise, we add an optimization which would only be used by a tiny fraction of numpy users, or it would be used blindly and problems arise when someone starts using e.g. dask later.
And if we disable dispatching by default, dask would have to print a confusing "please set this environment variable" message.

- Sebastian

> The only downside I can think of is that libraries using NEP-18 won't 
> be able to simply rely upon the NumPy version for checking if it's 
> supported -- they will also have to check an environment variable.
> Best,
> Stephan
> >  
> > Thanks for the great work on such an important library.
> >  
> > Frédéric Bastien
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
