Changes to generalized ufunc core dimension checking

Hi everyone, Can you help me understand why the stricter changes to generalized ufunc argument checking no now longer allows scalars to be interpreted as 1-d arrays in the core-dimensions? Is there a way to specify in the core-signature that scalars should be allowed and interpreted in those cases as an array with all the elements the same? This seems like an important feature. Here's an example: myfunc with core-signature (t),(k),(k) -> (t) called with myfunc(arr1, arr2, scalar2). This used to work in 1.9 and before and scalar2 was interpreted as a 1-d array the same size as arr2. It no longer works with 1.10.0 but I don't see why that is an improvement. Thoughts? Is there a work-around that doesn't involve creating a 1-d array the same size as arr2 and filling it with scalar2? Thanks. -Travis -- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

Hi Travis, On Mar 16, 2016 9:52 AM, "Travis Oliphant" <travis@continuum.io> wrote:
Hi everyone,
Can you help me understand why the stricter changes to generalized ufunc
argument checking no now longer allows scalars to be interpreted as 1-d arrays in the core-dimensions?
Is there a way to specify in the core-signature that scalars should be
allowed and interpreted in those cases as an array with all the elements the same? This seems like an important feature. Can you share some example of when this is useful? The reasoning for the change was that broadcasting is really about aligning a set of core elements for parallel looping, and in the gufunc case with arbitrary core kernels that might or might not have any simple loop structure inside them, it's not at all obvious that it makes sense. (Of course we still use broadcasting to line up different instances of the core elements themselves, just not to manufacture the internal shape of the core elements.) In fact there are examples where it clearly doesn't make sense, I don't think we were able to come up with any compelling examples where it did make sense (which is one reason why I'm interested to hear what yours is :-)), and there's not a single obvious way to reconcile broadcasting rules and gufunc's named axis matching, so, when in doubt refuse the temptation to guess. (Example of when it doesn't make sense: matrix_multiply with (n,k),(k,m)->(n,m) used to produce all kinds of different counterintuitive behaviors if one or both of the inputs were scalar or 1d. In this case making it an error is a clear improvement IMHO. And for something like inv that takes a single input with signature (n,n), if you get (1,n) do you broadcast that to (n,n)? If not, why not? For regular broadcasting the question makes no sense but once you have named axis matching then suddenly it's not obvious.)
A better workaround would be to use one of the np.broadcast* functions to request exactly the broadcasting you want and make an arr2-sized view of the scalar. In this case where you presumably (?) want to allow the last two arguments to be broadcast against each other arbitrarily: arr2, arr3 = np.broadcast_arrays(arr2, scalar) myufunc(arr1, arr2, arr3) A little wordier than implicit broadcasting, but not as bad as manually creating an array, and like implicit broadcasting the memory overhead is O(1) instead of O(size). -n

On Wed, Mar 16, 2016 at 12:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
Being able to implicitly broadcast scalars to arrays is the core-function of broadcasting. This is still very useful when you have a core-kernel an want to pass in a scalar for many of the arguments. It seems that at least in that case, automatic broadcasting should be allowed --- as it seems clear what is meant. While you can use the broadcast* features to get the same effect with the current code-base, this is not intuitive to a user who is used to having scalars interpreted as arrays in other NumPy operations. It used to automatically happen and a few people depended on it in several companies and so the 1.10 release broke their code. I can appreciate that in the general case, allowing arbitrary broadcasting on the internal core dimensions can create confusion. But, scalar broadcasting still makes sense. A better workaround would be to use one of the np.broadcast* functions to
Thanks for the pointer (after I wrote the email this solution also occured to me). I think adding back automatic broadcasting for the scalar case makes a lot of sense as well, however. What do people think of that? Also adding this example to the documentation as a work-around for people whose code breaks with the new changes. Thanks, -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

On Wed, Mar 16, 2016 at 1:48 PM, Travis Oliphant <travis@continuum.io> wrote:
The `@` operator doesn't allow that.
Mixing array multiplications with scalar broadcasting is looking for trouble. Array multiplication needs strict dimensions and having stacked arrays and vectors was one of the prime objectives of gufuncs. Perhaps what we need is a more precise notation for broadcasting, maybe `*` or some such addition to the signaturs to indicate that scalar broadcasting is acceptable. <snip> Chuck

On Wed, Mar 16, 2016 at 3:07 PM, Charles R Harris <charlesr.harris@gmail.com
wrote:
I think that is a good idea. Let the user decide if scalar broadcasting is acceptable for their function. Here is a simple concrete example where scalar broadcasting makes sense: A 1-d dot product (the core of np.inner) (k), (k) -> () A user would assume they could call this function with a scalar in either argument and have it broadcast to a 1-d array. Of course, if both arguments are scalars, then it doesn't make sense. Having a way for the user to allow scalar broadcasting seems sensible and a nice compromise. -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

Hi, Here is another example. To write pix2ang (and similar functions) to a ufunc, one may want to have implicit scalar broadcast on `nested` and `nsides` arguments. The function is described here: http://healpy.readthedocs.org/en/latest/generated/healpy.pixelfunc.pix2ang.h... Yu On Thu, Mar 17, 2016 at 2:04 AM, Travis Oliphant <travis@continuum.io> wrote:

On Mar 17, 2016 1:22 AM, "Feng Yu" <rainwoodman@gmail.com> wrote:
implicit scalar broadcast on `nested` and `nsides` arguments.
The function is described here:
http://healpy.readthedocs.org/en/latest/generated/healpy.pixelfunc.pix2ang.h... Sorry, can you elaborate on what that function does, maybe give an example, for those of us who haven't used healpy before? I can't quite understand from that page, but am interested... -n

Hi, ang2pix is used in astronomy to pixelize coordinate in forms of (theta, phi). healpy is a binding of healpix (http://healpix.sourceforge.net/, introduction there too), plus a lot of more extra features or bloat (and I am not particular fond of this aspect of healpy). It gets the work done. You can think of the function ang2pix as nump.digitize for angular input. 'nside' and 'nest' controls the number of pixels and the ordering of pixels (since it is 2d to linear index). The important thing here is ang2pix is a pure function from (nside, nest, theta, phi) to pixelid, so in principle it can be written as a ufunc to extend the functionality to generate pixel ids for different nside and nest settings in the same function call. There are probably functions in numpy that can benefit from this as well, but I can't immediately think of any. Yu On Thu, Mar 17, 2016 at 8:09 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

On Thu, Mar 17, 2016 at 2:04 PM, Feng Yu <rainwoodman@gmail.com> wrote:
Thanks for the details! the question is about how to handle these "inner" dimensions. In this case it sounds like (nside, nest, theta, phi) are 4 scalars, right? So this would just be a regular ufunc, and the whole issue doesn't arise. Broadcast all you like :-) -n -- Nathaniel J. Smith -- https://vorpus.org

On Thu, Mar 17, 2016 at 1:04 AM, Travis Oliphant <travis@continuum.io> wrote:
To generalize a little bit, consider the entire family of weighted statistical function (mean, std, median, etc.). For example, the gufunc version of np.average is basically equivalent to np.inner with a bit of preprocessing. Arguably, it *could* make sense to broadcast weights when given a scalar: np.average(values, weights=1.0 / len(values)) is pretty unambiguous. That said, adding an explicit "scalar broadcasting OK flag" seems like a hack that will need even more special logic (e.g., so we can error if both arguments to np.inner are scalars). Multiple dispatch for gufunc core signatures seems like the cleaner solution. If you want np.inner to handle scalars, you need to supply core signatures (k),()->() and (),(k)->() along with (k),(k)->(). This is the similar to vision of three core signatures for np.matmul: (i),(i,j)->(j), (i,j),(j)->(i) and (i,j),(j,k)->(i,k). Maybe someone will even eventually get around to adding an axis/axes argument so we can specify these core dimensions explicitly. Writing np.inner(a, b, axes=((-1,), ())) could trigger the (k),()->() signature even if the second argument is not a scalar (it should be broadcast against "a" instead).

On Thu, Mar 17, 2016 at 4:41 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
That's a great idea! Adding multiple-dispatch capability for this case could also solve a lot of issues that right now prevent generalized ufuncs from being the mechanism of implementation of *all* NumPy functions. -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

On Thu, Mar 17, 2016 at 2:49 PM, Travis Oliphant <travis@continuum.io> wrote:
For future reference, there's already an issue on GitHub about adding an axis argument to gufuncs: https://github.com/numpy/numpy/issues/5197 (see also the referenced mailing list discussion from that page.)

On Thu, Mar 17, 2016 at 10:41 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
Would the logic for such a thing be consistent? E.g. how do you decide if the user is requesting (k),(k)->(), or (k),()->() with broadcasting over a non-core dimension of size k in the second argument? What if your signatures are (m, k),(k)->(m) and (k),(n,k)->(n) and your two inputs are (m,k) and (n,k), how do you decide which one to call? Or alternatively, how do you detect and forbid such ambiguous designs? Figuring out the dispatch rules for the general case seems like a non-trivial problem to me. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

On Thu, Mar 17, 2016 at 3:28 PM, Jaime Fernández del Río < jaime.frio@gmail.com> wrote:
I would require a priority order for the core signatures when the gufunc is created and only allow one implementation per argument dimension in the core signature (i.e., disallow multiple implementations like (k),(k)->() and (k),(m)->()). The rule would be to dispatch to the implementation with the first core signature with the right number of axes. The later constraint ensures that (m,n) @ (k,n) errors if k != n, rather than attempting vectorized matrix-vector multiplication. For matmul/@, the priority order is pretty straightforward: 1. (m,n),(n,k)->(n,k) 2. (m,n),(n)->(m) 3. (m),(m,n)->(n) 4. (m),(m)->() (2 and 3 could be safely interchanged.) For scenarios like "(k),(k)->(), or (k),()->()", the only reasonable choice would be to put (k),(k)->() first -- otherwise it never gets called. For the other ambiguous case, "(m, k),(k)->(m) and (k),(n,k)->(n)", the implementer would also need to pick an order. Most of the tricky cases for multiple dispatch arise from extensible systems (e.g., Matthew Rocklin's multipledispatch library), where you allow/encourage third party libraries to add their own implementations and need to be sure the combined result is still consistent. I wouldn't suggest such a system for NumPy -- I think it's fine to require every gufunc to have a single owner. There are other solutions for allowing extensibility to duck array types (namely, __numpy_ufunc__).

On Wed, Mar 16, 2016 at 12:48 PM, Travis Oliphant <travis@continuum.io> wrote:
It isn't at all obvious what matrix_multiply(some_arr, 3) should mean, and the behavior you're proposing is definitely not what most people would expect when seeing that call. (I guess most people reading this would expect it to be equivalent to do a scalar multiplication like some_arr * 3, but in numpy 1.9 and earlier it did... I'm not quite sure what, actually, maybe np.sum(some_arr * 3, axis=1)?) Again, can you share some example(s) of a gufunc where this is what is wanted, so that we have a more concrete basis for discussion?
That's certainly unfortunate -- I think one of the reasons we elected to push the change through directly instead of going through a deprecation cycle was that as far as we could tell, this feature was both broken and unused and was only noticed recently because gufunc usage was only just starting to increase -- so we wanted to get rid of it quickly before people started inadvertently depending on it. (And clear the ground for any more-carefully-thought-through option that might arise in the future.) Sounds like a real deprecation cycle would have been better. For reference: Mailing list discussion: http://thread.gmane.org/gmane.comp.python.numeric.general/58824 Pull request: https://github.com/numpy/numpy/pull/5077 -n -- Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>

On 03/16/2016 06:28 PM, Nathaniel Smith wrote:
... Sounds like a real deprecation cycle would have been better.
IMHO for a library as venerable and widely-used as Numpy, a deprecation cycle is almost always better ... consider this a lesson learned.
Also better if the "discussion" has a more descriptive subject than "Is this a bug?" ... My 2c. Steve

On Wed, Mar 16, 2016 at 3:45 PM, Steve Waterbury <waterbug@pangalactic.us> wrote:
Mandatory XKCD - https://xkcd.com/1172 We recently had a discussion about a similar "nobody we know uses nor should use this api" situation in IPython, and ultimately decided that xkcd 1172 would hit us, so opted in this case just for creating new cleaner APIs + possibly doing slow deprecation of the old stuff. For a widely used library, if the code exists then someone, somewhere depends on it, regardless of how broken or obscure you think the feature is. We just have to live with that. Cheers, f -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail

On Wed, Mar 16, 2016 at 7:32 PM, Fernando Perez <fperez.net@gmail.com> wrote:
Sure, the point is well taken, and we've been working hard to make numpy's change policy more consistent, rigorous, and useful -- and this remains very much a work in progress. But engineering is fundamentally the art of making trade-offs, which are always messy and context-dependent. I actually rather like XKCD 1172 because it's a Rorschach test -- you can just as easily read it as saying that you should start by accepting that all changes will break something, and then move on to the more interesting discussion of which things are you going to break and what are the trade-offs. :-) Like, in this case: Our general policy is definitely to use a deprecation cycle. And if you look at the discussions back in September, we also considered options like deprecating-then-removing the broadcasting, or adding a new gufunc-registration-API that would enable the new behavior while preserving the old behavior for old users. Both sound appealing initially. But they both also have serious downsides: they mean preserving broken behavior, possibly indefinitely, which *also* breaks users' code. Notice that if we add a new API in 1.10, then most people won't actually switch until 1.10 becomes the *minimum* version they support, except they probably will forget then too. And we have to maintain both APIs indefinitely. And I'm dubious that the kind of anonymous corporate users who got broken by this would have noticed a deprecation warning -- during the 1.11 cycle we had to go around pointing out some year+ old deprecation warnings to all our really active and engaged downstreams, never mind the ones we only hear about months later through Travis :-/. In this particular case, as far as we could tell, all single existing users were actually *broken* by the current behavior, so it was a question of whether we should fix a bunch of real cases but risk breaking some rare/theoretical ones, or should we leave a bunch of real cases broken for a while in order to be gentler to the rare/theoretical ones. And would we have ever even learned about these cases that Travis's clients are running into if we hadn't broken things? And so forth. It's obviously true that we make mistakes and should try to learn from them to do better in the future, but I object to the idea that there are simple and neat answers available :-). (And sometimes in my more pessimistic moments I feel like a lot of the conventional rules for change management are really technology for shifting around blame :-/. "We had a deprecation period that you didn't notice; your code broke the same either way, but our use of a deprecation period means that now it's *your* fault". Or, in the opposite situation: "Sure, this API doesn't work in the way that anyone would actually expect or want, but if we fix it then it will be *our* fault when existing code breaks -- OTOH if we leave the brokenness there but document it, then it'll be *your* fault when you -- totally predictably -- fall into the trap we've left for you". IMO in both situations it's much healthier to skip the blame games and take responsibility for the actual end result whatever it is, and if that means that some kind of failure is inevitable, then oh well, lets think about how to optimize our process for minimum net failure given imperfect information and finite resources :-). Is there some way we can help downstreams notice deprecations earlier? It's a lot easier to measure the cost of making a change than of not making a change -- is there some way we can gather more data to correct for this bias? IMO *these* are the really interesting questions, and they're ones that we've been actively working on.) -n -- Nathaniel J. Smith -- https://vorpus.org

On 03/16/2016 10:32 PM, Fernando Perez wrote:
Ha, I love that xkcd! But not sure I agree that it applies here ... however, I do appreciate your sharing it. :D I mean, just change stuff and see who screams, right? ;) Steve

On Thu, Mar 17, 2016 at 1:08 AM, Steve Waterbury <waterbug@pangalactic.us> wrote:
No, it's change stuff and listen to whether anybody screams. I'm sometimes late in (politely) screaming because deprecation warning are either filtered out or I'm using ancient numpy in my development python, or for whatever other reason I don't see warnings. Josef

Hi Travis, On Mar 16, 2016 9:52 AM, "Travis Oliphant" <travis@continuum.io> wrote:
Hi everyone,
Can you help me understand why the stricter changes to generalized ufunc
argument checking no now longer allows scalars to be interpreted as 1-d arrays in the core-dimensions?
Is there a way to specify in the core-signature that scalars should be
allowed and interpreted in those cases as an array with all the elements the same? This seems like an important feature. Can you share some example of when this is useful? The reasoning for the change was that broadcasting is really about aligning a set of core elements for parallel looping, and in the gufunc case with arbitrary core kernels that might or might not have any simple loop structure inside them, it's not at all obvious that it makes sense. (Of course we still use broadcasting to line up different instances of the core elements themselves, just not to manufacture the internal shape of the core elements.) In fact there are examples where it clearly doesn't make sense, I don't think we were able to come up with any compelling examples where it did make sense (which is one reason why I'm interested to hear what yours is :-)), and there's not a single obvious way to reconcile broadcasting rules and gufunc's named axis matching, so, when in doubt refuse the temptation to guess. (Example of when it doesn't make sense: matrix_multiply with (n,k),(k,m)->(n,m) used to produce all kinds of different counterintuitive behaviors if one or both of the inputs were scalar or 1d. In this case making it an error is a clear improvement IMHO. And for something like inv that takes a single input with signature (n,n), if you get (1,n) do you broadcast that to (n,n)? If not, why not? For regular broadcasting the question makes no sense but once you have named axis matching then suddenly it's not obvious.)
A better workaround would be to use one of the np.broadcast* functions to request exactly the broadcasting you want and make an arr2-sized view of the scalar. In this case where you presumably (?) want to allow the last two arguments to be broadcast against each other arbitrarily: arr2, arr3 = np.broadcast_arrays(arr2, scalar) myufunc(arr1, arr2, arr3) A little wordier than implicit broadcasting, but not as bad as manually creating an array, and like implicit broadcasting the memory overhead is O(1) instead of O(size). -n

On Wed, Mar 16, 2016 at 12:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
Being able to implicitly broadcast scalars to arrays is the core-function of broadcasting. This is still very useful when you have a core-kernel an want to pass in a scalar for many of the arguments. It seems that at least in that case, automatic broadcasting should be allowed --- as it seems clear what is meant. While you can use the broadcast* features to get the same effect with the current code-base, this is not intuitive to a user who is used to having scalars interpreted as arrays in other NumPy operations. It used to automatically happen and a few people depended on it in several companies and so the 1.10 release broke their code. I can appreciate that in the general case, allowing arbitrary broadcasting on the internal core dimensions can create confusion. But, scalar broadcasting still makes sense. A better workaround would be to use one of the np.broadcast* functions to
Thanks for the pointer (after I wrote the email this solution also occured to me). I think adding back automatic broadcasting for the scalar case makes a lot of sense as well, however. What do people think of that? Also adding this example to the documentation as a work-around for people whose code breaks with the new changes. Thanks, -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

On Wed, Mar 16, 2016 at 1:48 PM, Travis Oliphant <travis@continuum.io> wrote:
The `@` operator doesn't allow that.
Mixing array multiplications with scalar broadcasting is looking for trouble. Array multiplication needs strict dimensions and having stacked arrays and vectors was one of the prime objectives of gufuncs. Perhaps what we need is a more precise notation for broadcasting, maybe `*` or some such addition to the signaturs to indicate that scalar broadcasting is acceptable. <snip> Chuck

On Wed, Mar 16, 2016 at 3:07 PM, Charles R Harris <charlesr.harris@gmail.com
wrote:
I think that is a good idea. Let the user decide if scalar broadcasting is acceptable for their function. Here is a simple concrete example where scalar broadcasting makes sense: A 1-d dot product (the core of np.inner) (k), (k) -> () A user would assume they could call this function with a scalar in either argument and have it broadcast to a 1-d array. Of course, if both arguments are scalars, then it doesn't make sense. Having a way for the user to allow scalar broadcasting seems sensible and a nice compromise. -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

Hi, Here is another example. To write pix2ang (and similar functions) to a ufunc, one may want to have implicit scalar broadcast on `nested` and `nsides` arguments. The function is described here: http://healpy.readthedocs.org/en/latest/generated/healpy.pixelfunc.pix2ang.h... Yu On Thu, Mar 17, 2016 at 2:04 AM, Travis Oliphant <travis@continuum.io> wrote:

On Mar 17, 2016 1:22 AM, "Feng Yu" <rainwoodman@gmail.com> wrote:
implicit scalar broadcast on `nested` and `nsides` arguments.
The function is described here:
http://healpy.readthedocs.org/en/latest/generated/healpy.pixelfunc.pix2ang.h... Sorry, can you elaborate on what that function does, maybe give an example, for those of us who haven't used healpy before? I can't quite understand from that page, but am interested... -n

Hi, ang2pix is used in astronomy to pixelize coordinate in forms of (theta, phi). healpy is a binding of healpix (http://healpix.sourceforge.net/, introduction there too), plus a lot of more extra features or bloat (and I am not particular fond of this aspect of healpy). It gets the work done. You can think of the function ang2pix as nump.digitize for angular input. 'nside' and 'nest' controls the number of pixels and the ordering of pixels (since it is 2d to linear index). The important thing here is ang2pix is a pure function from (nside, nest, theta, phi) to pixelid, so in principle it can be written as a ufunc to extend the functionality to generate pixel ids for different nside and nest settings in the same function call. There are probably functions in numpy that can benefit from this as well, but I can't immediately think of any. Yu On Thu, Mar 17, 2016 at 8:09 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

On Thu, Mar 17, 2016 at 2:04 PM, Feng Yu <rainwoodman@gmail.com> wrote:
Thanks for the details! the question is about how to handle these "inner" dimensions. In this case it sounds like (nside, nest, theta, phi) are 4 scalars, right? So this would just be a regular ufunc, and the whole issue doesn't arise. Broadcast all you like :-) -n -- Nathaniel J. Smith -- https://vorpus.org

On Thu, Mar 17, 2016 at 1:04 AM, Travis Oliphant <travis@continuum.io> wrote:
To generalize a little bit, consider the entire family of weighted statistical function (mean, std, median, etc.). For example, the gufunc version of np.average is basically equivalent to np.inner with a bit of preprocessing. Arguably, it *could* make sense to broadcast weights when given a scalar: np.average(values, weights=1.0 / len(values)) is pretty unambiguous. That said, adding an explicit "scalar broadcasting OK flag" seems like a hack that will need even more special logic (e.g., so we can error if both arguments to np.inner are scalars). Multiple dispatch for gufunc core signatures seems like the cleaner solution. If you want np.inner to handle scalars, you need to supply core signatures (k),()->() and (),(k)->() along with (k),(k)->(). This is the similar to vision of three core signatures for np.matmul: (i),(i,j)->(j), (i,j),(j)->(i) and (i,j),(j,k)->(i,k). Maybe someone will even eventually get around to adding an axis/axes argument so we can specify these core dimensions explicitly. Writing np.inner(a, b, axes=((-1,), ())) could trigger the (k),()->() signature even if the second argument is not a scalar (it should be broadcast against "a" instead).

On Thu, Mar 17, 2016 at 4:41 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
That's a great idea! Adding multiple-dispatch capability for this case could also solve a lot of issues that right now prevent generalized ufuncs from being the mechanism of implementation of *all* NumPy functions. -Travis
-- *Travis Oliphant, PhD* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io

On Thu, Mar 17, 2016 at 2:49 PM, Travis Oliphant <travis@continuum.io> wrote:
For future reference, there's already an issue on GitHub about adding an axis argument to gufuncs: https://github.com/numpy/numpy/issues/5197 (see also the referenced mailing list discussion from that page.)

On Thu, Mar 17, 2016 at 10:41 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
Would the logic for such a thing be consistent? E.g. how do you decide if the user is requesting (k),(k)->(), or (k),()->() with broadcasting over a non-core dimension of size k in the second argument? What if your signatures are (m, k),(k)->(m) and (k),(n,k)->(n) and your two inputs are (m,k) and (n,k), how do you decide which one to call? Or alternatively, how do you detect and forbid such ambiguous designs? Figuring out the dispatch rules for the general case seems like a non-trivial problem to me. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

On Thu, Mar 17, 2016 at 3:28 PM, Jaime Fernández del Río < jaime.frio@gmail.com> wrote:
I would require a priority order for the core signatures when the gufunc is created and only allow one implementation per argument dimension in the core signature (i.e., disallow multiple implementations like (k),(k)->() and (k),(m)->()). The rule would be to dispatch to the implementation with the first core signature with the right number of axes. The later constraint ensures that (m,n) @ (k,n) errors if k != n, rather than attempting vectorized matrix-vector multiplication. For matmul/@, the priority order is pretty straightforward: 1. (m,n),(n,k)->(n,k) 2. (m,n),(n)->(m) 3. (m),(m,n)->(n) 4. (m),(m)->() (2 and 3 could be safely interchanged.) For scenarios like "(k),(k)->(), or (k),()->()", the only reasonable choice would be to put (k),(k)->() first -- otherwise it never gets called. For the other ambiguous case, "(m, k),(k)->(m) and (k),(n,k)->(n)", the implementer would also need to pick an order. Most of the tricky cases for multiple dispatch arise from extensible systems (e.g., Matthew Rocklin's multipledispatch library), where you allow/encourage third party libraries to add their own implementations and need to be sure the combined result is still consistent. I wouldn't suggest such a system for NumPy -- I think it's fine to require every gufunc to have a single owner. There are other solutions for allowing extensibility to duck array types (namely, __numpy_ufunc__).

On Wed, Mar 16, 2016 at 12:48 PM, Travis Oliphant <travis@continuum.io> wrote:
It isn't at all obvious what matrix_multiply(some_arr, 3) should mean, and the behavior you're proposing is definitely not what most people would expect when seeing that call. (I guess most people reading this would expect it to be equivalent to do a scalar multiplication like some_arr * 3, but in numpy 1.9 and earlier it did... I'm not quite sure what, actually, maybe np.sum(some_arr * 3, axis=1)?) Again, can you share some example(s) of a gufunc where this is what is wanted, so that we have a more concrete basis for discussion?
That's certainly unfortunate -- I think one of the reasons we elected to push the change through directly instead of going through a deprecation cycle was that as far as we could tell, this feature was both broken and unused and was only noticed recently because gufunc usage was only just starting to increase -- so we wanted to get rid of it quickly before people started inadvertently depending on it. (And clear the ground for any more-carefully-thought-through option that might arise in the future.) Sounds like a real deprecation cycle would have been better. For reference: Mailing list discussion: http://thread.gmane.org/gmane.comp.python.numeric.general/58824 Pull request: https://github.com/numpy/numpy/pull/5077 -n -- Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>

On 03/16/2016 06:28 PM, Nathaniel Smith wrote:
... Sounds like a real deprecation cycle would have been better.
IMHO for a library as venerable and widely-used as Numpy, a deprecation cycle is almost always better ... consider this a lesson learned.
Also better if the "discussion" has a more descriptive subject than "Is this a bug?" ... My 2c. Steve

On Wed, Mar 16, 2016 at 3:45 PM, Steve Waterbury <waterbug@pangalactic.us> wrote:
Mandatory XKCD - https://xkcd.com/1172 We recently had a discussion about a similar "nobody we know uses nor should use this api" situation in IPython, and ultimately decided that xkcd 1172 would hit us, so opted in this case just for creating new cleaner APIs + possibly doing slow deprecation of the old stuff. For a widely used library, if the code exists then someone, somewhere depends on it, regardless of how broken or obscure you think the feature is. We just have to live with that. Cheers, f -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail

On Wed, Mar 16, 2016 at 7:32 PM, Fernando Perez <fperez.net@gmail.com> wrote:
Sure, the point is well taken, and we've been working hard to make numpy's change policy more consistent, rigorous, and useful -- and this remains very much a work in progress. But engineering is fundamentally the art of making trade-offs, which are always messy and context-dependent. I actually rather like XKCD 1172 because it's a Rorschach test -- you can just as easily read it as saying that you should start by accepting that all changes will break something, and then move on to the more interesting discussion of which things are you going to break and what are the trade-offs. :-) Like, in this case: Our general policy is definitely to use a deprecation cycle. And if you look at the discussions back in September, we also considered options like deprecating-then-removing the broadcasting, or adding a new gufunc-registration-API that would enable the new behavior while preserving the old behavior for old users. Both sound appealing initially. But they both also have serious downsides: they mean preserving broken behavior, possibly indefinitely, which *also* breaks users' code. Notice that if we add a new API in 1.10, then most people won't actually switch until 1.10 becomes the *minimum* version they support, except they probably will forget then too. And we have to maintain both APIs indefinitely. And I'm dubious that the kind of anonymous corporate users who got broken by this would have noticed a deprecation warning -- during the 1.11 cycle we had to go around pointing out some year+ old deprecation warnings to all our really active and engaged downstreams, never mind the ones we only hear about months later through Travis :-/. In this particular case, as far as we could tell, all single existing users were actually *broken* by the current behavior, so it was a question of whether we should fix a bunch of real cases but risk breaking some rare/theoretical ones, or should we leave a bunch of real cases broken for a while in order to be gentler to the rare/theoretical ones. And would we have ever even learned about these cases that Travis's clients are running into if we hadn't broken things? And so forth. It's obviously true that we make mistakes and should try to learn from them to do better in the future, but I object to the idea that there are simple and neat answers available :-). (And sometimes in my more pessimistic moments I feel like a lot of the conventional rules for change management are really technology for shifting around blame :-/. "We had a deprecation period that you didn't notice; your code broke the same either way, but our use of a deprecation period means that now it's *your* fault". Or, in the opposite situation: "Sure, this API doesn't work in the way that anyone would actually expect or want, but if we fix it then it will be *our* fault when existing code breaks -- OTOH if we leave the brokenness there but document it, then it'll be *your* fault when you -- totally predictably -- fall into the trap we've left for you". IMO in both situations it's much healthier to skip the blame games and take responsibility for the actual end result whatever it is, and if that means that some kind of failure is inevitable, then oh well, lets think about how to optimize our process for minimum net failure given imperfect information and finite resources :-). Is there some way we can help downstreams notice deprecations earlier? It's a lot easier to measure the cost of making a change than of not making a change -- is there some way we can gather more data to correct for this bias? IMO *these* are the really interesting questions, and they're ones that we've been actively working on.) -n -- Nathaniel J. Smith -- https://vorpus.org

On 03/16/2016 10:32 PM, Fernando Perez wrote:
Ha, I love that xkcd! But not sure I agree that it applies here ... however, I do appreciate your sharing it. :D I mean, just change stuff and see who screams, right? ;) Steve

On Thu, Mar 17, 2016 at 1:08 AM, Steve Waterbury <waterbug@pangalactic.us> wrote:
No, it's change stuff and listen to whether anybody screams. I'm sometimes late in (politely) screaming because deprecation warning are either filtered out or I'm using ancient numpy in my development python, or for whatever other reason I don't see warnings. Josef
participants (10)
-
Charles R Harris
-
Feng Yu
-
Fernando Perez
-
Jaime Fernández del Río
-
josef.pktd@gmail.com
-
Joseph Fox-Rabinovitz
-
Nathaniel Smith
-
Stephan Hoyer
-
Steve Waterbury
-
Travis Oliphant