Integers to negative integer powers, time for a decision.

Hi All,

The time for NumPy 1.12.0 approaches and I would like to have a final decision on the treatment of integers to negative integer powers with the `**` operator. The two alternatives looked to be:

*Raise an error for arrays and numpy scalars, including 1 and -1 to negative powers.*

*Pluses*
- Backward compatible
- Allows common powers to be integer, e.g., arange(3)**2
- Consistent with inplace operators
- Fixes current wrong behavior
- Preserves type

*Minuses*
- Integer overflow
- Computational inconvenience
- Inconsistent with Python integers

*Always return a float*

*Pluses*
- Computational convenience

*Minuses*
- Loss of type
- Possible backward incompatibilities
- Not applicable to inplace operators

Thoughts?

Chuck
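The "allows common powers to be integer" plus can be seen in a minimal sketch (assuming NumPy's current int-preserving behavior for non-negative exponents; the variable names here are illustrative):

```python
import numpy as np

# Positive integer powers of integer arrays stay integer,
# which option 1 (raise an error) would preserve:
squares = np.arange(3) ** 2
print(squares.dtype.kind)  # 'i' (an integer dtype)
print(list(squares))       # [0, 1, 4]
```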

On Fri, Oct 7, 2016 at 9:12 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
2: +1

I'm still in favor of number 2: less buggy code and less mental gymnastics (watch out for that int, or which int do I need). (Upcasting is not applicable for any inplace operators, AFAIU: <int> *= 0.5?)

zz = np.arange(5)
zz**(-1)
zz *= 0.5

tried in
Josef

On 10/7/2016 9:12 PM, Charles R Harris wrote:
Is the behavior of C++11 of any relevance to the choice? http://www.cplusplus.com/reference/cmath/pow/ Alan Isaac

Hi all, Just to have the options clear. Is the operator '**' going to be handled in any different manner than pow? Thanks. Armando

On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I guess I could be wrong, but I think the backwards incompatibilities are going to be *way* too severe to make option 2 possible in practice. -n -- Nathaniel J. Smith -- https://vorpus.org

On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith <njs@pobox.com> wrote:
Backwards compatibility is also a major concern for me. Here are my current thoughts:

- Add an fpow ufunc that always converts to float; it would not accept object arrays.
- Raise errors in the current power ufunc (`**`) for ints to negative ints.

The power ufunc will change in the following ways:

- +1, -1 to negative ints will error; currently they work.
- n > 1 ints to negative ints will error; currently they warn and return zero.
- 0 to negative ints will error; they currently return the minimum integer.

The `**` operator currently calls the power ufunc; leave that as is for backward (almost) compatibility. The remaining question is numpy scalars, which we can make either compatible with Python or with NumPy arrays. I'm leaning towards NumPy array compatibility, mostly on account of type preservation and the close relationship between zero-dimensional arrays and scalars.

The fpow function could be backported to NumPy 1.11 if that would be helpful going forward.

Chuck
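A rough sketch of the proposed fpow semantics. The function name and implementation below are my illustration, not the final API (a ufunc along these lines later shipped as np.float_power): the point is simply "always compute in floating point, regardless of input dtypes":

```python
import numpy as np

def fpow_sketch(base, exp):
    """Illustrative stand-in for the proposed fpow: upcast both
    operands to float64 before calling the power ufunc."""
    base = np.asarray(base, dtype=np.float64)
    exp = np.asarray(exp, dtype=np.float64)
    return np.power(base, exp)

# Negative integer exponents now give sensible float results
# instead of zeros: approximately [0.5, 0.333..., 0.25]
r = fpow_sketch(np.arange(2, 5), -1)
print(r)
```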

On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Maybe call it `fpower` or even `float_power`, for consistency with `power`?
Sounds good to me. I agree that we should prioritize within-numpy consistency over consistency with Python.
The fpow function could be backported to NumPy 1.11 if that would be helpful going forward.
I'm not a big fan of this kind of backport. Violating the "bug-fixes-only" rule makes it hard for people to understand our release versions. And it creates the situation where people can write code that they think requires numpy 1.11 (because it works with their numpy 1.11!), but then breaks on other people's computers (because those users have 1.11.(x-1)). And if there's some reason why people aren't willing to upgrade to 1.12 for new features, then probably better to spend energy addressing those instead of on putting together 1.11-and-a-half releases. -n -- Nathaniel J. Smith -- https://vorpus.org

On Sat, Oct 8, 2016 at 9:12 AM, Nathaniel Smith <njs@pobox.com> wrote:
The power ufunc is updated in https://github.com/numpy/numpy/pull/8127.

Sounds good to me. I agree that we should prioritize within-numpy consistency over consistency with Python.
I agree with that. Because of numpy consistency, the `**` operator should always return float. Right now the case is:
aa = np.arange(2, 10, dtype=int)
    array([2, 3, 4, 5, 6, 7, 8, 9])
bb = np.linspace(0, 7, 8, dtype=int)
    array([0, 1, 2, 3, 4, 5, 6, 7])
aa**-1
    array([0, 0, 0, 0, 0, 0, 0, 0])
aa**-2
    array([0, 0, 0, 0, 0, 0, 0, 0])
aa**(-bb)
    array([1, 0, 0, 0, 0, 0, 0, 0])
For me this behaviour is confusing. But I am not an expert, just a user. I can live with anything if I know what to expect. And I greatly appreciate the work of every developer on this excellent package.
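For reference, the error-raising option under discussion is what NumPy ultimately shipped in 1.12: on a recent NumPy, the examples above raise instead of silently returning zeros. A quick check (assuming NumPy >= 1.12):

```python
import numpy as np

aa = np.arange(2, 10, dtype=int)
try:
    aa ** -1
    outcome = "no error"
except ValueError:
    # "Integers to negative integer powers are not allowed."
    outcome = "ValueError"
print(outcome)
```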

On Sat, Oct 8, 2016 at 1:31 PM, Krisztián Horváth <raksi.raksi@gmail.com> wrote:
Can't do that and also return integers for positive powers. It isn't possible to have behavior completely compatible with Python for arrays: can't have mixed-type returns, can't have arbitrary-precision integers. Chuck

Well, testing under Windows 64-bit, Python 3.5.2, positive powers of integers give integers and negative powers of integers give floats. So, do you want to raise an exception when taking a negative power of an element of an array of integers? Because not doing so would be inconsistent with raising the exception when applying the same operation to the array.

Clearly things are broken now (I get zeros when calculating negative powers of numpy arrays of integers other than 1), but that behavior was consistent with Python itself under Python 2.x, because the division of two integers was an integer. That does not hold under Python 3.5, where the division of two integers is a float.

You have offered either to raise an exception or to always return a float (i.e. even with positive exponents). You have never offered to be consistent with what Python does. This last option would be my favorite. If it cannot be implemented, then I would prefer always float. At least one would be consistent with something, and we would not invent yet another convention.

On 08.10.2016 21:36, Charles R Harris wrote:

On Sat, Oct 8, 2016 at 1:40 PM, V. Armando Sole <sole@esrf.fr> wrote:
Even on Python 2, negative powers gave floats:
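The interpreter snippet that followed appears to have been dropped in archiving; the behavior being referred to is (pure Python, identical on 2.x and 3.x):

```python
# ** with a negative exponent returns a float even for int operands,
# on Python 2 as well as Python 3:
print(2 ** -2)        # 0.25
print(type(2 ** -2))  # <class 'float'>
```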
Numpy tries to be consistent with Python when it makes sense, but this is only one of several considerations. The use cases for numpy objects are different from the use cases for Python scalar objects, so we also consistently deviate in cases when that makes sense -- e.g., numpy bools are very different from Python bools (Python barely distinguishes between bools and integers, because they don't need to; indexing makes the distinction much more important to numpy), numpy integers are very different from Python integers (Python's arbitrary-width integers provide great semantics, but don't play nicely with large fixed-size arrays), numpy pays much more attention to type consistency between inputs and outputs than Python does (again because of the extra constraints imposed by working with memory-intensive type-consistent arrays), etc.

For Python, 2 ** 2 -> int, 2 ** -2 -> float. But numpy can't do this, because then 2 ** np.array([2, -2]) would have to be both int *and* float, which it can't be. Not a problem that Python has. Or we could say that the output is int if all the inputs are positive, and float if any of them are negative... but then that violates the numpy principle that output dtypes should be determined entirely by input dtypes, without peeking at the actual values. (And this rule is very important for avoiding nasty surprises when you run your code on new inputs.)

And then there's backwards compatibility to consider. As mentioned, we *could* deviate from Python by making ** always return float... but this would almost certainly break tons and tons of people's code that is currently doing integer ** positive integer and expecting to get an integer back. Which is something we don't do without very careful weighing of the trade-offs, and my intuition is that this one is so disruptive we probably can't pull it off. Breaking working code needs a *very* compelling reason. -n -- Nathaniel J. Smith -- https://vorpus.org
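The "output dtypes determined by input dtypes, not values" principle can be checked directly with np.result_type, which computes a result dtype from operand dtypes alone:

```python
import numpy as np

# The promoted dtype depends only on the operand dtypes,
# never on the values stored in the arrays:
assert np.result_type(np.int64, np.int64) == np.int64
assert np.result_type(np.int64, np.float64) == np.float64

# Hence 2 ** np.array([2, -2]) must come back with one dtype; it
# cannot be int for the first element and float for the second.
print(np.result_type(np.int64, np.int64))  # int64
```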

but then that violates the numpy
At division you get back an array of floats.
Why is it different if you calculate the power of something?
This is valid reasoning. But it could be solved by raising an exception to warn users of the new behaviour.

On Sat, Oct 8, 2016 at 3:18 PM, Krisztián Horváth <raksi.raksi@gmail.com> wrote:
The difference is that Python division always returns float. Python int ** int sometimes returns int and sometimes returns float, depending on which particular integers are used. We can't be consistent with Python because Python isn't consistent with itself.
That is generally the best conservative strategy for making a backwards incompatible change like this: instead of going straight to the new behavior, first make it raise an error, and then once people have had time to stop depending on the old behavior, then you can add the new behavior. But in this case if we were going to make int ** int return float, this rule would mean that we have to make int ** int always raise an error for a few years, i.e. remove integer power support from numpy altogether. That's a non-starter. -n -- Nathaniel J. Smith -- https://vorpus.org

On Fr, 2016-10-07 at 19:12 -0600, Charles R Harris wrote:
For what it's worth, I still feel the only real option is probably to go with an error; changing to float may have weird effects. Which does not mean it is impossible, I admit, though I would like some data on how downstream would handle it. Also, would we need an int power? The fpower seems the more straightforward/common pattern. If errors turned out annoying in some cases, a seterr might be plausible too (as well as a deprecation). - Sebastian

On Sun, Oct 9, 2016 at 6:25 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
I agree with Sebastian and Nathaniel. I don't think we can deviate from the existing behavior (int ** int -> int) without breaking lots of existing code, and if we did, yes, we would need a new integer power function. I think it's better to preserve the existing behavior when it gives sensible results, and error when it doesn't. Adding another function float_power for the case that is currently broken seems like the right way to go.
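That function did land as np.float_power in NumPy 1.12; usage looks like this (assuming NumPy >= 1.12):

```python
import numpy as np

# float_power always computes in (at least) float64, covering the
# int ** negative-int case that the integer power ufunc rejects:
r = np.float_power(np.arange(1, 5), -1)
print(r.dtype)  # float64
```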

participants (10)
- Alan Isaac
- Charles R Harris
- josef.pktd@gmail.com
- Krisztián Horváth
- Nathaniel Smith
- Ralf Gommers
- Ryan May
- Sebastian Berg
- Stephan Hoyer
- V. Armando Sole