np.where and ZeroDivisionError: float division by zero
In my code, I use the following calculation for a column in the dataframe: `np.where(df_score['number'] != 0, 100 - ((100 * df_score[rank_column] - 50) / df_score['number']), None)`. I have used `df_score['number'] != 0`, but the code still fails with `ZeroDivisionError: float division by zero`, even if I change the condition to `df_score['number'] > 0`. Why? pandas version: 1.1.5, numpy version: 1.24.4. Here are my numbers: 12.00000 12.00000 12.00000 12.00000 12.00000 0.00000 0.00000 0.00000 0.00000 0.00000 12.00000 12.00000 12.00000. I want to know why it went wrong and what should be done to fix it. Thank you for your help.
What you are hoping for here is known as "short circuit" or "lazy" evaluation. Namely, this would work if `np.where(cond, x, y)` only evaluated x where cond is true and only evaluated y where cond is false. In that case, not only could it handle situations where one of the true/false cases "breaks" one of the possible return values, it would also potentially save a lot of work if one of x or y is expensive to evaluate.

This would clearly be the case if there were an explicit branch in the code, i.e. something of the form

    for row in score_df:
        if number[row] == 0:
            return None
        else:
            return 100 - ((100*rank_column[row] - 50)/number[row])

It COULD also be true for a vectorized operation IF it were implemented that way. However, `numpy.where` is NOT implemented like this. Its parameters are ALL numpy arrays themselves, and each is computed in full before `numpy.where` is called. Only then is the Boolean array that represents the outcomes of the conditional "combined" with the other two arrays x and y to produce the result. The division by zero therefore happens while x is being built, regardless of the condition.

What you want is to do something along the lines of fancy indexing, where you evaluate the condition on the array to get an explicit Boolean mask, and then use this to select a "slice" (not a real contiguous slice, but a subset) of rows of the dataframe to pass to the expression that may break for zero values.
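The masking idea described above might look roughly like the following. This is a minimal sketch on plain NumPy arrays, not the original poster's code: the arrays `number` and `rank` are made-up stand-ins for `df_score['number']` and `df_score[rank_column]`, and NaN is used in place of `None` so the result stays a float array.

```python
import numpy as np

# Hypothetical data standing in for df_score['number'] and df_score[rank_column].
number = np.array([12.0, 12.0, 0.0, 0.0, 12.0])
rank = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Evaluate the condition once to get an explicit Boolean mask.
mask = number != 0

# Start from an all-NaN result; NaN plays the role of None in a float array.
result = np.full(number.shape, np.nan)

# Compute the expression only on the rows where the mask is True,
# so the division never sees a zero denominator.
result[mask] = 100 - (100 * rank[mask] - 50) / number[mask]
```

The same Boolean mask should also work for selecting rows of a dataframe (for example through `df.loc`), since the expression is then only ever evaluated on the rows where the denominator is nonzero.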
> What you are hoping for here is known as "short circuit" or "lazy" evaluation.
In SciPy, we have the private utility function `_lazywhere`[1] for this. Cheers, Lucas [1] https://github.com/scipy/scipy/blob/f44326023dc51758495491fc9f06858fd38358a0...
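To illustrate what a lazy `where` could do, here is a small sketch. It is not SciPy's `_lazywhere` and makes no claim about that function's signature or behaviour; the name `lazy_where`, its parameters, and the sample data are all hypothetical.

```python
import numpy as np

def lazy_where(cond, arrays, f, fillvalue=np.nan):
    """Hypothetical helper: evaluate f only on elements where cond is True."""
    # Broadcast the inputs and the condition to a common shape.
    *arrays, cond = np.broadcast_arrays(*arrays, np.asarray(cond, dtype=bool))
    # Pre-fill the output; positions where cond is False keep fillvalue.
    out = np.full(cond.shape, fillvalue, dtype=float)
    # f only ever receives the selected elements.
    out[cond] = f(*(a[cond] for a in arrays))
    return out

# Made-up stand-ins for the two dataframe columns.
number = np.array([12.0, 0.0, 12.0])
rank = np.array([1.0, 2.0, 3.0])

result = lazy_where(
    number != 0,
    (rank, number),
    lambda r, n: 100 - (100 * r - 50) / n,
)
```

Because `f` is passed as a callable rather than as an already-computed array, the division runs only on the masked elements, which is the difference from `np.where`.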
The function `np.where` just chooses elements from two arrays that are both computed before `np.where` is even executed. See this StackOverflow answer if you want to suppress the error: https://stackoverflow.com/a/29950752/4681187
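One way to keep the division from touching the zero entries, offered as a sketch and not necessarily what the linked answer does, is `np.divide` with its `where` and `out` arguments. The example also assumes, without confirmation from the thread, that the column has object dtype: on float64 arrays NumPy would normally emit a RuntimeWarning and return `inf`, whereas object arrays fall back to Python's float division, which raises `ZeroDivisionError: float division by zero`. The data below is made up.

```python
import numpy as np

# Hypothetical data; object dtype mimics a column holding Python floats.
number = np.array([12.0, 0.0, 12.0], dtype=object)
rank = np.array([1.0, 2.0, 3.0], dtype=object)

# Convert to float64 first, so NumPy's own floating-point division is used
# instead of Python's float division on each object element.
number = number.astype(float)
rank = rank.astype(float)

# Divide only where the denominator is nonzero; the other slots keep the NaN
# they were initialised with in `out`.
quotient = np.divide(
    100 * rank - 50,
    number,
    out=np.full(number.shape, np.nan),
    where=number != 0,
)
result = 100 - quotient
```

Checking `df_score['number'].dtype` would show whether the object-dtype assumption applies to the original data.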
participants (4): 840362492@qq.com, Fang Zhang, Lucas Colley, rosko37