Re: [Numpy-discussion] Compare NumPy arrays with threshold
Hi again,

Thanks for the responses to my question! Robert's answer worked very well for me, except for one small issue. This line:

    close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)

returns each difference twice: once for j compared to i, and once for i compared to j. For example, for this input:

    MatA = [[10,20,30],[40,50,60]]
    MatB = [[10,30,30],[40,50,160]]

My old code returns:

    0,1,20,30
    1,3,60,160

Your code returns:

    0,1,20,30
    1,3,60,160
    0,1,30,20
    1,3,160,60

I could simply cut "close_mask" in half so that I get only one iteration, but that does not seem efficient. Any ideas?

Also, what should I change to support 3D arrays as well?

Thanks again,
Nissim.

-----Original Message-----
From: NumPy-Discussion [mailto:numpy-discussion-bounces+nissimd=elspecltd.com@python.org] On Behalf Of numpy-discussion-request@python.org
Sent: Wednesday, May 17, 2017 8:17 PM
To: numpy-discussion@python.org
Subject: NumPy-Discussion Digest, Vol 128, Issue 18

Today's Topics:

1. Compare NumPy arrays with threshold and return the differences (Nissim Derdiger)
2. Re: Compare NumPy arrays with threshold and return the differences (Paul Hobson)
3. Re: Compare NumPy arrays with threshold and return the differences (Robert Kern)

Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
From: Nissim Derdiger <NissimD@elspecltd.com>
To: "numpy-discussion@python.org" <numpy-discussion@python.org>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and return the differences
Message-ID: <9EFE3345170EF24DB67C61C1B05EEEDB4073F384@EX10.Elspec.local>

Hi,

In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold. The comparison itself can easily be done with the "allclose" function, like this:

    Threshold = 0.1
    if (np.allclose(Arr1, Arr2, Threshold, equal_nan=True)):
        print('Same')

But this comparison does not return which cells are not the same.

The easiest (yet naive) way to find out which cells are not the same is to use simple for loops, like this:

    def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
        if not Arr1.shape == Arr2.shape:
            return ['Arrays size not the same']
        Dimensions = Arr1.shape
        Diff = []
        for i in range(Dimensions[0]):
            for j in range(Dimensions[1]):
                if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                    Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ',' + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
        return Diff

(and the same for 3D arrays, with one more for loop)

This approach is very slow when the arrays are big and full of non-equal cells.

Is there a fast, straightforward way, in case the arrays are not the same, to get a list of the unequal cells? Maybe some built-in function in NumPy itself?

Thanks!
Nissim
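One detail about the calls above is worth spelling out: the third positional argument of np.allclose and np.isclose is rtol, not atol. The test they apply is |a - b| <= atol + rtol * |b|, so passing Threshold positionally makes the comparison relative to the values in the second array. The sketch below uses made-up sample values (not the data from this thread) to show the difference; setting rtol=0.0 in the second call is an assumption made here to get a purely absolute comparison.

```python
import numpy as np

# Hypothetical sample data, chosen only to make the two behaviours differ.
a = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
b = np.array([[10.0, 21.0, 30.0], [40.0, 50.0, 160.0]])
threshold = 0.1

# Positional: threshold is interpreted as rtol, so the check is
# |a - b| <= 1e-8 + threshold * |b|  (a relative comparison).
rel_close = np.isclose(a, b, threshold, equal_nan=True)

# Keyword: threshold is an absolute tolerance; rtol is zeroed out
# so that only |a - b| <= threshold matters.
abs_close = np.isclose(a, b, rtol=0.0, atol=threshold, equal_nan=True)

print(rel_close)  # 20 vs 21 counts as close: 1 <= 0.1 * 21
print(abs_close)  # 20 vs 21 counts as different: 1 > 0.1
```

Which of the two is wanted depends on whether "threshold" is meant as a fraction of the value or as a fixed amount.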
On Thu, May 18, 2017 at 5:07 AM, Nissim Derdiger <NissimD@elspecltd.com> wrote:
> Hi again, Thanks for the responses to my question! Robert's answer worked very well for me, except for 1 small issue:
> This line: close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True) returns each difference twice: once for j compared to i, and once for i compared to j

No, it returns a boolean array the same size as MatA and MatB. It literally can't contain "each difference twice". Maybe there is something else in your code that is producing the doubling that you see, possibly in the printing of the results.

I'm not seeing the behavior that you speak of. Please post your complete code that produced the doubled output that you see.

    import numpy as np

    MatA = np.array([[10,20,30],[40,50,60]])
    MatB = np.array([[10,30,30],[40,50,160]])
    Threshold = 1.0

    # Note the `atol=` here. I missed it before.
    close_mask = np.isclose(MatA, MatB, atol=Threshold, equal_nan=True)
    far_mask = ~close_mask
    i_idx, j_idx = np.nonzero(far_mask)
    for i, j in zip(i_idx, j_idx):
        print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, MatA[i, j], MatB[i, j], Threshold))

I get the following output:

    $ python isclose.py
    0, 1, 20, 30, 1.0, Fail
    1, 2, 60, 160, 1.0, Fail

--
Robert Kern
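On the 3D question, which is not addressed in the reply above: the same mask-based approach should carry over to any number of dimensions, because np.argwhere returns one row of indices per nonzero element regardless of the array's rank. The following is only a sketch under assumptions: the function name find_far_cells, the returned tuple layout, and the choice of rtol=0.0 (a purely absolute threshold) are illustrative and do not come from the thread.

```python
import numpy as np

def find_far_cells(arr_a, arr_b, threshold):
    """Return (index_tuple, a_value, b_value) for every cell whose values
    differ by more than `threshold`. Hypothetical helper, works for any ndim."""
    arr_a = np.asarray(arr_a)
    arr_b = np.asarray(arr_b)
    if arr_a.shape != arr_b.shape:
        raise ValueError("arrays must have the same shape")
    # Assumption: rtol=0.0 so that only the absolute difference is tested.
    far_mask = ~np.isclose(arr_a, arr_b, rtol=0.0, atol=threshold, equal_nan=True)
    results = []
    # np.argwhere yields one index row per differing cell, in row-major order.
    for idx in np.argwhere(far_mask):
        key = tuple(int(k) for k in idx)
        results.append((key, arr_a[key].item(), arr_b[key].item()))
    return results

# The 2D data from the thread.
diffs_2d = find_far_cells([[10, 20, 30], [40, 50, 60]],
                          [[10, 30, 30], [40, 50, 160]], 1.0)

# A made-up 3D case with a single differing cell.
a3 = np.zeros((2, 2, 3))
b3 = a3.copy()
b3[1, 0, 2] = 5.0
diffs_3d = find_far_cells(a3, b3, 1.0)

print(diffs_2d)
print(diffs_3d)
```

Each differing cell appears exactly once in the result, since the mask has one entry per cell.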
participants (2)

Nissim Derdiger

Robert Kern