Re: [Numpy-discussion] Compare NumPy arrays with threshold
Hi again,

Thanks for the responses to my question! Robert's answer worked very well for me, except for one small issue.

This line:

    close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)

returns each difference twice: once for j compared to i and once for i compared to j. For example, for this input:

    MatA = [[10,20,30],[40,50,60]]
    MatB = [[10,30,30],[40,50,160]]

My old code returns:

    0,1,20,30
    1,3,60,160

Your code returns:

    0,1,20,30
    1,3,60,160
    0,1,30,20
    1,3,160,60

I can simply cut "close_mask" in half so that I'll have only one iteration, but that does not seem efficient. Any ideas?

Also, what should I change to support 3D arrays as well?

Thanks again,
Nissim.

-----Original Message-----
From: NumPy-Discussion [mailto:numpy-discussion-bounces+nissimd=elspec-ltd.com@python.org] On Behalf Of numpy-discussion-request@python.org
Sent: Wednesday, May 17, 2017 8:17 PM
To: numpy-discussion@python.org
Subject: NumPy-Discussion Digest, Vol 128, Issue 18

Today's Topics:

1. Compare NumPy arrays with threshold and return the differences (Nissim Derdiger)
2. Re: Compare NumPy arrays with threshold and return the differences (Paul Hobson)
3. Re: Compare NumPy arrays with threshold and return the differences (Robert Kern)

----------------------------------------------------------------------

Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
From: Nissim Derdiger <NissimD@elspec-ltd.com>
To: "numpy-discussion@python.org" <numpy-discussion@python.org>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and return the differences
Message-ID: <9EFE3345170EF24DB67C61C1B05EEEDB4073F384@EX10.Elspec.local>
Content-Type: text/plain; charset="us-ascii"

Hi,

In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold. The comparison itself can easily be done with the "allclose" function, like this:

    Threshold = 0.1
    if (np.allclose(Arr1, Arr2, Threshold, equal_nan=True)):
        print('Same')

But this comparison does not return which cells are not the same.

The easiest (yet naive) way to know which cells are not the same is to use simple for loops, like this:

    def CheckWhichCellsAreNotEqualInArrays(Arr1,Arr2,Threshold):
        if not Arr1.shape == Arr2.shape:
            return ['Arrays size not the same']
        Dimensions = Arr1.shape
        Diff = []
        for i in range(Dimensions[0]):
            for j in range(Dimensions[1]):
                if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                    Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ',' + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
        return Diff

(and the same for 3D arrays, with one more for loop)

This way is very slow when the arrays are big and full of non-equal cells. Is there a fast, straightforward way, in case they are not the same, to get a list of the unequal cells? Maybe some built-in function in NumPy itself?

Thanks!
Nissim
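A note on the calls quoted above: the third positional parameter of both `np.allclose` and `np.isclose` is `rtol` (relative tolerance), so a threshold passed positionally is scaled by the magnitude of the second array rather than used as a fixed cutoff. The following is a minimal sketch of the difference; the sample values are made up for illustration, and setting `rtol=0.0` is an assumption about wanting a purely absolute threshold.

```python
import numpy as np

# Illustrative values only: every pair differs by exactly 5.
a = np.array([100.0, 1000.0])
b = np.array([105.0, 1005.0])
threshold = 1.0

# Positional third argument is rtol: the test is |a - b| <= atol + rtol * |b|,
# so with rtol=1.0 a gap of 5 is accepted for values this large.
relative = np.isclose(a, b, threshold)

# Keyword atol with rtol disabled: the test is |a - b| <= threshold.
absolute = np.isclose(a, b, rtol=0.0, atol=threshold)

print(relative)  # [ True  True]
print(absolute)  # [False False]
```

If `atol=` is given while `rtol` is left at its default of 1e-05, the effective tolerance is still slightly larger than the threshold for large values.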
On Thu, May 18, 2017 at 5:07 AM, Nissim Derdiger <NissimD@elspec-ltd.com> wrote:
> Hi again,
> Thanks for the responses to my question! Robert's answer worked very well for me, except for one small issue:
>
> This line: close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True) returns each difference twice: once for j compared to i and once for i compared to j

No, it returns a boolean array the same size as MatA and MatB. It literally can't contain "each difference twice". Maybe there is something else in your code that is producing the doubling that you see, possibly in the printing of the results.

I'm not seeing the behavior that you speak of. Please post your complete code that produced the doubled output that you see.

    import numpy as np

    MatA = np.array([[10,20,30],[40,50,60]])
    MatB = np.array([[10,30,30],[40,50,160]])
    Threshold = 1.0

    # Note the `atol=` here. I missed it before.
    close_mask = np.isclose(MatA, MatB, atol=Threshold, equal_nan=True)
    far_mask = ~close_mask
    i_idx, j_idx = np.nonzero(far_mask)
    for i, j in zip(i_idx, j_idx):
        print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, MatA[i, j], MatB[i, j], Threshold))

I get the following output:

    $ python isclose.py
    0, 1, 20, 30, 1.0, Fail
    1, 2, 60, 160, 1.0, Fail

--
Robert Kern
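On the open question about 3D arrays: the mask approach does not depend on the number of dimensions, so one possible generalization is to collect the indices with `np.argwhere`, which returns one row of indices per `True` cell. The sketch below is not from the thread; the helper name `find_differences` is hypothetical, and `rtol=0.0` is an assumption that the threshold is meant as a purely absolute one.

```python
import numpy as np

def find_differences(arr_a, arr_b, threshold):
    """Hypothetical helper: list (index, value_a, value_b) for cells further apart than threshold."""
    arr_a = np.asarray(arr_a)
    arr_b = np.asarray(arr_b)
    if arr_a.shape != arr_b.shape:
        raise ValueError("arrays must have the same shape")
    # Assumption: absolute threshold only, so the relative tolerance is switched off.
    far_mask = ~np.isclose(arr_a, arr_b, rtol=0.0, atol=threshold, equal_nan=True)
    # np.argwhere gives one index row per differing cell, for any number of dimensions.
    result = []
    for idx in np.argwhere(far_mask):
        key = tuple(int(k) for k in idx)
        result.append((key, arr_a[key].item(), arr_b[key].item()))
    return result

# 2D input from the thread.
mat_a = np.array([[10, 20, 30], [40, 50, 60]])
mat_b = np.array([[10, 30, 30], [40, 50, 160]])
for idx, val_a, val_b in find_differences(mat_a, mat_b, 1.0):
    print(idx, val_a, val_b)
# (0, 1) 20 30
# (1, 2) 60 160

# 3D input (made-up data): a single cell differs.
cube_a = np.zeros((2, 2, 2))
cube_b = cube_a.copy()
cube_b[1, 0, 1] = 5.0
print(find_differences(cube_a, cube_b, 1.0))
# [((1, 0, 1), 0.0, 5.0)]
```

Each differing cell appears exactly once, because the mask has one entry per cell of the input arrays.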
participants (2)
- Nissim Derdiger
- Robert Kern