Hi again,
Thanks for the responses to my question!
Robert's answer worked very well for me, except for one small issue.
This line:
    close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
returns each difference twice: once comparing j to i and once comparing i to j.
For example, for this input:
    MatA = [[10,20,30],[40,50,60]]
    MatB = [[10,30,30],[40,50,160]]
My old code returns:
0,1,20,30
1,3,60,160
Your code returns:
0,1,20,30
1,3,60,160
0,1,30,20
1,3,160,60
I could simply cut "close_mask" in half so that each difference is reported only once, but that does not seem efficient.
Any ideas?
Also, what should I change to support 3D arrays as well?
Thanks again,
Nissim.
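
One possible way to handle both points, as a minimal sketch rather than a definitive answer: `np.isclose` produces one boolean per cell, so reading the mask with `np.argwhere` should list each mismatching cell once, and the same code works for 2D, 3D, or any number of dimensions. The helper name `far_cells` and its `atol=0.0` default are assumptions made for this illustration; `Threshold` is passed as `rtol`, as in the line quoted above.

```python
import numpy as np

def far_cells(a, b, rtol, atol=0.0):
    """Return (index_tuple, a_value, b_value) for every cell that is not close."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    # One boolean per cell: True where the values differ beyond the tolerance.
    far_mask = ~np.isclose(a, b, rtol=rtol, atol=atol, equal_nan=True)
    # np.argwhere gives one row of indices per True cell, for any ndim.
    return [
        (tuple(int(k) for k in idx), a[tuple(idx)].item(), b[tuple(idx)].item())
        for idx in np.argwhere(far_mask)
    ]

# The 2D input from the message above.
MatA = [[10, 20, 30], [40, 50, 60]]
MatB = [[10, 30, 30], [40, 50, 160]]
result_2d = far_cells(MatA, MatB, rtol=0.1)

# A 3D case with a single differing cell.
a3 = np.zeros((2, 2, 2))
b3 = a3.copy()
b3[1, 0, 1] = 5.0
result_3d = far_cells(a3, b3, rtol=0.1)

for index, va, vb in result_2d + result_3d:
    print(index, va, vb)
```

Indices here are zero-based, so the 60 vs. 160 mismatch is reported at (1, 2). If duplicates still show up in a real script, they probably come from how the mask is read afterwards rather than from `np.isclose` itself.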
Today's Topics:
1. Compare NumPy arrays with threshold and return the
differences (Nissim Derdiger)
2. Re: Compare NumPy arrays with threshold and return the
differences (Paul Hobson)
3. Re: Compare NumPy arrays with threshold and return the
differences (Robert Kern)
----------------------------------------------------------------------
Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and
return the differences
Hi,
In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold.
The comparison itself can easily be done with the "allclose" function, like this:
    Threshold = 0.1
    if np.allclose(Arr1, Arr2, Threshold, equal_nan=True):
        print('Same')
But this comparison does not return which cells are not the same.
The easiest (yet naive) way to find out which cells differ is a simple nested for loop like this one:
    def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
        if not Arr1.shape == Arr2.shape:
            return ['Arrays size not the same']
        Dimensions = Arr1.shape
        Diff = []
        for i in range(Dimensions[0]):
            for j in range(Dimensions[1]):
                if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                    Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ','
                                + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
        return Diff
(The same goes for 3D arrays, with one more for loop.) This approach is very slow when the arrays are big and full of non-equal cells.
Is there a fast, straightforward way to get a list of the unequal cells when the arrays are not the same? Maybe some built-in function in NumPy itself?
Thanks!
Nissim
------------------------------
Message: 2
Date: Wed, 17 May 2017 10:13:46 -0700
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
and return the differences
I would do something like:
    diff_is_large = (array1 - array2) > threshold
    index_at_large_diff = numpy.nonzero(diff_is_large)
    array1[index_at_large_diff].tolist()
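
A runnable variant of this idea might look like the sketch below. It is an illustration under assumptions, not necessarily what was intended above: the sample arrays and the absolute threshold of 5.0 are made up, and `np.abs` is added so that differences in either direction are caught, since the signed comparison only flags cells where `array1` is larger.

```python
import numpy as np

array1 = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
array2 = np.array([[10.0, 30.0, 30.0], [40.0, 50.0, 160.0]])
threshold = 5.0  # absolute threshold, chosen only for illustration

# The signed difference misses cells where array2 is the larger one.
signed_hits = int(((array1 - array2) > threshold).sum())

# Taking the absolute value catches differences in both directions.
diff_is_large = np.abs(array1 - array2) > threshold
index_at_large_diff = np.nonzero(diff_is_large)  # tuple of index arrays, one per axis

values1 = array1[index_at_large_diff].tolist()
values2 = array2[index_at_large_diff].tolist()
rows, cols = (idx.tolist() for idx in index_at_large_diff)

for r, c, v1, v2 in zip(rows, cols, values1, values2):
    print(r, c, v1, v2)
```

Note that comparisons involving NaN evaluate to False, so a cell that is NaN in only one of the arrays would not be flagged by this approach.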
------------------------------
Message: 3
Date: Wed, 17 May 2017 10:16:09 -0700
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
and return the differences
Use `close_mask = np.isclose(Arr1, Arr2, Threshold, equal_nan=True)` to return a boolean mask the same shape as the arrays which is True where the elements are close and False where they are not. You can invert it to get a boolean mask which is True where they are "far" with respect to the threshold: `far_mask = ~close_mask`. Then you can use `i_idx, j_idx = np.nonzero(far_mask)` to get arrays of the `i` and `j` indices where the values are far. For example:
    for i, j in zip(i_idx, j_idx):
        print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold))
--
Robert Kern
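
Put together, the steps above could be sketched as follows. The sample arrays are invented for illustration. One detail worth keeping in mind: the third positional argument of `np.isclose` is `rtol`, and the test is `|a - b| <= atol + rtol * |b|`, so the result can depend on which array is passed second.

```python
import numpy as np

Arr1 = np.array([[1.0, 2.0, np.nan], [4.0, 5.0, 6.0]])
Arr2 = np.array([[1.0, 2.5, np.nan], [4.0, 5.0, 6.1]])
Threshold = 0.1  # used as rtol, the relative tolerance

# True where the elements are close (NaN matches NaN because equal_nan=True).
close_mask = np.isclose(Arr1, Arr2, rtol=Threshold, equal_nan=True)
far_mask = ~close_mask
i_idx, j_idx = np.nonzero(far_mask)

lines = [
    "{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold)
    for i, j in zip(i_idx, j_idx)
]
for line in lines:
    print(line)

# The relative tolerance is scaled by the second argument, so order matters.
forward = bool(np.isclose(100.0, 111.0, rtol=0.1, atol=0.0))   # 11 <= 0.1 * 111
backward = bool(np.isclose(111.0, 100.0, rtol=0.1, atol=0.0))  # 11 <= 0.1 * 100
```

If a symmetric criterion is wanted, an explicit test such as `np.abs(Arr1 - Arr2) > Threshold` with an absolute threshold is one alternative.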
------------------------------
End of NumPy-Discussion Digest, Vol 128, Issue 18
*************************************************