Hi again,
Thanks for the responses to my question!
Robert's answer worked very well for me, except for one small issue:
 
This line:
close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
returns each difference twice: once for j compared to i and once for i compared to j.
 
for example:
 
for this input:
MatA = [[10,20,30],[40,50,60]]
MatB = [[10,30,30],[40,50,160]]
 
My old code returns:
0,1,20,30
1,3,60,160
Your code returns:
0,1,20,30
1,3,60,160
0,1,30,20
1,3,160,60
 
 
I could simply cut "close_mask" in half so that I get only one iteration, but that does not seem efficient.
Any ideas?
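
For reference, here is a minimal sketch that runs the line in question on the inputs above. It assumes Threshold = 0.1 (the value from the original question) and uses np.argwhere to list the flagged cells; both are assumptions, not taken from the code that produced the output above. With these inputs the mask marks each differing cell once (with zero-based indices), so a doubled listing may come from how the indices are looped over afterwards rather than from np.isclose itself.

```python
import numpy as np

# Example inputs from this message; Threshold = 0.1 is assumed from the original question.
MatA = np.array([[10, 20, 30], [40, 50, 60]], dtype=float)
MatB = np.array([[10, 30, 30], [40, 50, 160]], dtype=float)
Threshold = 0.1

# The third positional argument of np.isclose is rtol (relative tolerance).
close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
far_mask = ~close_mask

# np.argwhere returns one row of indices per True cell, so each cell is listed once.
rows = [(int(i), int(j), float(MatA[i, j]), float(MatB[i, j]))
        for i, j in np.argwhere(far_mask)]

for row in rows:
    print("{0},{1},{2},{3}".format(*row))
```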
 
 
 
Also, what should I change to support 3D arrays as well?
 
 
Thanks again,
Nissim.
 
 
 
 
-----Original Message-----
From: NumPy-Discussion [mailto:numpy-discussion-bounces+nissimd=elspec-ltd.com@python.org] On Behalf Of numpy-discussion-request@python.org
Sent: Wednesday, May 17, 2017 8:17 PM
To: numpy-discussion@python.org
Subject: NumPy-Discussion Digest, Vol 128, Issue 18
 
 
Today's Topics:
 
   1. Compare NumPy arrays with threshold and return the
      differences (Nissim Derdiger)
   2. Re: Compare NumPy arrays with threshold and return the
      differences (Paul Hobson)
   3. Re: Compare NumPy arrays with threshold and return the
      differences (Robert Kern)
 
 
----------------------------------------------------------------------
 
Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
From: Nissim Derdiger <NissimD@elspec-ltd.com>
To: "numpy-discussion@python.org" <numpy-discussion@python.org>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and
        return the differences
Message-ID:
        <9EFE3345170EF24DB67C61C1B05EEEDB4073F384@EX10.Elspec.local>
Content-Type: text/plain; charset="us-ascii"
 
Hi,
 
In my script, I need to compare big NumPy arrays (2D or 3D), and return a list of all cells with difference bigger than a defined threshold.
The comparison itself can easily be done with the "allclose" function, like this:
import numpy as np
Threshold = 0.1
if np.allclose(Arr1, Arr2, Threshold, equal_nan=True):
    print('Same')
 
But this comparison does not return which cells differ.
 
The easiest (yet naive) way to find out which cells differ is to use simple for loops, like this:
def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
    # Arrays of different shapes cannot be compared cell by cell.
    if Arr1.shape != Arr2.shape:
        return ['Arrays size not the same']
    Dimensions = Arr1.shape
    Diff = []
    for i in range(Dimensions[0]):
        for j in range(Dimensions[1]):
            # Threshold is passed positionally, so np.allclose uses it as rtol.
            if not np.allclose(Arr1[i, j], Arr2[i, j], Threshold, equal_nan=True):
                Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i, j]) + ','
                            + str(Arr2[i, j]) + ',' + str(Threshold) + ',Fail\n')
    # Return only after every row has been checked.
    return Diff
 
(The same goes for 3D arrays, with one more for loop.) This approach is very slow when the arrays are big and full of non-equal cells.
 
Is there a fast, straightforward way to get a list of the differing cells when the arrays are not the same? Maybe some built-in function in NumPy itself?
Thanks!
Nissim
 
 
 
------------------------------
 
Message: 2
Date: Wed, 17 May 2017 10:13:46 -0700
From: Paul Hobson <pmhobson@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@python.org>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
        and return the differences
Message-ID:
        <CADT3MEABot==+z_iL7qkzim0rDM+0hN4kP4W-veKeoqEW2pDrA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
 
I would do something like:
 
import numpy
# Use the absolute difference so that negative deviations are caught too.
diff_is_large = numpy.abs(array1 - array2) > threshold
index_at_large_diff = numpy.nonzero(diff_is_large)
array1[index_at_large_diff].tolist()
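
A self-contained sketch of this idea follows, using the example values posted elsewhere in this thread. It assumes the absolute difference is what should be compared against the threshold, and the names values1, values2 and positions are illustrative only. Note that this uses an absolute threshold, unlike the relative tolerance of np.allclose, and NaN handling is not covered here.

```python
import numpy

# Example values from this thread; threshold is treated as an absolute tolerance here.
array1 = numpy.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
array2 = numpy.array([[10.0, 30.0, 30.0], [40.0, 50.0, 160.0]])
threshold = 0.1

# Boolean mask that is True where the two arrays differ by more than the threshold.
diff_is_large = numpy.abs(array1 - array2) > threshold

# Tuple of index arrays, one per dimension, for the True cells.
index_at_large_diff = numpy.nonzero(diff_is_large)

# Values from both arrays at those positions, plus the positions themselves.
values1 = array1[index_at_large_diff].tolist()
values2 = array2[index_at_large_diff].tolist()
positions = [tuple(int(k) for k in pos) for pos in zip(*index_at_large_diff)]

print(positions, values1, values2)
```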
 
 
 
------------------------------
 
Message: 3
Date: Wed, 17 May 2017 10:16:09 -0700
From: Robert Kern <robert.kern@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@python.org>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
        and return the differences
Message-ID:
        <CAF6FJisn3Oj18HOOP-DJGOi7rTwr-1U4npef+wCd=ENnMkMFmw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
 
Use `close_mask = np.isclose(Arr1, Arr2, Threshold, equal_nan=True)` to return a boolean mask the same shape as the arrays which is True where the elements are close and False where they are not. You can invert it to get a boolean mask which is True where they are "far" with respect to the threshold: `far_mask = ~close_mask`. Then you can use `i_idx, j_idx = np.nonzero(far_mask)` to get arrays of the `i` and `j` indices where the values are far. For example:
 
for i, j in zip(i_idx, j_idx):
    print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold))
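
The steps above can be sketched as one runnable function that also works for 3D arrays. This is only a sketch under assumptions: the helper name far_cells, the use of np.argwhere instead of unpacking np.nonzero, and the sample arrays are illustrative and not part of the reply above. np.argwhere yields one index row per far cell whatever the number of dimensions, so no extra loop is needed for 3D input.

```python
import numpy as np

def far_cells(arr1, arr2, threshold):
    """Hypothetical helper: list (index..., value1, value2) for cells that are not close."""
    if arr1.shape != arr2.shape:
        raise ValueError("Arrays size not the same")
    # threshold is used as rtol, matching the positional call in the text above.
    close_mask = np.isclose(arr1, arr2, rtol=threshold, equal_nan=True)
    far_mask = ~close_mask
    result = []
    # One row of indices per far cell, for any number of dimensions.
    for idx in np.argwhere(far_mask):
        idx = tuple(int(k) for k in idx)
        result.append(idx + (float(arr1[idx]), float(arr2[idx])))
    return result

# 2D example using the values posted in this thread.
a2 = np.array([[10, 20, 30], [40, 50, 60]], dtype=float)
b2 = np.array([[10, 30, 30], [40, 50, 160]], dtype=float)
print(far_cells(a2, b2, 0.1))

# 3D example with a single differing cell (made-up data).
a3 = np.zeros((2, 2, 2))
b3 = a3.copy()
b3[1, 0, 1] = 5.0
print(far_cells(a3, b3, 0.1))
```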
 
--
Robert Kern
 
------------------------------
 
Subject: Digest Footer
 
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
 
 
------------------------------
 
End of NumPy-Discussion Digest, Vol 128, Issue 18
*************************************************