type-casting differences for comparisons

I just came across a real head-scratcher that took me a bit to figure out. I don't know if it counts as a bug or not.

I have an array with dtype "f4" and a separate Python float. Some elements of this array get assigned this numpy float64 scalar value. (I know, I would be better off with a mask, but bear with me; this is just demonstration code isolating the core problem from a much more complicated program...)

import numpy as np
a = np.empty((500, 500), dtype='f4')
a[:] = np.random.random(a.shape)
bad_val = 10 * a.max()
b = np.where(a > 0.8, bad_val, a)

Now, the following seems to always evaluate to False, as expected:
np.any(b > bad_val)
but, if I am (un-)lucky enough, this will sometimes evaluate to True:
any([(c > bad_val) for c in b.flat])
It seems to me that for the first comparison, bad_val is cast down to float32 (or maybe b is cast up to float64?), but for the second example the opposite happens. This can lead to some unexpected behavior. Is there some difference between the type-casting of numpy scalars and numpy arrays? I would expect both to behave the same.

Ben Root
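The direction of the rounding is what lets a strict ">" comparison come out True here: storing a float64 value into float32 can round the value *up*, and widening a float32 back to float64 for a scalar-scalar comparison is exact. A minimal sketch of that mechanism (using 0.1, which is not exactly representable in either precision):

import numpy as np

x64 = np.float64(0.1)    # not exactly representable in binary floating point
x32 = np.float32(x64)    # rounding to float32 happens to round *up* here

# Widening float32 -> float64 is exact, so this scalar comparison is done
# at float64 precision, and the rounded-up value is strictly greater.
print(bool(x32 > x64))   # True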

On Thu, Jul 14, 2011 at 15:43, Benjamin Root <ben.root@ou.edu> wrote:
Remember, the rule for ufuncs is that when the operation is array-scalar, the array dtype wins (barring cross-kind types, which aren't relevant here). For array-array and scalar-scalar, the "largest" dtype wins. So for the first case, array-scalar, bad_val gets downcast to float32. For the latter case, bad_val remains float64 and upcasts each c to float64.

Try this:

bad_val = np.float32(10) * a.max()

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
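A minimal sketch of these rules. (Note that NumPy's promotion rules were later revised by NEP 50 in NumPy 2.0, so the numpy-scalar case behaves differently in newer versions; the assertions below hold under both sets of rules.)

import numpy as np

arr = np.ones(3, dtype='f4')

# Array vs. Python scalar: the array dtype wins, so the result stays float32.
res = arr + 1.5
print(res.dtype)                       # float32

# Scalar vs. scalar: the "largest" dtype wins, so the result is float64.
s = np.float32(1.0) + np.float64(1.0)
print(s.dtype)                         # float64

# Array vs. numpy float64 scalar: under the value-based casting described
# above (NumPy 1.x), the float64 scalar is downcast to float32; NumPy 2.0
# (NEP 50) promotes the result to float64 instead.
print((arr + np.float64(1.0)).dtype)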