[Numpy-discussion] nan_to_num and bool arrays

Fri Dec 11 19:06:01 EST 2009

On Fri, Dec 11, 2009 at 17:44, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Fri, Dec 11, 2009 at 16:09, Keith Goodman <kwgoodman at gmail.com> wrote:
>>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern <robert.kern at gmail.com> wrote:
>>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman <kwgoodman at gmail.com> wrote:
>>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>>>>
>>>>>> So I agree that it should leave the input untouched when a non-float
>>>>>> dtype is used for some array-like input.
>>>>>
>>>>> Would only one line need to be changed? Would changing
>>>>>
>>>>> if not issubclass(t, _nx.integer):
>>>>>
>>>>> to
>>>>>
>>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_):
>>>>>
>>>>> do the trick?
>>>>
>>>> That still leaves strings, voids, and objects. I recommend:
>>>>
>>>>  if issubclass(t, _nx.inexact):
>>>>
>>>> Arguably, one should handle nan float objects in object arrays and
>>>> float columns in structured arrays, but the current code does not
>>>> handle either of those anyways.
>>>
>>> Without your change both
>>>
>>>>> np.nan_to_num(np.array([True, False]))
>>>>> np.nan_to_num([1])
>>>
>>> raise exceptions. With your change:
>>>
>>>>> np.nan_to_num(np.array([True, False]))
>>>   array([ True, False], dtype=bool)
>>>>> np.nan_to_num([1])
>>>   array([1])
>>
>> I think this is correct, though the latter one happens by accident.
>> Lists don't have a .dtype attribute so obj2sctype(type([1])) is
>> checked and happens to be object_. The latter line is intended to
>> handle scalars, not sequences. I think that sequences should be
>> coerced to arrays for output and this check should be more explicit
>> about what it handles. [1.0] will have a problem if you don't.
>
> That makes sense. But I'm not smart enough to implement it.

Something like the following at the top should help distinguish the
various cases:

is_scalar = False
if not isinstance(x, _nx.ndarray):
    x = _nx.asarray(x)
    if x.shape == ():
        # Must return this as a scalar later.
        is_scalar = True
old_shape = x.shape
if x.shape == ():
    # We need element access.
    x.shape = (1,)
t = x.dtype.type

This should allow one to pass in [np.inf] and have it correctly get
interpreted as a float array rather than an object scalar.
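
For illustration only, here is a rough, self-contained sketch of how that
preamble might combine with the issubclass(t, _nx.inexact) check suggested
earlier in the thread. The name nan_to_num_sketch is hypothetical and this is
not numpy's actual implementation. It assumes the public numpy namespace
instead of _nx, copies the input rather than reshaping it in place, and
handles only real floating types (complex input is passed through):

```python
import numpy as np


def nan_to_num_sketch(x):
    """Hypothetical stand-in for nan_to_num; not numpy's real code."""
    # Coerce scalars and sequences to an array. Copy so the caller's
    # data is never modified.
    arr = np.array(x, copy=True)
    # A non-array input that becomes 0-d must be returned as a scalar.
    is_scalar = not isinstance(x, np.ndarray) and arr.shape == ()
    old_shape = arr.shape
    # We need element access, so view 0-d input as 1-d.
    y = arr.reshape(-1) if arr.shape == () else arr
    t = y.dtype.type
    # Assumption of this sketch: only real floating types are replaced.
    # Bools, integers, strings, voids and objects are left untouched.
    if issubclass(t, np.floating):
        info = np.finfo(t)
        y[np.isnan(y)] = 0
        y[np.isposinf(y)] = info.max
        y[np.isneginf(y)] = info.min
    y = y.reshape(old_shape)
    return y[()] if is_scalar else y


if __name__ == "__main__":
    print(nan_to_num_sketch(np.array([True, False])))
    print(nan_to_num_sketch([1]))
    print(nan_to_num_sketch([np.inf, -np.inf, np.nan]))
    print(nan_to_num_sketch(np.nan))
```

Under these assumptions, [np.inf] is coerced to a float array and the
infinity is replaced by the largest finite value of that dtype, while bool,
integer and string input is returned as an array with its values unchanged.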

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco