[Numpy-discussion] Why are empty arrays False?

Fri Aug 18 20:12:43 EDT 2017

On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski
<diagonaldevice at gmail.com> wrote:
> Greetings, all.  I am troubled.
>
> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and
> unnecessary. Let's begin with some examples:
>
> >>> bool(np.array(1))
> True
> >>> bool(np.array(0))
> False
> >>> bool(np.array([0, 1]))
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
> >>> bool(np.array([1]))
> True
> >>> bool(np.array([0]))
> False
> >>> bool(np.array([]))
> False
>
> One of these things is not like the other.
>
> The first three results embody a design that is consistent with some of the
> most fundamental design choices in numpy, such as the choice to have
> comparison operators like `==` work elementwise.  And it is the only such
> design I can think of that is consistent in all edge cases. (see footnote 1)
>
> The next two examples (involving arrays of shape (1,)) are a straightforward
> extension of the design to arrays that are isomorphic to scalars.  I can't
> say I recall ever finding a use for this feature... but it seems fairly
> harmless.
>
> So how about that last example, with array([])?  Well... it's /kind of/ like
> how other python containers work, right? Falseness is emptiness (see
> footnote 2)...  Except that this is actually *a complete lie*, due to /all
> of the other examples above/!

Yeah, numpy tries to follow Python conventions, except sometimes you
run into these cases where it's trying to simultaneously follow two
incompatible extensions and things get... problematic.
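
To make the clash concrete, here is a rough sketch of the two conventions side by side. It assumes a reasonably recent numpy and deliberately leaves out the empty-array case, since that is the behaviour under discussion; the variable names are only for illustration.

```python
import numpy as np

# Python container convention: truthiness means "non-empty",
# regardless of what the elements are.
list_truth = [bool([]), bool([0]), bool([0, 0])]

# numpy's elementwise convention: a 0-d or single-element array
# takes the truth value of its one element.
array_truth = [bool(np.array(0)), bool(np.array([0])), bool(np.array([1]))]

# With more than one element there is no single answer, so numpy raises.
try:
    bool(np.array([0, 1]))
    ambiguous = False
except ValueError:
    ambiguous = True

print(list_truth)   # [False, True, True]
print(array_truth)  # [False, False, True]
print(ambiguous)    # True
```

Note that `[0]` is truthy as a list but falsy as an array, which is exactly why truthiness cannot be read as "non-empty" for arrays.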

> Here's what I would like to see:
>
> >>> bool(np.array([]))
> ValueError: The truth value of a non-scalar array is ambiguous. Use a.any()
> or a.all()
>
> Why do I care?  Well, I myself wasted an hour barking up the wrong tree
> while debugging some code when it turned out that I was mistakenly using
> truthiness to identify empty arrays. It just so happened that the arrays
> always contained 1 or 0 elements, so it /appeared/ to work except in the
> rare case of array([0]) where things suddenly exploded.

Yeah, we should probably deprecate and remove this (though it will
take some time).

> 2: np.array([[[[]]]]) is also False, which makes this an interesting sort of
> n-dimensional emptiness test; but if that's really what you're looking for,
> you can achieve this much more safely with `np.all(x.shape)` or
> `bool(x.flat)`

x.size is also useful for emptiness checking.
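
A minimal sketch of what I mean, assuming a small helper (`is_empty` is a hypothetical name, not a numpy function) that never touches the array's truth value:

```python
import numpy as np

def is_empty(x):
    """Return True when the array has no elements, whatever its shape."""
    # size is the product of the shape, so any zero-length axis gives 0.
    return np.asarray(x).size == 0

examples = {
    "empty 1-D": np.array([]),
    "empty 4-D": np.array([[[[]]]]),
    "single zero": np.array([0]),
    "single one": np.array([1]),
    "zero and one": np.array([0, 1]),
}

results = {name: is_empty(arr) for name, arr in examples.items()}
for name, empty in results.items():
    print(f"{name}: empty={empty}")
```

This gives the same answer for array([0]) and array([1]), which is the case that bit you, and it does not raise on arrays with more than one element.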

-n

-- 
Nathaniel J. Smith -- https://vorpus.org