[Numpy-discussion] Why are empty arrays False?

Nathaniel Smith njs at pobox.com
Sat Aug 19 17:24:26 EDT 2017

On Fri, Aug 18, 2017 at 7:34 PM, Eric Firing <efiring at hawaii.edu> wrote:
> I don't agree.  I think the consistency between bool([]) and bool(array([]))
> is worth preserving.  Nothing you have shown is inconsistent with "Falseness
> is emptiness", which is quite fundamental in Python.  The inconsistency is
> in distinguishing between 1 element and more than one element.  To be
> consistent, bool(array([0])) and bool(array([0, 1])) should both be True.
> Contrary to the ValueError message, there need be no ambiguity, any more
> than there is an ambiguity in bool([1, 2]).

Yeah, this is a mess. But we're definitely not going to make
bool(array([0])) be True. That would break tons of code that currently
relies on the current behavior. And the current behavior does make
sense, in every case except empty arrays: bool broadcasts over the
array, and then, oh shoot, Python requires that bool's return value be
a scalar, so if this results in anything besides an array of size 1,
raise an error.

OTOH you can't really write code that depends on using the current
bool(array([])) semantics for emptiness checking, unless the only two
cases you care about are "empty" and "non-empty with exactly one
element and that element is truthy". So it's much less likely that
changing that will break existing code, plus any code that does break
was already likely broken in subtle ways.

The consistency-with-Python argument cuts two ways: if an array is a
container, then for consistency bool should do emptiness checking. If
an array is a bunch of scalars with broadcasting, then for consistency
bool should do truthiness checking on the individual elements and
raise an error on any array with size != 1. So we can't just rely on
consistency-with-Python to resolve the argument -- we need to pick one
:-). Though internal consistency within numpy would argue for the
latter option, because numpy almost always prefers the bag-of-scalars
semantics over the container semantics, e.g. for + and *, like Eric
Wieser mentioned. Though there are exceptions like iteration.

...Though actually, iteration and indexing by scalars tries to be
consistent with Python in yet a third way. They pretend that an array
is a unidimensional container holding a bunch of arrays:

In [3]: np.array([[1]])[0]
Out[3]: array([1])

In [4]: next(iter(np.array([[1]])))
Out[4]: array([1])

So according to this model, bool(np.array([])) should be False, but
bool(np.array([[]])) should be True (note that with lists, bool([[]])
is True). But alas:

In [5]: bool(np.array([])), bool(np.array([[]]))
Out[5]: (False, False)


Nathaniel J. Smith -- https://vorpus.org

More information about the NumPy-Discussion mailing list