[Numpy-discussion] Why are empty arrays False?

Fri Aug 18 22:34:02 EDT 2017

On 2017/08/18 11:45 AM, Michael Lamparski wrote:
> Greetings, all.  I am troubled.
> 
> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, 
> and unnecessary. Let's begin with some examples:
> 
>  >>> bool(np.array(1))
> True
>  >>> bool(np.array(0))
> False
>  >>> bool(np.array([0, 1]))
> ValueError: The truth value of an array with more than one element is 
> ambiguous. Use a.any() or a.all()
>  >>> bool(np.array([1]))
> True
>  >>> bool(np.array([0]))
> False
>  >>> bool(np.array([]))
> False
> 
> One of these things is not like the other.
> 
> The first three results embody a design that is consistent with some of 
> the most fundamental design choices in NumPy, such as the choice to have 
> comparison operators like `==` work elementwise.  And it is the only 
> such design I can think of that is consistent in all edge cases. (see 
> footnote 1)
> 
> The next two examples (involving arrays of shape (1,)) are a 
> straightforward extension of the design to arrays that are isomorphic to 
> scalars.  I can't say I recall ever finding a use for this feature... 
> but it seems fairly harmless.
> 
> So how about that last example, with array([])?  Well... it's /kind of/ 
> like how other Python containers work, right? Falseness is emptiness 
> (see footnote 2)...  Except that this is actually *a complete lie*, due 
> to /all of the other examples above/!

I don't agree.  I think the consistency between bool([]) and 
bool(array([])) is worth preserving.  Nothing you have shown is 
inconsistent with "Falseness is emptiness", which is quite fundamental 
in Python.  The inconsistency is in distinguishing between 1 element and 
more than one element.  To be consistent, bool(array([0])) and 
bool(array([0, 1])) should both be True.  Contrary to the ValueError 
message, there need be no ambiguity, any more than there is an ambiguity 
in bool([1, 2]).
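
A minimal sketch of the behaviours being compared, using only built-in lists and NumPy arrays (the variable names are illustrative). It does not call bool() on an empty array, which is the case under debate:

```python
import numpy as np

# Built-in lists: the truth value depends only on length, never on contents.
list_truth = [bool([]), bool([0]), bool([0, 1])]

# One-element array: the truth value is that of the single element,
# so this is False even though the array is not empty.
single_zero = bool(np.array([0]))

# Multi-element array: NumPy declines to choose and raises ValueError.
try:
    bool(np.array([0, 1]))
    multi_raised = False
except ValueError:
    multi_raised = True
```

Under a pure "falseness is emptiness" rule, the last two cases would both be True, as they are for lists.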

Eric

> 
> Here's what I would like to see:
> 
>  >>> bool(np.array([]))
> ValueError: The truth value of a non-scalar array is ambiguous. Use 
> a.any() or a.all()
> 
> Why do I care?  Well, I myself wasted an hour barking up the wrong tree 
> while debugging some code when it turned out that I was mistakenly using 
> truthiness to identify empty arrays. It just so happened that the arrays 
> always contained 1 or 0 elements, so it /appeared/ to work except in the 
> rare case of array([0]) where things suddenly exploded.
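
The failure mode described above can be sketched roughly as follows. Both function names are hypothetical and are not taken from the code in question; the explicit test assumes the array's `size` attribute is an acceptable way to count elements.

```python
import numpy as np

def has_elements_by_truthiness(a):
    # Fragile: for a one-element array this returns the truth value of
    # the element itself, not whether the array is non-empty.
    return bool(a)

def has_elements(a):
    # Explicit: counts elements and never inspects their values.
    return a.size > 0

hit = np.array([0])   # one element, which happens to be zero

assert has_elements(hit) is True
assert has_elements_by_truthiness(hit) is False   # wrongly reports "empty"
```

With np.array([1]) the two functions agree, which is why the mistake can go unnoticed for a long time.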
> 
> I posit that there is no usage of the fact that `bool(array([])) is 
> False` in any real-world code which is not accompanied by a horrible bug 
> writhing in hiding just beneath the surface. For this reason, I wish to 
> see this behavior *abolished*.
> 
> Thank you.
> -Michael
> 
> Footnotes:
> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would 
> just implicitly do `all()`, which would make `if a == b:` work like it 
> does for virtually every other reasonably-designed type in existence.  
> But then I recall that, if this were done, then the behavior of `if a != 
> b:` would stand out like a sore thumb instead.  Truly, punting on 
> 'any/all' was the right choice.
> 
> 2: bool(np.array([[[[]]]])) is also False, which makes this an interesting 
> sort of n-dimensional emptiness test; but if that's really what you're 
> looking for, you can achieve this much more safely with 
> `np.all(x.shape)` or `bool(x.flat)`, both of which are true only when x 
> has at least one element.
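
As a rough illustration of an n-dimensional emptiness test that does not rely on truthiness, here is a sketch built on the `size` attribute and cross-checked against the shape-based test from footnote 2. The helper name `is_empty` is an assumption made for this example.

```python
import numpy as np

def is_empty(x):
    # An array is empty exactly when at least one axis has length 0,
    # which is the same as having zero elements in total.
    return x.size == 0

nested = np.array([[[[]]]])   # shape (1, 1, 1, 0): no elements
scalar = np.array(0)          # shape (): one element, value 0
zeros = np.zeros((2, 3))      # six elements, all zero

for arr in (nested, scalar, zeros):
    # np.all(arr.shape) is True only when every axis length is non-zero.
    assert is_empty(arr) == (not np.all(arr.shape))
```

Neither test looks at the element values, so arrays full of zeros are still reported as non-empty.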