[Numpy-discussion] Why are empty arrays False?
Eric Firing
efiring at hawaii.edu
Fri Aug 18 22:34:02 EDT 2017
On 2017/08/18 11:45 AM, Michael Lamparski wrote:
> Greetings, all. I am troubled.
>
> The TL;DR is that `bool(array([])) is False` is misleading, dangerous,
> and unnecessary. Let's begin with some examples:
>
> >>> bool(np.array(1))
> True
> >>> bool(np.array(0))
> False
> >>> bool(np.array([0, 1]))
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
> >>> bool(np.array([1]))
> True
> >>> bool(np.array([0]))
> False
> >>> bool(np.array([]))
> False
>
> One of these things is not like the other.
>
> The first three results embody a design that is consistent with some of
> the most fundamental design choices in numpy, such as the choice to have
> comparison operators like `==` work elementwise. And it is the only
> such design I can think of that is consistent in all edge cases. (see
> footnote 1)
>
> The next two examples (involving arrays of shape (1,)) are a
> straightforward extension of the design to arrays that are isomorphic to
> scalars. I can't say I recall ever finding a use for this feature...
> but it seems fairly harmless.
>
> So how about that last example, with array([])? Well... it's /kind of/
> like how other Python containers work, right? Falseness is emptiness
> (see footnote 2)... Except that this is actually *a complete lie*, due
> to /all of the other examples above/!
I don't agree. I think the consistency between bool([]) and
bool(array([])) is worth preserving. Nothing you have shown is
inconsistent with "Falseness is emptiness", which is quite fundamental
in Python. The inconsistency is in distinguishing between 1 element and
more than one element. To be consistent, bool(array([0])) and
bool(array([0, 1])) should both be True. Contrary to the ValueError
message, there need be no ambiguity, any more than there is an ambiguity
in bool([1, 2]).
Eric
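A minimal sketch of the two conventions being compared, for reference. Built-in containers take their truth value from their length alone, while a one-element array takes it from the element's value. The helper `is_nonempty` is a hypothetical name introduced here for illustration; it is not a NumPy function.

```python
import numpy as np

# Built-in containers: truthiness depends only on length ("falseness is emptiness").
list_results = {
    "[]": bool([]),          # empty list
    "[0]": bool([0]),        # one element; its value is ignored
    "[0, 1]": bool([0, 1]),  # two elements
}

# One-element array: truthiness follows the element's value, not the length.
one_zero = bool(np.array([0]))


def is_nonempty(a):
    """Hypothetical helper: container-style truthiness for array-likes."""
    return np.asarray(a).size > 0
```

With a helper like this, `is_nonempty([0])` and `is_nonempty([0, 1])` give the same answer, which is the container-style consistency argued for above.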
>
> Here's what I would like to see:
>
> >>> bool(np.array([]))
> ValueError: The truth value of a non-scalar array is ambiguous. Use
> a.any() or a.all()
>
> Why do I care? Well, I myself wasted an hour barking up the wrong tree
> while debugging some code when it turned out that I was mistakenly using
> truthiness to identify empty arrays. It just so happened that the arrays
> always contained 1 or 0 elements, so it /appeared/ to work except in the
> rare case of array([0]) where things suddenly exploded.
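The failure mode described here can be sketched as follows. The variable names are illustrative assumptions, not taken from the original code, and the explicit `size` check is one possible replacement rather than the fix the author actually used.

```python
import numpy as np

# A one-element array whose single value happens to be zero.
hit = np.array([0])

# Truthiness reflects the value 0, so the array looks "empty"...
looks_empty = not hit

# ...while an explicit size check reports that it holds one element.
really_empty = hit.size == 0
```

The two flags disagree for `array([0])`, which is why a truthiness test only appears to work as an emptiness test until a zero shows up.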
>
> I posit that there is no usage of the fact that `bool(array([])) is
> False` in any real-world code which is not accompanied by a horrible bug
> writhing in hiding just beneath the surface. For this reason, I wish to
> see this behavior *abolished*.
>
> Thank you.
> -Michael
>
> Footnotes:
> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would
> just implicitly do `all()`, which would make `if a == b:` work like it
> does for virtually every other reasonably-designed type in existence.
> But then I recall that, if this were done, then the behavior of `if a !=
> b:` would stand out like a sore thumb instead. Truly, punting on
> 'any/all' was the right choice.
>
> 2: np.array([[[[]]]]) is also False, which makes this an interesting
> sort of n-dimensional emptiness test; but if that's really what you're
> looking for, you can achieve this much more safely with
> `np.all(x.shape)` or `bool(x.flat)`
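A short sketch of an explicit n-dimensional emptiness test that does not rely on truthiness. `np.all(x.shape)` is the expression from footnote 2; the `x.size == 0` check and the helper name `is_empty` are assumptions added here for illustration.

```python
import numpy as np


def is_empty(x):
    """Hypothetical helper: True if the array holds no elements at all."""
    return x.size == 0


x = np.array([[[[]]]])  # nested empty lists give shape (1, 1, 1, 0)
y = np.zeros((2, 3))    # six elements

# np.all(x.shape) is true only when every dimension is non-zero,
# so it is the negation of the emptiness test.
x_has_elements = bool(np.all(x.shape))
y_has_elements = bool(np.all(y.shape))
```

Both forms work from the shape alone, so they behave the same way regardless of the values stored in the array.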