[Numpy-discussion] Boolean binary '-' operator

Nathaniel Smith njs at pobox.com
Mon Jun 26 19:25:04 EDT 2017


On Sun, Jun 25, 2017 at 9:45 AM, Stefan van der Walt
<stefanv at berkeley.edu> wrote:
> Hi Chuck
>
> On Sun, Jun 25, 2017, at 09:32, Charles R Harris wrote:
>
>> The boolean binary '-' operator was deprecated back in NumPy 1.9 and changed
>> to an error in 1.13. This caused a number of failures in downstream
>> projects. The choices now are to continue the deprecation for another couple
>> of releases, or simply give up on the change. For booleans, `a - b` was
>> implemented as `a xor b`, which leads to the somewhat unexpected identity `a
>> - b == b - a`, but it is a handy operator that allows simplification of some
>> functions, `numpy.diff` among them. At this point I'm inclined to give up
>> on the deprecation and retain the old behavior. It is a bit impure but
>> perhaps we can consider it a feature rather than a bug.
>
>
> What was the original motivation behind the deprecation?  `xor` seems like
> exactly what one would expect when subtracting boolean arrays.
>
> But, in principle, I'm not against the deprecation (we've had to fix a few
> problems that arose in skimage, but nothing big).

I believe that this happened as part of a review of the whole
arithmetic system for np.bool_. Traditionally, + means "or",
binary - means "xor", and unary - means "not".

Here are some identities you might expect, if 'a' and 'b' are np.bool_ objects:

a - b = a + (-b)
a + b - b = a
bool(a + b) = bool(a) + bool(b)
bool(a - b) = bool(a) - bool(b)
bool(-a) = -bool(a)

But in fact none of these identities hold. Furthermore, the np.bool_
arithmetic operations are all confusing synonyms for operations that
could be written more clearly using the proper boolean operators |, ^,
~, so they violate TOOWTDI. So I think the general idea was to
deprecate all of this nonsense.
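
To make the failures concrete, here is a minimal sketch in plain Python. The
class OldBool and the helper failing_identities are hypothetical: they only
model the traditional semantics described above (+ as "or", binary - as "xor",
unary - as "not") and are not NumPy's implementation. The right-hand sides of
the last three identities use the builtin bool, whose arithmetic is ordinary
integer arithmetic.

```python
from itertools import product


class OldBool:
    """Hypothetical stand-in modelling the traditional np.bool_ arithmetic."""

    def __init__(self, value):
        self.value = bool(value)

    def __add__(self, other):  # + is "or"
        return OldBool(self.value or other.value)

    def __sub__(self, other):  # binary - is "xor"
        return OldBool(self.value != other.value)

    def __neg__(self):  # unary - is "not"
        return OldBool(not self.value)

    def __bool__(self):
        return self.value

    def __eq__(self, other):
        return self.value == other.value


def failing_identities():
    """Return the identities that fail for at least one pair (a, b)."""
    checks = {
        "a - b = a + (-b)": lambda a, b: (a - b) == (a + (-b)),
        "a + b - b = a": lambda a, b: (a + b - b) == a,
        "bool(a + b) = bool(a) + bool(b)":
            lambda a, b: bool(a + b) == bool(a) + bool(b),
        "bool(a - b) = bool(a) - bool(b)":
            lambda a, b: bool(a - b) == bool(a) - bool(b),
        "bool(-a) = -bool(a)": lambda a, b: bool(-a) == -bool(a),
    }
    pairs = list(product((False, True), repeat=2))
    return [
        name
        for name, check in checks.items()
        if any(not check(OldBool(x), OldBool(y)) for x, y in pairs)
    ]


for name in failing_identities():
    print("fails:", name)
```

Under this model every one of the five identities has at least one
counterexample; for instance a = b = False breaks the first one, because
"xor" gives False while "or not" gives True.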

It looks like what actually happened is that binary - and unary - got
deprecated a while back and are now raising errors in 1.13.0, but +
did not. This is sort of unfortunate, because binary - is the only one
of these that's somewhat defensible (it doesn't match the builtin bool
type, but it does at least correspond to subtraction in Z/2, so
identities like 'a - (b - b) = a' do hold).
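
The Z/2 correspondence can be checked exhaustively. This is a small sketch in
plain Python using the integers 0 and 1; the helper name sub_mod2 is made up
for the illustration.

```python
from itertools import product


def sub_mod2(a, b):
    """Subtraction in Z/2, with elements represented as the integers 0 and 1."""
    return (a - b) % 2


for a, b in product((0, 1), repeat=2):
    # Subtraction mod 2 is the same operation as xor.
    assert sub_mod2(a, b) == a ^ b
    # The identity quoted above holds.
    assert sub_mod2(a, sub_mod2(b, b)) == a
    # Subtraction mod 2 is commutative, hence a - b == b - a.
    assert sub_mod2(a, b) == sub_mod2(b, a)
```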

I guess my preference would be:
1) deprecate +
2) move binary - back to deprecated-but-not-an-error
3) fix np.diff to use logical_xor when the inputs are boolean, since
that seems to be what people expect
4) keep unary - as an error
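
As a rough illustration of item 3, here is what an xor-based first difference
computes, written in plain Python so that it does not depend on any particular
NumPy version. The function bool_diff is hypothetical, not a proposed patch;
for a 1-D boolean array `a` the analogous NumPy expression would be
`np.logical_xor(a[1:], a[:-1])`.

```python
from operator import xor


def bool_diff(values):
    """First difference of a boolean sequence, using xor on neighbours."""
    values = [bool(v) for v in values]
    return [xor(later, earlier) for earlier, later in zip(values, values[1:])]


# True marks every position where the value changes.
print(bool_diff([False, True, True, False, False]))
# prints [True, False, True, False]
```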

And if we want to be less aggressive, then a reasonable alternative would be:
1) deprecate +
2) un-deprecate binary -
3) keep unary - as an error

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

