[Numpy-discussion] Boolean binary '-' operator
Charles R Harris
charlesr.harris at gmail.com
Mon Jun 26 21:56:22 EDT 2017
On Mon, Jun 26, 2017 at 6:14 PM, Juan Nunez-Iglesias <jni.soma at gmail.com> wrote:
> OMG deprecating + would be a nightmare. I can’t even begin to count the
> number of times I’ve used e.g. np.sum(arr == num)… Originally with a dtype
> cast but generally I’ve removed it because it worked.
> … But I just saw the behaviour of `sum` is different from that of adding
> arrays together (where it indeed means `or`), which I agree is confusing.
> As long as the sum and mean behaviours are unchanged, I won’t raise too
> much of a fuss. =P
> Generally, although one might expect xor, what *I* would expect is for the
> behaviour to match the Python bool type, which is not the case right now.
> So my vote would be to modify ***in NumPy 2.0*** the behaviour of + and -
> to match Python’s built-in bool (ie upcasting to int).
> And, in general, I’m in favour of something as foundational as NumPy, in
> version 1.x, to follow semantic versioning and not break APIs until 2.x.
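
A minimal sketch of the difference Juan describes, assuming a NumPy version in which `+` on boolean arrays still means logical or. The array `arr` and the value 2 are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 2])   # illustrative data, not from the thread
mask = arr == 2                   # boolean array

# Reducing with sum upcasts to an integer, so it counts the True entries.
count = np.sum(mask)

# Elementwise '+' on two boolean arrays stays boolean and acts as logical or.
elementwise = mask + mask

# Python's built-in bool upcasts to int instead.
python_bool = True + True

print(count, elementwise.dtype, python_bool)
```

So `np.sum(arr == num)` keeps working as a count regardless of what is decided for the elementwise `+`, as long as the reduction behaviour is left alone.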
> On 27 Jun 2017, 9:25 AM +1000, Nathaniel Smith <njs at pobox.com>, wrote:
> On Sun, Jun 25, 2017 at 9:45 AM, Stefan van der Walt
> <stefanv at berkeley.edu> wrote:
> Hi Chuck
> On Sun, Jun 25, 2017, at 09:32, Charles R Harris wrote:
> The boolean binary '-' operator was deprecated back in NumPy 1.9 and changed
> to an error in 1.13. This caused a number of failures in downstream
> projects. The choices now are to continue the deprecation for another couple
> of releases, or simply give up on the change. For booleans, `a - b` was
> implemented as `a xor b`, which leads to the somewhat unexpected identity
> `a - b == b - a`, but it is a handy operator that allows simplification of
> functions, `numpy.diff` among them. At this point I'm inclined to give up
> on the deprecation and retain the old behavior. It is a bit impure but
> perhaps we can consider it a feature rather than a bug.
> What was the original motivation behind the deprecation? `xor` seems like
> exactly what one would expect when subtracting boolean arrays.
> But, in principle, I'm not against the deprecation (we've had to fix a few
> problems that arose in skimage, but nothing big).
> I believe that this happened as part of a review of the whole
> arithmetic system for np.bool_. Traditionally, we have + is "or",
> binary - is "xor", and unary - is "not".
> Here are some identities you might expect, if 'a' and 'b' are np.bool_
> a - b = a + (-b)
> a + b - b = a
> bool(a + b) = bool(a) + bool(b)
> bool(a - b) = bool(a) - bool(b)
> bool(-a) = -bool(a)
> But in fact none of these identities hold. Furthermore, the np.bool_
> arithmetic operations are all confusing synonyms for operations that
> could be written more clearly using the proper boolean operators |, ^,
> ~, so they violate TOOWTDI. So I think the general idea was to
> deprecate all of this nonsense.
> It looks like what actually happened is that binary - and unary - got
> deprecated a while back and are now raising errors in 1.13.0, but +
> did not. This is sort of unfortunate, because binary - is the only one
> of these that's somewhat defensible (it doesn't match the builtin bool
> type, but it does at least correspond to subtraction in Z/2, so
> identities like 'a - (b - b) = a' do hold).
That's because xor corresponds to addition in Z/2 and every element is its
own additive inverse.
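
As a quick check of the Z/2 point, here is a small sketch that writes the operations with the explicit operators `|` and `^` rather than `+` and `-`, so it does not depend on how any particular NumPy release treats boolean arithmetic. It only enumerates the four possible pairs of `np.bool_` values.

```python
import itertools
import numpy as np

values = [np.bool_(False), np.bool_(True)]

for a, b in itertools.product(values, repeat=2):
    # xor is addition modulo 2
    assert bool(a ^ b) == bool((int(a) + int(b)) % 2)
    # every element is its own additive inverse
    assert not (a ^ a)
    # so 'a - (b - b) = a' holds when '-' means xor
    assert (a ^ (b ^ b)) == a

# 'a + b - b = a' fails when '+' means or and '-' means xor
counterexamples = [
    (bool(a), bool(b))
    for a, b in itertools.product(values, repeat=2)
    if ((a | b) ^ b) != a
]
print(counterexamples)
```

The only counterexample is a = b = True, which is exactly the case where "or" and addition in Z/2 disagree.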
> I guess my preference would be:
> 1) deprecate +
> 2) move binary - back to deprecated-but-not-an-error
> 3) fix np.diff to use logical_xor when the inputs are boolean, since
> that seems to be what people expect
> 4) keep unary - as an error
> And if we want to be less aggressive, then a reasonable alternative would be:
> 1) deprecate +
> 2) un-deprecate binary -
> 3) keep unary - as an error
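
To make item 3 of the first list concrete, here is a rough sketch of a first difference that switches to `np.logical_xor` for boolean input. The helper `diff_last_axis` is hypothetical and only illustrates the idea; it is not the actual `numpy.diff` implementation.

```python
import numpy as np

def diff_last_axis(a):
    """Hypothetical helper: first difference along the last axis."""
    a = np.asarray(a)
    head = a[..., 1:]
    tail = a[..., :-1]
    if a.dtype == np.bool_:
        # For booleans, "difference" means the two neighbours disagree.
        return np.logical_xor(head, tail)
    return head - tail

edges = diff_last_axis(np.array([False, True, True, False, False]))
steps = diff_last_axis(np.array([1, 4, 9]))
print(edges, steps)
```

For boolean input the result marks the positions where the value changes, which seems to be what people expect from a boolean diff.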
Using '+' for 'or' and '*' for 'and' is pretty common and the variation of
'+' for 'xor' was common back in the day because 'and' and 'xor' make
boolean algebra a ring, which appealed to mathematicians as opposed to
everyone else ;) You can see the same progression in measure theory where
eventually intersection and xor (symmetric difference) were replaced with
union and complement. Using '-' for xor is something I hadn't seen outside
of numpy, but I suspect it must be standard somewhere. I would leave '*'
and '+' alone, as the breakage and inconvenience from removing them would