Changing the meaning of bool.__invert__

Hello, Booleans currently have reasonable overrides for the bitwise binary operators:
However, the same cannot be said of bitwise unary complement, which returns rather useless integer values:
Numpy's boolean type does the more useful (and more expected) thing:
    >>> ~np.bool_(True)
    False
How about changing the behaviour of bool.__invert__ to make it in line with the Numpy boolean? (i.e. bool.__invert__ == operator.not_) Regards Antoine.
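The behaviour described in this message can be checked with a few lines. This is a minimal sketch, not part of the original post. It calls `int.__invert__` directly, on the assumption that this is the complement bool inherited when the thread was written; a bare `~True` may be treated differently (for example with a warning) by newer interpreters.

```python
import operator

# Binary bitwise operators are overridden on bool and stay within bool.
and_result = True & False
or_result = True | False
xor_result = True ^ True

# Complement as inherited from int: the results are plain integers.
invert_true = int.__invert__(True)
invert_false = int.__invert__(False)

# Both integers are non-zero, so both are truthy.
both_truthy = bool(invert_true) and bool(invert_false)

# The proposal would make ~b behave like operator.not_(b) for bools.
proposed_true = operator.not_(True)
proposed_false = operator.not_(False)
```

With the inherited method, both complements are truthy integers, which is the "rather useless" result the message refers to.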

On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
All those are consistent with bool being a subclass of int, in the sense that (for example) `int(True | False)` is identical to `int(True) | int(False)`. Redefining ~True to be False wouldn't preserve that: int(~True) == ~int(True) would become invalid. In short, the proposal would break the Liskov Substitution Principle. (The obvious fix is of course to make True have value -1 rather than 1. Then everything's consistent. No, I'm not seriously suggesting this - the amount of breakage would be insane.) NumPy has the luxury that numpy.bool_ is *not* a subclass of any integer type. Mark
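The consistency property described above can be written as a small check. This is a sketch only; `inherited` and `proposed` are illustrative names, with `operator.not_` standing in for the proposed behaviour.

```python
import operator

BOOLS = (False, True)
BINARY_OPS = (operator.and_, operator.or_, operator.xor)


def binary_ops_commute_with_int():
    """Check int(a op b) == int(a) op int(b) for every pair of bools."""
    return all(
        int(op(a, b)) == op(int(a), int(b))
        for op in BINARY_OPS
        for a in BOOLS
        for b in BOOLS
    )


def invert_commutes_with_int(invert):
    """Check int(invert(b)) == ~int(b) for both bools."""
    return all(int(invert(b)) == ~int(b) for b in BOOLS)


# The complement bool inherited from int when this thread was written.
inherited = int.__invert__
# Hypothetical stand-in for the proposal (~b meaning "not b").
proposed = operator.not_

binary_consistent = binary_ops_commute_with_int()
inherited_consistent = invert_commutes_with_int(inherited)
proposed_consistent = invert_commutes_with_int(proposed)
```

Only the last check fails, which is the substitution problem the message points at.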

On Thu, 7 Apr 2016 09:00:36 +0100 Mark Dickinson <dickinsm@gmail.com> wrote:
However, the return type (bool in one place, int in another one) is inconsistent, and the user-visible semantics are confusing... Apparently someone went to the trouble of overriding __and__, __or__ and __xor__ for booleans, which is why it looks unexpected to leave __invert__ alone. Regards Antoine.

On Thu, Apr 07, 2016 at 09:46:18AM +0200, Antoine Pitrou wrote:
Substitute 1 for True and 0 for False, and these results are exactly the same as the bitwise operations on ints. And that works since True is defined to equal 1 and False to equal 0.
Substitute 0 for False and 1 for True, and you get exactly the same results. What else did you expect from bitwise-not?
Expected by whom? I wouldn't expect bitwise-not to be the same as binary not. If I want binary not, I'll spell it `not`.
Why? What problem does this solve? We already have a perfectly good way of spelling binary not, why break backwards compatibility to get a second way to spell it?
It also breaks a fundamental property of most mathematical relations: if a == b, then f(a) == f(b) (assuming f(a) and f(b) are defined for the type of both a and b). That is currently true for bools:
    py> (True == 1) and (~True == ~1)
    True
    py> (False==0) and (~False == ~0)
    True
You want ~b to return `not b`:
    py> (True == 1) and (False == ~1)
    False
    py> (False==0) and (True == ~0)
    False
I see no upside and a serious downside to this proposal.
-- Steve

Steven D'Aprano <steve@...> writes:
By anyone who takes booleans at face value (that is, takes booleans as representing a truth value and expects operations on booleans to reflect the semantics of useful operations on truth values, not some arbitrary side-effect of the internal representation of a boolean...). But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for.... Regards Antoine.

On Thu, Apr 07, 2016 at 01:17:57PM +0000, Antoine Pitrou wrote:
Bools in Python have *always* been integers, so who are these people taking booleans at face value? Beginners? If so, say so. That's a motive I can understand. But I think that people who expect bools to be real truth values, like in Pascal, probably won't expect BITWISE operations to operate on them at all and will use the BOOLEAN operators and, or, not. Bitwise operators operate on a sequence of bits, not a single truth value. What do these naive "bools are truth values" people think: True << 3 should return? I don't think we need a second way to spell "not bool". Bools have always been ints in Python, and apart from their fancy string representation they behave like ints. I don't think it helps to make ~ a special case where they don't.
But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for....
Thanks for your feedback, I'll give it the due consideration it deserves. -- Steve

On 7 April 2016 at 08:46, Antoine Pitrou <solipsis@pitrou.net> wrote:
This is all just a consequence of the (unfortunate IMO) design that treats True and False as being equivalent to 1 and 0.
This is a consequence of another unfortunate design by numpy. The reason for this is that numpy uses Python's bitwise operators to do element-wise logical operations. This is because it is not possible to overload Python's 'and', 'or', and 'not' operators. So if I write:
The problem is that Python tries to shortcut this expression and so it calls bool(1 < a):
Returning a non-bool from __bool__ is prohibited which implicitly prevents numpy from overloading an expression like `not a`:
Because of this numpy uses &,|,~ in place of and,or,not for numpy arrays and by extension this occurs for numpy scalars as you showed. I think the numpy project really wanted to use the normal Python operators but no mechanism was available to do it. It would have been nice to use &,|,~ for genuine bitwise operations (on e.g. unsigned int arrays). It also means that chained relations don't work e.g.:
The recommended way is (1 < a) & (a < 4) which is not as nice and requires factoring a out if a is actually an expression like sin(x)**2 or something. -- Oscar
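The mechanism described in this message can be reproduced without numpy. The `Vec` class below is a hypothetical stand-in for an array type, written only for illustration; it is not numpy's implementation.

```python
class Vec:
    """Hypothetical element-wise container; not numpy's implementation."""

    def __init__(self, items):
        self.items = list(items)

    # Comparisons are element-wise and return a Vec of bools.
    def __lt__(self, other):
        return Vec(x < other for x in self.items)

    def __gt__(self, other):
        return Vec(x > other for x in self.items)

    # & and ~ can be overloaded, so they carry the element-wise logic.
    def __and__(self, other):
        return Vec(x and y for x, y in zip(self.items, other.items))

    def __invert__(self):
        return Vec(not x for x in self.items)

    # 'and', 'or', 'not' and chained comparisons all go through __bool__,
    # which has to return a real bool, so this type refuses instead.
    def __bool__(self):
        raise ValueError("truth value of a Vec is ambiguous")


a = Vec([1, 2, 5])

in_range = (1 < a) & (a < 4)   # 1 < a falls back to a.__gt__(1)
outside = ~in_range

try:
    1 < a < 4                  # evaluated as (1 < a) and (a < 4)
    chained_ok = True
except ValueError:
    chained_ok = False
```

The chained comparison fails because `and` asks the intermediate result for a single truth value, while the `&` spelling never does.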

On 2016-04-07 13:19, Oscar Benjamin wrote:
On 7 April 2016 at 08:46, Antoine Pitrou <solipsis@pitrou.net> wrote:
This is not correct. & | ^ ~ are all genuine bitwise operations on numpy arrays.
What you are seeing with bool arrays is that bool arrays are *not* just uint8 arrays. Each element happens to take up a single 8-bit byte, but only one of those bits contributes to its value; the other 7 bits are mere padding. The bitwise & | ^ ~ operators all work on that single bit correctly. They do not operate on the padding bits as they are not part of the bool's value. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On 04/07/2016 12:46 AM, Antoine Pitrou wrote:
No. bool is a subclass of int, and changing that now would be a serious breach of backward-compatibility, not to mention breaking existing code for no good reason. Anyone who wants to can create their own Boolean class that doesn't subclass int and then declare the behaviour they want. If bool had been its own thing from the start this wouldn't have been a problem, but it is far too late to change that now. You would be better off suggesting a new Logical type instead (it could even support unknown values).
The results of __and__, __or__, and __xor__ are within bool's domain (right word?), so keeping them in the bool subtype makes sense; the result of __invert__ is not.
But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for....
If you aren't going to be civil, don't bother coming back. -- ~Ethan~

Honestly I think that the OP has a point, and I don't think we have to bend over backwards to preserve int compatibility. After all str(True) != str(1), and surely there are other examples. -- --Guido van Rossum (python.org/~guido)

On 04/07/2016 09:38 AM, Guido van Rossum wrote:
I think the str() of a value, while possibly being the most interesting piece of information (IntEnum, anyone?), is hardly the most intrinsic. If we do make this change, besides needing a couple major versions to make it happen, will anything else be different?
- no longer subclass int?
- add an "unknown" value?
- how will indexing work?
- or any of the other operations?
- don't bother with any of the other mathematical operations?
- counting True's is not the same as adding True's
I'm not firmly opposed, I just don't see a major issue here -- I've needed an Unknown value far more often than I've needed ~True to be False. -- ~Ethan~

On Apr 7, 2016, at 6:23 PM, Guido van Rossum <guido@python.org> wrote:
Nothing else is on the table. Seriously. Stop hijacking the thread.
To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?

On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike@selik.org> wrote:
Yes. To be more precise, there are some "arithmetic" operations (+, -, *, /, **) and they all treat bools as ints and always return ints; there are also some "bitwise" operations (&, |, ^, ~) and they should all treat bools as bools and return a bool. Currently the only exception to this idea is that ~ returns an int, so the proposal is to fix that. (There are also some "boolean" operations (and, or, not) and they are also unchanged.) -- --Guido van Rossum (python.org/~guido)
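The three groups of operations can be tabulated by the type they return. This is a rough sketch; the `~t` entry uses `int.__invert__`, assumed here to be the complement bool inherited at the time, so that the table reflects the behaviour under discussion.

```python
def result_types():
    """Map a few expressions on bools to the name of the type they return."""
    t, f = True, False
    return {
        # "arithmetic" operations: treat bools as ints, return ints
        "t + t": type(t + t).__name__,
        "t * 42": type(t * 42).__name__,
        # "bitwise" binary operations: stay within bool
        "t & f": type(t & f).__name__,
        "t | f": type(t | f).__name__,
        "t ^ f": type(t ^ f).__name__,
        # complement as inherited from int: the one exception
        "~t": type(int.__invert__(t)).__name__,
        # "boolean" operations: unchanged by the proposal
        "not t": type(not t).__name__,
    }


types = result_types()
```

In this table `~t` is the only bitwise entry that reports `int`.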

On 4/7/2016 2:08 PM, Guido van Rossum wrote:
On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike@selik.org> wrote:
When the proposal is expressed as "Make bools consistently follow this simple rule -- Logical and 'bitwise' operations on bools return the expected bool, while arithmetic operations treat bools as 0 or 1 and return ints.", it makes sense to me. Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break. -- Terry Jan Reedy

On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote:
Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break.
I suspect the fear is of one being passed into a place that expects an int, and staying alive as a bool (i.e. not being converted to an int by an arithmetic operation) long enough to confuse code that is trying to do ~int.
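A sketch of the scenario described here, with hypothetical names throughout: `clear_bits` is integer code that relies on the complement, and `proposed_invert` models the proposed semantics for comparison.

```python
def clear_bits(value, mask, invert):
    """Clear the bits of `mask` in `value` using the given complement."""
    return value & invert(mask)


def proposed_invert(x):
    # Hypothetical model of the proposal: logical not for bools, ~ otherwise.
    return (not x) if isinstance(x, bool) else ~x


# A bool that was never converted by arithmetic reaches code written for ints.
mask = (3 > 2)   # True, used here as the integer bit 0b1

current = clear_bits(0b111, mask, int.__invert__)   # complement inherited from int
changed = clear_bits(0b111, mask, proposed_invert)
plain_int = clear_bits(0b111, 1, proposed_invert)   # a real int is unaffected
```

Under the inherited complement only the lowest bit is cleared; under the modelled proposal every bit is cleared, silently, because the mask happened to still be a bool.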

On 7 April 2016 at 21:08, Random832 <random832@fastmail.com> wrote:
That is indeed the only place likely to hit problems. But I'd be surprised if it was sufficiently common to be a major problem. I don't think the backward compatibility constraints on a minor release would preclude a change like this. Personally, I'm +0 on the proposal. It seems like a more useful behaviour, but it's one I'm never likely to need personally. Paul

Terry Reedy wrote:
Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break.
But conversely, any code that *is* using ~bool instead of "not bool" is probably doing it precisely because it *does* want the integer interpretation. Why break that code, when "not bool" is available as the obvious way of getting a logically negated bool? -- Greg

On Thu, Apr 7, 2016 at 11:54 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It would be interesting actually if anyone has any idea of how to get some empirical data on this -- it isn't actually clear to me whether this is true or not.
The reason I'm uncertain is that in numpy code, using operations like ~ on booleans is *very* common, because the whole idea of numpy is that it gives you a way to write code that works the same on either a single value or on an array of values, and when you're working with booleans then this means you have to use '~': '~' works on arrays and 'not' doesn't. And, for numpy bools or arrays of bools, ~ does logical negation:
    In [1]: ~np.bool_(True)
    Out[1]: False
So you can write code like:
    # Contrived function that doubles 'value' if do_double is true
    # and otherwise halves it
    def double_or_halve(value, do_double):
        value = np.asarray(value)
        value[do_double] *= 2
        value[~do_double] *= 0.5
        return value
and then this works correctly if 'do_double' is a numpy bool or array of bools:
    In [16]: double_or_halve(np.arange(3, dtype=float), np.array([True, False, True]))
    Out[16]: array([ 0. , 0.5, 4. ])
    In [21]: double_or_halve(5.0, np.bool_(False))
    Out[21]: array(2.5)
But if you pass in a regular Python bool then the attempt to index by ~do_double turns into negative integer indexing and blows up:
    In [23]: double_or_halve(5.0, False)
    IndexError: too many indices for array
Of course this is a totally contrived function, and anyway it has a bug -- the user should have said 'do_double = np.asarray(do_double)' at the top of the function, and that would fix the problem. This is definitely not some massive problem afflicting numerical users, and I don't have any strong opinion on Antoine's proposal. But, it is the only case where I can imagine someone intentionally writing ~bool, so it actually strikes me as plausible that the majority of existing code that writes ~bool is like this: doing it by mistake and expecting it to be the same as 'not'. FWIW.
-n
-- Nathaniel J. Smith -- https://vorpus.org

On Fri, Apr 8, 2016 at 4:08 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
It would, but you can't have a regular protocol for and/or because they're actually not operators, they're control-flow syntax:
    In [1]: a = True
    In [2]: a or b
    Out[2]: True
    In [3]: a and b
    NameError: name 'b' is not defined
You could define a protocol for __or__/__ror__/__and__/__rand__ despite this, but it would have weird issues, like 'True and array([True, False])' would call ndarray.__rand__ and return array([True, False]), but 'True or array([True, False])' would return True (because it short-circuits and returns before it can even check for the presence of ndarray.__ror__). Given this, it's not clear whether it even makes sense to try. There was a discussion about this on python-ideas a few months ago, and Guido asked whether it would still be useful despite these weird issues, but I dropped the ball and my email to numpy-discussion soliciting feedback on that is still sitting in my drafts folder...
And I guess you could have a protocol just for 'not', but there might be some performance concerns (e.g. right now the peephole optimizer actually knows how to optimize 'if not' into a single opcode), and overriding 'not' without overriding 'and' + 'or' is probably more confusing than useful.
-n
-- Nathaniel J. Smith -- https://vorpus.org
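The short-circuit behaviour that makes such a protocol awkward is easy to observe. In this small sketch `right_operand` is a hypothetical function that records whether it was ever evaluated; a plain list stands in for an array.

```python
calls = []


def right_operand():
    """Hypothetical operand that records when it is evaluated."""
    calls.append("evaluated")
    return [True, False]


result_or = True or right_operand()    # short-circuits: right side never runs
calls_after_or = list(calls)

result_and = True and right_operand()  # right side runs and becomes the result
calls_after_and = list(calls)
```

With `or`, the right-hand object is never created, so no reflected method on it could be consulted.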

On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote:
You missed two: >> and <<. What are we to do with (True << 1)? Honestly, I cannot even imagine what it means to say "shift a truth value N bits". I think the idea that bitwise operations on bools are actually boolean operations in disguise is not a well-formed idea. Sometimes it happens to work out (& | ^), and sometimes it doesn't (<< and ~). And I'm not sure what to make of >> as an operation on bools. It doesn't *mean* anything, you can't shift a truth value, the very concept is meaningless, but if it did mean something it would surely return False. So >> could go into either category.
But ultimately, ~ has meant bitwise-not for 25 years, and it's never caused a problem before, not back in the days when people used to write TRUE, FALSE = 1, 0 and not now. If you want to perform a boolean "not" on a truth value, you use `not`. Nobody cared enough to "fix" this (if it is a problem that needs fixing, which I doubt) when bools were first introduced, and nobody cared when Python 3 came out. So why are we talking about rushing a backwards-incompatible semantic change into a point release?
Even if we "fix" this, surely we should go through the usual deprecation process? This isn't a critical security bug that needs fixing, it's a semantic change to something that has worked this way for 25 years, and it's going to break something somewhere. There are just far too many people that expect that bools are ints. After all, notwithstanding their fancy string representation, they behave like ints and actually are ints.
-- Steve

Steven D'Aprano writes:
I'm with Steven on this. Knuth would call these operations "seminumerical". I would put the emphasis on "numerical", *expecting* True and False to be one-bit representations of the (mathematical) integers 1 and 0. If numerical operations widen bools to int and then operate, I would *expect* seminumerical operations to do so as well. In fact, I was startled by Antoine's post. I even have a couple of lines of code using "^" as a *logical* operator on known bools, carefully labeled "# Hack! works only on true bools." That said, I'm not Dutch, and if treating bool as "not actually int" here is the right thing to do, then I would think the easiest thing to do would be to interpret the bitwise operations as performed on (mythical) C "itty-bitty ints".[1] Then ~ does the right thing and True << 1 == True >> 1 == False << 1 == False >> 1 == 0 giving us four new ways to spell 0 as a bonus!
After all, notwithstanding their fancy string representation,
I guess "fancy string representation" was the original motivation for the overrides. If the intent was really to make operator versions of logical operators (but only for true bools!), they would have fixed ~ too.
they behave like ints and actually are ints.
I can't fellow-travel all the way to "actually are", though. bools are what we decide to make them. I just don't see why the current behaviors of &|^ are particularly useful, since you'll have to guard all bitwise expressions against non-bool truthies and falsies. Footnotes: [1] "itty-bitty" almost reads "1 bit" in Japanese!

On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote:
No need to guess. There's a PEP: https://www.python.org/dev/peps/pep-0285/
I'm not talking about bools in other languages, or bools in Python in some alternate universe. But in the Python we have right now, bools *are* ints, no ifs, buts or maybes:
    py> isinstance(True, int)
    True
This isn't an accident of the implementation, it was an explicit BDFL pronouncement in PEP 285:
    6) Should bool inherit from int? => Yes.
Now I'll certainly admit that bools-are-ints is an accident of history. Had Guido been more influenced by Pascal, say, and less by C, he might have chosen to include a dedicated Boolean type right from the beginning. But he wasn't, and so he didn't, and consequently bools are now ints.
flag ^ flag is useful since we don't have a boolean-xor operator and bitwise-xor does the right thing for bools. And I suppose some people might prefer & and | over boolean-and and boolean-or because they're shorter and require less typing. I don't think that's a particularly good reason for using them, and as you say, you do have to guard against non-bools slipping, but Consenting Adults applies. -- Steve
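A short sketch of the guard mentioned above. `logical_xor` is a hypothetical helper name; converting with `bool()` first is one possible way to keep non-bools from slipping through.

```python
def logical_xor(a, b):
    """Exclusive-or on truth values; converts first so non-bools are safe."""
    return bool(a) ^ bool(b)


# On real bools, ^ already gives the logical answer and stays a bool.
on_bools = True ^ False

# On other truthy values, ^ is bitwise: 1 and 2 are both truthy, so a
# logical xor should be false, but the raw result is a truthy integer.
raw = 1 ^ 2
guarded = logical_xor(1, 2)
```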

On Fri, Apr 8, 2016 at 4:40 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Not everyone considers bit shifts 'bitwise', as they don't act at the level of individual bit positions: https://en.wikipedia.org/wiki/Bitwise_operation
One point of view is that bitwise operations should stay within bool, while shifts return ints, the left-shift operations actually being much more useful than "left-pad" ;). The main point of >> can be seen as consistency, although perhaps useless. That said, I don't really have an opinion on the OP's suggestion. -Koos

On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano <steve@pearwood.info> wrote:
You missed two: >> and <<. What are we to do with (True << 1)? [...]
Those are indeed ambiguous -- they are defined as multiplication or floor division with a power of two, e.g. x<<n is x*2**n and x>>n is x//2**n (for integral x and nonnegative n). The point of this thread seems to be to see whether some operations can be made more useful by staying in the bool domain -- I don't think making both of these return 0 if n != 0 would be more useful, so let's keep them unchanged.
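The two identities quoted above can be checked over a handful of values, bools included. This is only a spot check with an illustrative helper name, not a proof.

```python
def shift_identities_hold(values, shifts):
    """Check x << n == x * 2**n and x >> n == x // 2**n."""
    return all(
        (x << n) == x * 2**n and (x >> n) == x // 2**n
        for x in values
        for n in shifts
    )


identities_ok = shift_identities_hold([False, True, 0, 1, 5, -7], range(6))

# Shifting a bool gives a plain int, like the arithmetic operations do.
left = True << 1
right = True >> 1
left_type = type(True << 1).__name__
```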
The thing here is, this change is too small to warrant a __future__ import. So we're either going to introduce it in 3.6 and tell people about it in case their code might break, or we're never going to do it. I'm honestly on the fence, but I feel this is a rarely used operator so changing its meaning is not likely to break a lot of code. -- --Guido van Rossum (python.org/~guido)

On Fri, 8 Apr 2016 at 08:44 Guido van Rossum <guido@python.org> wrote:
DeprecationWarning every time you use ~ on a bool? That would still be too big a burden on using it the new way.
I think the proposal would be a DeprecationWarning to flush out/remove all current uses of ~bool with Python 3.6, and then in Python 3.7 introduce the new semantics. -Brett

On 4/8/2016 11:42 AM, Guido van Rossum wrote:
DeprecationWarning every time you use ~ on a bool?
A DeprecationWarning should only be in the initial version of bool.__invert__, which initially would return int.__invert__ after issuing the warning that we plan to change the meaning. -- Terry Jan Reedy
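A transitional `__invert__` of the kind described here might look roughly like the following. Since bool itself cannot be subclassed, `TransitionalFlag` is a hypothetical int subclass used purely as a stand-in, and the warning text is invented for the example.

```python
import warnings


class TransitionalFlag(int):
    """Hypothetical int subclass standing in for bool during a transition."""

    def __invert__(self):
        # Announce that the meaning is planned to change...
        warnings.warn(
            "~ on this type is planned to become logical negation",
            DeprecationWarning,
            stacklevel=2,
        )
        # ...but keep returning the inherited integer result for now.
        return int.__invert__(self)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = ~TransitionalFlag(1)

categories = [w.category for w in caught]
```

The result is unchanged while the warning is in place; only the warning tells callers that something is about to move.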

On Sat, Apr 09, 2016 at 02:16:47PM +0300, Koos Zevenhoven wrote:
Maybe the right warning type would be FutureWarning.
If we accept this proposal -- and I hope we don't -- I think that FutureWarning is the right one to use. It is what was used in 2.3 when the behaviour of ints changed as part of int/long unification. -- Steve

On Sat, Apr 9, 2016 at 2:07 AM, Terry Reedy <tjreedy@udel.edu> wrote:
It seems unusual to deprecate something without also providing a means of using the new thing in the same release. "Don't use this feature because we're going to change what it does in the future. Oh, you want to use the new version? Psych! We haven't actually done anything yet. Use not instead." It creates a weird void in Python 3.6 where the operator still exists but absolutely nobody has a legitimate reason to be using it. What happens if somebody is using ~ for its current semantics, skips the 3.6 release in their upgrade path, and doesn't read the release notes carefully enough? They'll never see the warning and will just experience a silent and difficult-to-diagnose breakage.

On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:
Not really. This is quite similar to what happened in Python 2.3 during int/long unification. The behaviour of certain integer operations changed, including the meaning of some literals, and warnings were displayed. I don't have 2.3 available to demonstrate but I can show you the change in behaviour:
    [steve@ando ~]$ python1.5 -c "print 0xffffffff"
    -1
    [steve@ando ~]$ python2.4 -c "print 0xffffffff"
    4294967295
By memory, 0xffffffff in python2.3 would print a warning that the result will change in the next release, and return -1. See:
https://www.python.org/dev/peps/pep-0237/
https://www.python.org/download/releases/2.3.5/notes/
Then they'll be in the same position as everybody if there's no deprecation at all. -- Steve

On Sat, Apr 9, 2016 at 9:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Pointing out that this has been done once before, 11 minor releases prior, does not dissuade me from continuing to characterize it as "unusual". The int/long unification was also a much more visible change overall.
I'm not suggesting there should be no deprecation. I'm just questioning whether the proposed deprecation is sufficient.

Let me pronounce something here. This change is not worth the amount of effort and pain a deprecation would cause everyone. Either we change this quietly in 3.6 (adding it to What's New etc. of course) or we don't do it at all. -- --Guido van Rossum (python.org/~guido)

On 10 April 2016 at 07:46, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have no axe to grind either way, but my impression from this thread is that some people would prefer bool to be consistent with user-defined types (such as numpy's) in this regard - specifically because user-defined types *have* to use ~ as the negation operator because "not" is not overridable in the way they require. Paul

I'm +1 on this change because it makes sense as a user. Note how numpy deals with invert and unsigned integers:
    In [2]: a = np.uint8(10)
    In [3]: ~a
    Out[3]: 245
The result of invert staying within the same type makes sense to me. (Also, as an idealist, I believe that decoupling int and bool might one day many many years from now bring about the ideal of bool not subclassing int.) Best, Neil
On Saturday, April 9, 2016 at 12:25:57 PM UTC-4, Guido van Rossum wrote:
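The fixed-width behaviour shown for `np.uint8` can be modelled in plain Python with a mask. This is a sketch of the arithmetic only, not of how numpy implements it, and the one-bit variant is an extrapolation by analogy rather than something stated in the message.

```python
def invert_uint8(x):
    """Complement within 8 unsigned bits, modelled with a mask."""
    return ~x & 0xFF


def invert_one_bit(x):
    """The same idea restricted to a single bit."""
    return ~x & 0x1


eight_bit = invert_uint8(10)
one_bit = (invert_one_bit(0), invert_one_bit(1))
```

Restricted to one bit, the complement stays within {0, 1}, which is the sense in which the result "stays within the same type".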

Michael Selik wrote:
To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?
Seems to me things are fine as they are. The justification for & and | on bools returning bools is that the result remains within the domain of bools, even when they are interpreted as int operations. But ~ on a bool-interpreted-as-an-int doesn't have that property, so ~True is more in the realm of True * 42 in that regard. -- Greg
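The "remains within the domain" argument can be stated as a closure check on the integer values 0 and 1. The helper names below are illustrative only.

```python
import operator
from itertools import product

DOMAIN = {0, 1}   # the integer values of False and True


def closed_binary(op):
    """True if op maps every pair from DOMAIN back into DOMAIN."""
    return all(op(a, b) in DOMAIN for a, b in product(DOMAIN, repeat=2))


def closed_unary(op):
    """True if op maps every element of DOMAIN back into DOMAIN."""
    return all(op(a) in DOMAIN for a in DOMAIN)


closure = {
    "&": closed_binary(operator.and_),
    "|": closed_binary(operator.or_),
    "^": closed_binary(operator.xor),
    "~": closed_unary(operator.invert),
    "* 42": closed_unary(lambda a: a * 42),
}
```

The binary operators are closed over {0, 1}; integer complement and multiplication by 42 are not, which is why the message groups them together.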

On Thu, Apr 7, 2016 at 10:38 AM, Guido van Rossum <guido@python.org> wrote:
I can see it going either way: if we treat the domain of bool as that of the integers, then ~True == ~1 == -2. If on the other hand we treat it as the integers modulo 2, then it makes sense that ~True == ~1 == 0. But this would also imply that True + True == False, which would definitely break existing code. I note that if you add an explicit modulo division by 2, then it works out:
    py> ~True % 2
    0
    py> ~False % 2
    1
The salient point to me is that there's no strong justification for making the change. As has been pointed out elsewhere in the thread, if you want binary not, just use not.
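The modulo-2 reading can be spelled out with plain integers. A minimal sketch; `mod2` is just a helper name for the explicit reduction used in the session above.

```python
def mod2(x):
    """Reduce an integer to its residue modulo 2."""
    return x % 2


invert_true = mod2(~1)        # ~1 is -2
invert_false = mod2(~0)       # ~0 is -1
true_plus_true = mod2(1 + 1)  # addition modulo 2 behaves like xor
```

Under this reading the complement acts as negation, but addition wraps around as well, which is the part that would break existing code.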

On 2016-04-07 08:15, Ethan Furman wrote:
Let's not forget that subclasses don't have to exactly duplicate all the behavior of their superclasses. That's why there's such a thing as overriding. Bool could remain a subclass of int, and still change its __invert__ behavior by overriding __invert__. It's true that this would be a backwards incompatible change, but behavior like ~True==-2 doesn't seem like something a lot of people are relying on. It would be worth looking into how much code actually does rely on it. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
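Overriding a single method in a subclass can be sketched as follows. `Flag` is a hypothetical int subclass used only for illustration, since bool itself cannot be subclassed or patched from Python code.

```python
class Flag(int):
    """Hypothetical int subclass that overrides only __invert__."""

    def __invert__(self):
        # Logical negation instead of the inherited integer complement.
        return Flag(not self)


t = Flag(1)

still_an_int = isinstance(t, int)
arithmetic = t * 42                      # inherited from int, unchanged
inverted = ~t                            # overridden
matches_int_invert = int(~t) == ~int(t)  # no longer agrees with int
```

The subclass relationship survives, while the last comparison shows the one place where the subclass stops agreeing with its parent.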

On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
All those are consistent with bool being a subclass of int, in the sense that (for example) `int(True | False)` is identical to `int(True) | int(False)`. Redefining ~True to be False wouldn't preserve that: int(~True) == ~int(True) would become invalid. In short, the proposal would break the Liskov Substitution Principle. (The obvious fix is of course to make True have value -1 rather than 1. Then everything's consistent. No, I'm not seriously suggesting this - the amount of breakage would be insane.) NumPy has the luxury that numpy.bool_ is *not* a subclass of any integer type. Mark

On Thu, 7 Apr 2016 09:00:36 +0100 Mark Dickinson <dickinsm@gmail.com> wrote:
However, the return type (bool in one place, int in another one) is inconsistent, and the user-visible semantics are confusing... Apparently someone went to the trouble of overriding __and__, __or__ and __xor__ for booleans, which is why it looks unexpected to leave __invert__ alone. Regards Antoine.

On Thu, Apr 07, 2016 at 09:46:18AM +0200, Antoine Pitrou wrote:
Substitute 1 for True and 0 for False, and these results are exactly the same as the bitwise operations on ints. And that works since True is defined to equal 1 and False to equal 0.
Substitute 0 for False and 1 for True, and you get exactly the same results. What else did you expect from bitwise-not?
Expected by whom? I wouldn't expect bitwise-not to be the same as binary not. If I want binary not, I'll spell it `not`.
Why? What problem does this solve? We already have a perfectly good way of spelling binary not, why break backwards compatibility to get a second way to spell it? It also breaks a fundamental property of most mathematical relations: if a == b, then f(a) == f(b) (assuming f(a) and f(b) are defined for the type of both a and b). That is currently true for bools: py> (True == 1) and (~True == ~1) True py> (False==0) and (~False == ~0) True You want ~b to return `not b`: py> (True == 1) and (False == ~1) False py> (False==0) and (True == ~0) False I see no upside and a serious downside to this proposal. -- Steve

Steven D'Aprano <steve@...> writes:
By anyone who takes booleans at face value (that is, takes booleans as representing a truth value and expects operations on booleans to reflect the semantics of useful operations on truth values, not some arbitrary side-effect of the internal representation of a boolean...). But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for.... Regards Antoine.

On Thu, Apr 07, 2016 at 01:17:57PM +0000, Antoine Pitrou wrote:
Bools in Python have *always* been integers, so who are these people taking booleans at face value? Beginners? If so, say so. That's a motive I can understand. But I think that people who expect bools to be real truth values, like in Pascal, probably won't expect BITWISE operations to operate on them at all and will use the BOOLEAN operators and, or, not. Bitwise operators operate on a sequence of bits, not a single truth value. What do these naive "bools are truth values" people think: True << 3 should return? I don't think we need a second way to spell "not bool". Bools have always been ints in Python, and apart from their fancy string representation they behave like ints. I don't think it helps to make ~ a special case where they don't.
But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for....
Thanks for your feedback, I'll give it the due consideration it deserves. -- Steve

On 7 April 2016 at 08:46, Antoine Pitrou <solipsis@pitrou.net> wrote:
This is all just a consequence of the (unfortunate IMO) design that treats True and False as being equivalent to 1 and 0.
This is a consequence of another unfortunate design by numpy. The reason for this is that numpy uses Python's bitwise operators to do element-wise logical operations. This is because it is not possible to overload Python's 'and', 'or', and 'not' operators. So if I write:
The problem is that Python tries to shortcut this expression and so it calls bool(1 < a):
Returning a non-bool from __bool__ is prohibited which implicitly prevents numpy from overloading an expression like `not a`:
Because of this numpy uses &,|,~ in place of and,or,not for numpy arrays and by extension this occurs for numpy scalars as you showed. I think the numpy project really wanted to use the normal Python operators but no mechanism was available to do it. It would have been nice to use &,|,~ for genuine bitwise operations (on e.g. unsigned int arrays). It also means that chained relations don't work e.g.:
The recommended way is (1 < a) & (a < 4) which not as nice and requires factoring a out if a is actually an expression like sin(x)**2 or something. -- Oscar

On 2016-04-07 13:19, Oscar Benjamin wrote:
On 7 April 2016 at 08:46, Antoine Pitrou <solipsis@pitrou.net> wrote:
This is not correct. & | ^ ~ are all genuine bitwise operations on numpy arrays.
What you are seeing with bool arrays is that bool arrays are *not* just uint8 arrays. Each element happens to take up a single 8-bit byte, but only one of those bits contributes to its value; the other 7 bits are mere padding. The bitwise & | ^ ~ operators all work on that single bit correctly. They do not operate on the padding bits as they are not part of the bool's value. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On 04/07/2016 12:46 AM, Antoine Pitrou wrote:
No. bool is a subclass of int, and changing that now would be a serious breach of backward-compatibility, not to mention breaking existing code for no good reason. Anyone who wants to can create their own Boolean class that doesn't subclass int and then declare the behaviour they want. If bool had been it's own thing from the start this wouldn't have been a problem, but it is far too late to change that now. You would be better off suggesting a new Logical type instead (it could even support unknown values).
__and__, __or__, and __xor__'s results in within bool's domain (right word?) so keeping them in the bool subtype makes sense; the result of __invert__ is not.
But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for....
If you aren't going to be civil, don't bother coming back. -- ~Ethan~

Honestly I think that the OP has a point, and I don't think we have to bend over backwards to preserve int compatibility. After all str(True) != str(1), and surely there are other examples. -- --Guido van Rossum (python.org/~guido)

On 04/07/2016 09:38 AM, Guido van Rossum wrote:
I think the str() of a value, while possibly being the most interesting piece of information (IntEnum, anyone?), is hardly the most intrinsic. If we do make this change, besides needing a couple major versions to make it happen, will anything else be different? - no longer subclass int? - add an "unknown" value? - how will indexing work? - or any of the other operations? - don't bother with any of the other mathematical operations? - counting True's is not the same as adding True's I'm not firmly opposed, I just don't see a major issue here -- I've needed an Unknown value far more often than I've needed ~True to be False. -- ~Ethan~

On Apr 7, 2016, at 6:23 PM, Guido van Rossum <guido@python.org> wrote: Nothing else is on the table. Seriously. Stop hijacking the thread.
To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?

On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike@selik.org> wrote:
Yes. To be more precise, there are some "arithmetic" operations (+, -, *, /, **) and they all treat bools as ints and always return ints; there are also some "bitwise" operations (&, |, ^, ~) and they should all treat bools as bools and return a bool. Currently the only exception to this idea is that ~ returns an int, so the proposal is to fix that. (There are also some "boolean" operations (and, or, not) and they are also unchanged.) -- --Guido van Rossum (python.org/~guido)
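The split described above can be checked directly. This is a small sketch of the behaviour as it stood when the thread was written; the dictionary names are chosen here for illustration. Note that true division yields a float rather than an int, so the arithmetic group is best summarised as "never a bool".

```python
import operator
import warnings

arithmetic = [operator.add, operator.sub, operator.mul,
              operator.truediv, operator.pow]
bitwise_binary = [operator.and_, operator.or_, operator.xor]

# Arithmetic on two bools never returns a bool.
arithmetic_types = {op.__name__: type(op(True, True)).__name__
                    for op in arithmetic}

# The binary bitwise operators on two bools return a bool.
bitwise_types = {op.__name__: type(op(True, False)).__name__
                 for op in bitwise_binary}

# The odd one out: ~ returns an int. Newer Python versions emit a
# DeprecationWarning for ~ on a bool, so it is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    invert_result = operator.invert(True)
```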

On 4/7/2016 2:08 PM, Guido van Rossum wrote:
On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike@selik.org> wrote:
When the proposal is expressed as "Make bools consistently follow this simple rule -- Logical and 'bitwise' operations on bools return the expected bool, while arithmetic operations treat bools as 0 or 1 and return ints.", it makes sense to me. Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break. -- Terry Jan Reedy

On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote:
Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break.
I suspect the fear is of one being passed into a place that expects an int, and staying alive as a bool (i.e. not being converted to an int by an arithmetic operation) long enough to confuse code that is trying to do ~int.
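A sketch of that scenario: code written for ints that uses the `seq[~i]` idiom to count from the end, and that is handed a bool which was never converted. Both `current_invert` and `proposed_invert` are hypothetical helpers written for this illustration; the second only stands in for the proposed semantics and is not an existing function.

```python
def current_invert(x):
    """Semantics at the time of the thread: a bool is inverted as an int."""
    return ~int(x)


def proposed_invert(x):
    """Hypothetical stand-in for the proposal: bools negate logically."""
    if isinstance(x, bool):
        return not x
    return ~x


def from_end(seq, i, invert):
    """Return the i-th item counted from the end, via the seq[~i] idiom."""
    return seq[invert(i)]


letters = ["a", "b", "c", "d"]

# With a real int both versions agree: ~0 == -1 selects the last item.
int_case = (from_end(letters, 0, current_invert),
            from_end(letters, 0, proposed_invert))

# With a bool that stayed a bool, the answers diverge:
#   current:  ~False == -1, so the last item is selected
#   proposed: ~False is True, and letters[True] is letters[1]
bool_case = (from_end(letters, False, current_invert),
             from_end(letters, False, proposed_invert))
```

No exception is raised in the diverging case, which is what would make such a breakage hard to notice.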

On 7 April 2016 at 21:08, Random832 <random832@fastmail.com> wrote:
That is indeed the only place likely to hit problems. But I'd be surprised if it was sufficiently common to be a major problem. I don't think the backward compatibility constraints on a minor release would preclude a change like this. Personally, I'm +0 on the proposal. It seems like a more useful behaviour, but it's one I'm never likely to need personally. Paul

Terry Reedy wrote:
Given that ~bool hardly makes any sense currently, I would not expect it to be in much use now. Hence not much to break.
But conversely, any code that *is* using ~bool instead of "not bool" is probably doing it precisely because it *does* want the integer interpretation. Why break that code, when "not bool" is available as the obvious way of getting a logically negated bool? -- Greg

On Thu, Apr 7, 2016 at 11:54 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It would be interesting actually if anyone has any idea of how to get some empirical data on this -- it isn't actually clear to me whether this is true or not. The reason I'm uncertain is that in numpy code, using operations like ~ on booleans is *very* common, because the whole idea of numpy is that it gives you a way to write code that works the same on either a single value or on an array of values, and when you're working with booleans then this means you have to use '~': '~' works on arrays and 'not' doesn't. And, for numpy bools or arrays of bools, ~ does logical negation:

    In [1]: ~np.bool_(True)
    Out[1]: False

So you can write code like:

    # Contrived function that doubles 'value' if do_double is true
    # and otherwise halves it
    def double_or_halve(value, do_double):
        value = np.asarray(value)
        value[do_double] *= 2
        value[~do_double] *= 0.5
        return value

and then this works correctly if 'do_double' is a numpy bool or array of bools:

    In [16]: double_or_halve(np.arange(3, dtype=float), np.array([True, False, True]))
    Out[16]: array([ 0. , 0.5, 4. ])

    In [21]: double_or_halve(5.0, np.bool_(False))
    Out[21]: array(2.5)

But if you pass in a regular Python bool then the attempt to index by ~do_double turns into negative integer indexing and blows up:

    In [23]: double_or_halve(5.0, False)
    IndexError: too many indices for array

Of course this is a totally contrived function, and anyway it has a bug -- the user should have said 'do_double = np.asarray(do_double)' at the top of the function, and that would fix the problem. This is definitely not some massive problem afflicting numerical users, and I don't have any strong opinion on Antoine's proposal. But, it is the only case where I can imagine someone intentionally writing ~bool, so it actually strikes me as plausible that the majority of existing code that writes ~bool is like this: doing it by mistake and expecting it to be the same as 'not'. FWIW. -n -- Nathaniel J. Smith -- https://vorpus.org

On Fri, Apr 8, 2016 at 4:08 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
It would, but you can't have a regular protocol for and/or because they're actually not operators, they're control-flow syntax:

    In [1]: a = True
    In [2]: a or b
    Out[2]: True
    In [3]: a and b
    NameError: name 'b' is not defined

You could define a protocol for __or__/__ror__/__and__/__rand__ despite this, but it would have weird issues, like 'True and array([True, False])' would call ndarray.__rand__ and return array([True, False]), but 'True or array([True, False])' would return True (because it short-circuits and returns before it can even check for the presence of ndarray.__ror__). Given this, it's not clear whether it even makes sense to try. There was a discussion about this on python-ideas a few months ago, and Guido asked whether it would still be useful despite these weird issues, but I dropped the ball and my email to numpy-discussion soliciting feedback on that is still sitting in my drafts folder... And I guess you could have a protocol just for 'not', but there might be some performance concerns (e.g. right now the peephole optimizer actually knows how to optimize 'if not' into a single opcode), and overriding 'not' without overriding 'and' + 'or' is probably more confusing than useful. -n -- Nathaniel J. Smith -- https://vorpus.org
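The asymmetry between the short-circuiting keywords and the real operators can be observed with a small recording object. This is a sketch only: `Loud` is a hypothetical class made up for the demonstration, and it shows today's behaviour, not the hypothetical protocol discussed above.

```python
class Loud:
    """Hypothetical operand that records which special methods get called."""

    def __init__(self):
        self.calls = []

    def __bool__(self):
        self.calls.append("__bool__")
        return True

    def __rand__(self, other):    # reflected &, e.g. True & Loud()
        self.calls.append("__rand__")
        return "custom result"


x = Loud()

# `or` short-circuits: the right operand is never consulted at all.
r1 = True or x
calls_after_or = list(x.calls)

# `and` evaluates the right operand and returns it unchanged.
r2 = True and x

# & is a real operator, so the right operand gets a say via __rand__.
r3 = True & x
```

Only `&` ever reaches a method on the right-hand operand; with `or` the object is not even inspected, which is why a reflected hook for it could never fire in that position.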

On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote:
You missed two: >> and <<. What are we to do with (True << 1)? Honestly, I cannot even imagine what it means to say "shift a truth value N bits". I think the idea that bitwise operations on bools are actually boolean operations in disguise is not a well-formed idea. Sometimes it happens to work out (& | ^), and sometimes it doesn't (<< and ~). And I'm not sure what to make of >> as an operation on bools. It doesn't *mean* anything, you can't shift a truth value, the very concept is meaningless, but if it did mean something it would surely return False. So >> could go into either category. But ultimately, ~ has meant bitwise-not for 25 years, and it's never caused a problem before, not back in the days when people used to write TRUE, FALSE = 1, 0 and not now. If you want to perform a boolean "not" on a truth value, you use `not`. Nobody cared enough to "fix" this (if it is a problem that needs fixing, which I doubt) when bools were first introduced, and nobody cared when Python 3 came out. So why are we talking about rushing a backwards-incompatible semantic change into a point release? Even if we "fix" this, surely we should go through the usual deprecation process? This isn't a critical security bug that needs fixing, it's a semantic change to something that has worked this way for 25 years, and it's going to break something somewhere. There are just far too many people that expect that bools are ints. After all, notwithstanding their fancy string representation, they behave like ints and actually are ints. -- Steve

Steven D'Aprano writes:
I'm with Steven on this. Knuth would call these operations "seminumerical". I would put the emphasis on "numerical", *expecting* True and False to be one-bit representations of the (mathematical) integers 1 and 0. If numerical operations widen bools to int and then operate, I would *expect* seminumerical operations to do so as well. In fact, I was startled by Antoine's post. I even have a couple of lines of code using "^" as a *logical* operator on known bools, carefully labeled "# Hack! works only on true bools." That said, I'm not Dutch, and if treating bool as "not actually int" here is the right thing to do, then I would think the easiest thing to do would be to interpret the bitwise operations as performed on (mythical) C "itty-bitty ints".[1] Then ~ does the right thing and True << 1 == True >> 1 == False << 1 == False >> 1 == 0 giving us four new ways to spell 0 as a bonus!
After all, notwithstanding their fancy string representation,
I guess "fancy string representation" was the original motivation for the overrides. If the intent was really to make operator versions of logical operators (but only for true bools!), they would have fixed ~ too.
they behave like ints and actually are ints.
I can't fellow-travel all the way to "actually are", though. bools are what we decide to make them. I just don't see why the current behaviors of &|^ are particularly useful, since you'll have to guard all bitwise expressions against non-bool truthies and falsies. Footnotes: [1] "itty-bitty" almost reads "1 bit" in Japanese!
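The one-bit reading of bools suggested in this message can be sketched by truncating ordinary int results to their lowest bit. The helper name `one_bit` is an assumption made for this illustration; Python has no such type or function.

```python
def one_bit(n):
    """Truncate an int to its lowest bit, as a 1-bit integer would."""
    return n & 1


# Complement within one bit behaves like logical negation.
inverted = {b: one_bit(~int(b)) for b in (False, True)}

# Any shift by one position pushes the only bit out, leaving 0.
left_shifted = [one_bit(int(b) << 1) for b in (False, True)]
right_shifted = [one_bit(int(b) >> 1) for b in (False, True)]
```

Under this reading `~` lands back inside {0, 1}, and all four shift expressions mentioned above collapse to 0.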

On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote:
No need to guess. There's a PEP: https://www.python.org/dev/peps/pep-0285/
I'm not talking about bools in other languages, or bools in Python in some alternate universe. But in the Python we have right now, bools *are* ints, no ifs, buts or maybes:

    py> isinstance(True, int)
    True

This isn't an accident of the implementation, it was an explicit BDFL pronouncement in PEP 285:

    6) Should bool inherit from int? => Yes.

Now I'll certainly admit that bools-are-ints is an accident of history. Had Guido been more influenced by Pascal, say, and less by C, he might have chosen to include a dedicated Boolean type right from the beginning. But he wasn't, and so he didn't, and consequently bools are now ints.
flag ^ flag is useful since we don't have a boolean-xor operator and bitwise-xor does the right thing for bools. And I suppose some people might prefer & and | over boolean-and and boolean-or because they're shorter and require less typing. I don't think that's a particularly good reason for using them, and as you say, you do have to guard against non-bools slipping, but Consenting Adults applies. -- Steve
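A short sketch of why the guard matters when `^` is used as a logical xor. The function name `logical_xor` is made up for this example.

```python
def logical_xor(a, b):
    """Exclusive-or on truth values; coercing first makes non-bools safe."""
    return bool(a) ^ bool(b)


# On genuine bools, ^ already acts as logical xor and returns a bool.
on_bools = True ^ False

# On truthy non-bools it is plain bitwise xor: 1 ^ 2 == 3, which is
# truthy even though both operands are truthy.
unguarded = 1 ^ 2

# Coercing both sides first gives the logical answer.
guarded = logical_xor(1, 2)
```

Comparing the coerced values with `!=` is an equivalent spelling of the guarded form.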

On Fri, Apr 8, 2016 at 4:40 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Not everyone considers bit shifts 'bitwise', as they don't act at the level of individual bit positions: https://en.wikipedia.org/wiki/Bitwise_operation
One point of view is that bitwise operations should stay within bool, while shifts return ints, the left-shift operations actually being much more useful than "left-pad" ;). The main point of >> can be seen as consistency, although perhaps useless. That said, I don't really have an opinion on the OP's suggestion. -Koos

On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano <steve@pearwood.info> wrote:
You missed two: >> and <<. What are we to do with (True << 1)? [...]
Those are indeed ambiguous -- they are defined as multiplication or floor division with a power of two, e.g. x<<n is x*2**n and x>>n is x//2**n (for integral x and nonnegative n). The point of this thread seems to be to see whether some operations can be made more useful by staying in the bool domain -- I don't think making both of these return 0 if n != 0 would be useful, so let's keep them unchanged.
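The two identities quoted above are easy to confirm for bools and ordinary ints alike. A quick check, with sample values picked arbitrarily:

```python
# x << n == x * 2**n and x >> n == x // 2**n for integral x, n >= 0.
for x in (False, True, 5, -7):
    for n in range(4):
        assert x << n == x * 2**n
        assert x >> n == x // 2**n

# Shifting a bool treats it as 0 or 1 and yields a plain int.
shift_examples = (True << 3, True >> 1, type(True << 1).__name__)
```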
The thing here is, this change is too small to warrant a __future__ import. So we're either going to introduce it in 3.6 and tell people about it in case their code might break, or we're never going to do it. I'm honestly on the fence, but I feel this is a rarely used operator so changing its meaning is not likely to break a lot of code. -- --Guido van Rossum (python.org/~guido)

On Fri, 8 Apr 2016 at 08:44 Guido van Rossum <guido@python.org> wrote:
DeprecationWarning every time you use ~ on a bool? That would still be too big a burden on using it the new way.
I think the proposal would be a DeprecationWarning to flush out/remove all current uses of ~bool with Python 3.6, and then in Python 3.7 introduce the new semantics. -Brett

On 4/8/2016 11:42 AM, Guido van Rossum wrote:
DeprecationWarning every time you use ~ on a bool?
A DeprecationWarning should only be in the initial version of bool.__invert__, which initially would return int.__invert__ after issuing the warning that we plan to change the meaning. -- Terry Jan Reedy
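A rough sketch of the warn-first step described here. Since bool cannot be subclassed, the idea is modelled as a free function; `transitional_invert` and the warning text are hypothetical and do not correspond to anything in CPython.

```python
import warnings


def transitional_invert(x):
    """Hypothetical model of a warn-first bool.__invert__."""
    if isinstance(x, bool):
        warnings.warn(
            "~ on a bool is planned to change meaning in a future version",
            DeprecationWarning,
            stacklevel=2,
        )
    # The result itself is still the int interpretation.
    return ~int(x)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = transitional_invert(True)
    plain = transitional_invert(5)   # ints are unaffected, no warning
```

During such a transition the value returned is unchanged; only the warning is new.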

On Sat, Apr 09, 2016 at 02:16:47PM +0300, Koos Zevenhoven wrote:
Maybe the right warning type would be FutureWarning.
If we accept this proposal -- and I hope we don't -- I think that FutureWarning is the right one to use. It is what was used in 2.3 when the behaviour of ints changed as part of int/long unification. -- Steve

On Sat, Apr 9, 2016 at 2:07 AM, Terry Reedy <tjreedy@udel.edu> wrote:
It seems unusual to deprecate something without also providing a means of using the new thing in the same release. "Don't use this feature because we're going to change what it does in the future. Oh, you want to use the new version? Psych! We haven't actually done anything yet. Use not instead." It creates a weird void in Python 3.6 where the operator still exists but absolutely nobody has a legitimate reason to be using it. What happens if somebody is using ~ for its current semantics, skips the 3.6 release in their upgrade path, and doesn't read the release notes carefully enough? They'll never see the warning and will just experience a silent and difficult-to-diagnose breakage.

On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:
Not really. This is quite similar to what happened in Python 2.3 during int/long unification. The behaviour of certain integer operations changed, including the meaning of some literals, and warnings were displayed. I don't have 2.3 available to demonstrate but I can show you the change in behaviour:

    [steve@ando ~]$ python1.5 -c "print 0xffffffff"
    -1
    [steve@ando ~]$ python2.4 -c "print 0xffffffff"
    4294967295

From memory, 0xffffffff in python2.3 would print a warning that the result will change in the next release, and return -1. See:

    https://www.python.org/dev/peps/pep-0237/
    https://www.python.org/download/releases/2.3.5/notes/

Then they'll be in the same position as everybody if there's no deprecation at all. -- Steve

On Sat, Apr 9, 2016 at 9:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Pointing out that this has been done once before, 11 minor releases prior, does not dissuade me from continuing to characterize it as "unusual". The int/long unification was also a much more visible change overall.
I'm not suggesting there should be no deprecation. I'm just questioning whether the proposed deprecation is sufficient.

Let me pronounce something here. This change is not worth the amount of effort and pain a deprecation would cause everyone. Either we change this quietly in 3.6 (adding it to What's New etc. of course) or we don't do it at all. -- --Guido van Rossum (python.org/~guido)

On 10 April 2016 at 07:46, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have no axe to grind either way, but my impression from this thread is that some people would prefer bool to be consistent with user-defined types (such as numpy's) in this regard - specifically because user-defined types *have* to use ~ as the negation operator because "not" is not overridable in the way they require. Paul

I'm +1 on this change because it makes sense as a user. Note how numpy deals with invert and unsigned integers:

    In [2]: a = np.uint8(10)
    In [3]: ~a
    Out[3]: 245

The result of invert staying within the same type makes sense to me. (Also, as an idealist, I believe that decoupling int and bool might one day many many years from now bring about the ideal of bool not subclassing int.) Best, Neil
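The uint8 result can be reproduced with plain Python ints by masking the complement to a fixed width. This is a sketch of the arithmetic only, not of how numpy implements it, and `invert_unsigned` is a name invented here.

```python
def invert_unsigned(n, bits):
    """Bitwise complement of n restricted to an unsigned field of `bits` bits."""
    mask = (1 << bits) - 1
    return ~n & mask


eight_bit = invert_unsigned(10, 8)     # matches the uint8 example
one_bit_true = invert_unsigned(1, 1)   # a 1-bit field behaves like a bool
one_bit_false = invert_unsigned(0, 1)
```

Seen this way, a bool whose `~` stays within bool is the one-bit case of the same rule.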

Michael Selik wrote:
To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?
Seems to me things are fine as they are. The justification for & and | on bools returning bools is that the result remains within the domain of bools, even when they are interpreted as int operations. But ~ on a bool-interpreted-as-an-int doesn't have that property, so ~True is more in the realm of True * 42 in that regard. -- Greg
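The closure argument can be made concrete by checking which int operations map the values {0, 1} back into {0, 1}. A small sketch using plain ints, with the names `DOMAIN`, `binary_closed` and `invert_closed` chosen for illustration:

```python
import operator
from itertools import product

DOMAIN = (0, 1)   # the int values of False and True

# For each binary operator: does every result stay within {0, 1}?
binary_closed = {
    op.__name__: all(op(a, b) in DOMAIN
                     for a, b in product(DOMAIN, repeat=2))
    for op in (operator.and_, operator.or_, operator.xor, operator.add)
}

# The same question for integer bitwise complement.
invert_closed = all(~a in DOMAIN for a in DOMAIN)
```

The three binary bitwise operators pass the check, while addition and integer `~` both produce values outside the domain.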

On Thu, Apr 7, 2016 at 10:38 AM, Guido van Rossum <guido@python.org> wrote:
I can see it going either way: if we treat the domain of bool as that of the integers, then ~True == ~1 == -2. If on the other hand we treat it as the integers modulo 2, then it makes sense that ~True == ~1 == 0. But this would also imply that True + True == False, which would definitely break existing code. I note that if you add an explicit modulo division by 2, then it works out:

    py> ~True % 2
    0
    py> ~False % 2
    1

The salient point to me is that there's no strong justification for making the change. As has been pointed out elsewhere in the thread, if you want binary not, just use not.

On 2016-04-07 08:15, Ethan Furman wrote:
Let's not forget that subclasses don't have to exactly duplicate all the behavior of their superclasses. That's why there's such a thing as overriding. Bool could remain a subclass of int, and still change its __invert__ behavior by overriding __invert__. It's true that this would be a backwards incompatible change, but behavior like ~True==-2 doesn't seem like something a lot of people are relying on. It would be worth looking into how much code actually does rely on it. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
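Overriding in a subclass can be tried out on an int subclass, since bool itself cannot be subclassed. `Flag` below is a hypothetical class written only to illustrate the point; it also shows the substitution mismatch that such an override introduces.

```python
class Flag(int):
    """Hypothetical int subclass limited to 0 and 1 that overrides ~."""

    def __new__(cls, value):
        return super().__new__(cls, 1 if value else 0)

    def __invert__(self):
        # Logical negation instead of int's bitwise complement.
        return Flag(not self)

    def __repr__(self):
        return f"Flag({int(self)})"


on = Flag(True)

# Still an int for every other purpose.
still_an_int = isinstance(on, int) and on + 41 == 42

# The override stays within the subclass ...
overridden = ~on

# ... whereas converting to int first gives the inherited answer.
as_plain_int = ~int(on)
```

Here `int(~on)` and `~int(on)` differ, which is the trade-off the thread is weighing: a more useful result within the type versus identical behaviour to the base class.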
participants (25)
- Antoine Pitrou
- Antoine Pitrou
- Brendan Barnwell
- Brett Cannon
- Eric Snow
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Ian Kelly
- Joseph Martinot-Lagarde
- Koos Zevenhoven
- Mark Dickinson
- Michael Selik
- Nathaniel Smith
- Neil Girdhar
- Niki Spahiev
- Oscar Benjamin
- Paul Moore
- Pavol Lisy
- Random832
- Robert Kern
- Serhiy Storchaka
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy