Mailman 3 Revisiting dedicated overloadable boolean operators - Python-ideas

Revisiting dedicated overloadable boolean operators

Todd

Aug. 3, 2018

5:46 p.m.

Coming back to the previous discussion about a new set of overloadable boolean operators [1], I have an idea for overloadable boolean operators that I think might work. The idea would be to define four new operators that take two inputs and return a boolean result based on them. This behavior can be overridden in appropriate dunder methods. These operators would have similar precedence to existing logical operators. The operators would be:

bNOT - boolean "not"
bAND - boolean "and"
bOR - boolean "or"
bXOR - boolean "xor"

With corresponding dunder methods:

__bNOT__ and __rbNOT__ (or __r_bNOT__)
__bAND__ and __rbAND__ (or __r_bAND__)
__bOR__ and __rbOR__ (or __r_bOR__)
__bXOR__ and __rbXOR__ (or __r_bXOR__)

The basic idea is that the "b" is short for "boolean", and we change the rest of the operator to uppercase to avoid confusion with the existing operators. I think these operators would be preferable to the proposals so far (see [1] again) for a few reasons:

1. They are not easy to mistake for existing operators. They are clearly not similar to the existing bitwise operators like & or |, and although they are clearly related to "not", "and", and "or", I think they are distinct enough that it should not be easy to confuse the two or accidentally use one in place of the other.
2. They are related to the operations they carry out, which is also an advantage over the existing bitwise operators.
3. The corresponding dunder methods (such as __bAND__ and __rbAND__) are obvious and not easily confused with anything else.
4. The unusual capitalization means they are not likely to be used much in existing Python code. It doesn't fall under any standard capitalization scheme I am aware of.
5. At least for English, the capitalization means they are not easy to confuse with existing words. For example Band is a word, but it is not likely to be capitalized as bAND.
As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operators to define their own boolean operations (for example elementwise "and" in numpy arrays). This has a variety of problems, such as not having appropriate precedence, which makes precedence errors common, and the simple fact that this precludes them from using the bitwise operators for bitwise operations.

There was a proposal to allow overloading boolean operators in PEP 335 [2], but that PEP was rejected for a variety of very good reasons. I think none of those reasons (besides the conversation fizzling out) apply to my proposal.

So the alternative proposal that has been floating around is to instead define new operators specifically for this. Although there seemed to be some support for this in principle, the actual operators proposed have not met with much enthusiasm. So far the main operators proposed seem to be:

1. Double bitwise operators, such as && and ||. These have the disadvantage of looking like they should be a type of bitwise operator.
2. The existing operators, with some non-letter character at the front and back, like ".and.". These have the advantage that they are currently not valid syntax in most cases, but I think they are too similar to existing logical operators, too easy to confuse, and it is not immediately obvious in what way they should differ from existing operators. They also mean different things in other languages.

So I think my proposal addresses the main issues raised with existing proposals, but has the downside that it requires new keywords.

Thoughts?

[1] https://mail.python.org/pipermail/python-ideas/2015-November/037207.html
[2] https://www.python.org/dev/peps/pep-0335/


Jonathan Fine

August 2018

6:26 p.m.

Hi Todd Thank you for your contribution! I've got a couple of comments. The experts, I hope, will have more to say. You wrote:

...

As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operator

...

There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons.

The key thing is, I think, the wish for a domain specific language. I find this to be a wholesome wish. But I'd rather create a broad solution, than something that works just for special cases. And if at all possible, implement domain specific languages without extending the syntax and semantics of the language. This would benefit many existing users and projects now, without having to wait for the introduction of a new version of Python, and their project adopting that version. It may help to build on PEP 465 -- A dedicated infix operator for matrix multiplication https://www.python.org/dev/peps/pep-0465/ This addition allows you (from the PEP) to write

...

...
...
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

You may want to extend the syntax and semantics so that

...

...
...
S = (H @beta - r).T @ inv(H @ V @ H.T) @ (H @beta - r)

invokes double-under methods, whose name might be something like __at_beta__

I'm impressed by

...

https://en.wikipedia.org/wiki/Fluent_interface https://martinfowler.com/bliki/FluentInterface.html and encourage work on tools for creating such in Python.

There is the problem of short-circuiting evaluation, as in the 'and' and 'or' operators (and elsewhere in Python). This has to be a syntax and semantics feature. It can't be controlled by the objects. However, as Steve Dower pointed out last month, we can use lambda for this purpose. I think it's easy to define a function OR such that the following

...

EXP_1 or EXP_2

...
OR(lambda: EXP_1, lambda: EXP_2)

do pretty much the same thing (except refactoring the expressions into the lambdas).

In fact, I think OR has to be

...

...
...
def OR(fn_1, fn_2):
    return fn_1() or fn_2()

I hope this helps you solve the underlying problems, and have a better time with Python.

-- Jonathan
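The lambda-based OR described above can be checked with a short sketch. The `traced` helper and the `calls` list are hypothetical names introduced only to make the evaluation order visible; they are not part of any proposal in this thread.

```python
def OR(fn_1, fn_2):
    """Call fn_1; call fn_2 only if the first result is falsy."""
    return fn_1() or fn_2()


calls = []  # records which operands were actually evaluated


def traced(label, value):
    """Hypothetical helper: build a zero-argument callable that logs its use."""
    def thunk():
        calls.append(label)
        return value
    return thunk


result = OR(traced("left", "spam"), traced("right", "eggs"))
print(result)  # spam
print(calls)   # ['left']
```

Because the right-hand operand is wrapped in a callable, it is never evaluated when the left-hand result is truthy, which is the same behaviour the built-in `or` provides through syntax.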

Todd

7:17 p.m.

On Fri, Aug 3, 2018 at 2:26 PM, Jonathan Fine <jfine2358@gmail.com> wrote:

...

Hi Todd

Thank you for your contribution! I've got a couple of comments. The experts, I hope, will have more to say.

Thanks for your reply, Jonathan.

...

You wrote:

...
As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operator

...
There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons.

The key thing is, I think, the wish for a domain specific language. I find this to be a wholesome wish. But I'd rather create a broad solution, than something that works just for special cases. And if at all possible, implement domain specific languages without extending the syntax and semantics of the language.

This proposal isn't domain-specific. I think the fact that it would benefit projects as diverse as numpy and SQLAlchemy (as well as others such as sympy) demonstrates that. Boolean operators like the sort I am discussing have been a standard part of programming languages since forever. In fact, they are the basic operations on which modern microprocessors are built. The fact that Python, strictly speaking, doesn't have them is extremely unusual for a programming language. In many cases they aren't necessary in Python since Python's logical operators do the job well enough, but there are a set of highly diverse and highly prominent cases where those logical operators won't work. There are workarounds, but they are less than optimal for the reasons I describe, and the previous discussion I linked to goes into much more detail why these new operators are important.

There is the problem of short-circuiting evaluation, as in the 'and' and 'or' operators (and elsewhere in Python). This has to be a syntax and semantics feature. It can't be controlled by the objects.

Yes, sorry, I left that out. The consensus from the previous discussion is that this wouldn't be short-circuiting. I can imagine ways to support short-circuiting as well (such as a second set of dunder methods with one set overriding the other), but it isn't really relevant to my proposal. This proposal is assuming the semantics from the previous discussion. All I am trying to address here is how the operators would be spelled.

Steven D'Aprano

1:13 p.m.

On Fri, Aug 03, 2018 at 03:17:42PM -0400, Todd wrote:

...

I'm rather surprised at this claim. Can you give a survey of such overridable boolean operators which are available on modern microprocessors? What programming languages already have them? When you say "forever", are you going back to Fortran in the 1950s?

...

Can you list some of these diverse and highly prominent use-cases? I can think of two:

- elementwise boolean operators, such as in numpy;
- SQL-like DSL languages;

plus a third rather specialised and obscure use-case:

- implementing non-binary logical operators (e.g. for ternary or fuzzy logic).

...

There are certainly advantages to using binary operators over named functions, and a shortage of good, ASCII punctuation suitable for new operators. I don't think much of your names bOR etc.

I think that before adding more ad hoc binary operators, we ought to consider the possibility of custom operators. For example, Julia uses

    |> op <|

https://github.com/JuliaLang/julia/issues/16985

(which I think is ugly and excessively verbose); Swift allows code to define custom prefix, infix or postfix operators:

https://docs.swift.org/swift-book/LanguageGuide/AdvancedOperators.html
https://docs.swift.org/swift-book/ReferenceManual/Declarations.html#//apple_...
https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html#ID41...

Haskell is another language which supports custom infix operators:

https://csinaction.com/2015/03/31/custom-infix-operators-in-haskell/

Here's a spur-of-the-moment suggestion: allow ~op for named infix operators. So:

    a ~foo b

is *roughly* equivalent to:

    if hasattr(a, '__foo__'):
        return a.__foo__(b)
    elif hasattr(b, '__rfoo__'):
        return b.__rfoo__(a)
    else:
        raise TypeError

Although possibly we might choose another pseudo-namespace, to avoid custom operators clashing with dunders. Trunders perhaps? (Triple underscores?)

Under this scheme, your operators would become:

    ~or
    ~and
    ~xor

and call trunders ___or___ etc.

-- Steve
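The dispatch rule sketched above for `a ~foo b` can be emulated in today's Python with an ordinary function. This is only an illustration under stated assumptions: `infix`, `Prop`, and the `implies` operator name are hypothetical, and the lookup is done on the type (as the real binary operators do) rather than with `hasattr` on the instance.

```python
def infix(name, a, b):
    """Hypothetical stand-in for `a ~name b`: try __name__, then __rname__."""
    forward = getattr(type(a), f"__{name}__", None)
    if forward is not None:
        result = forward(a, b)
        if result is not NotImplemented:
            return result
    reflected = getattr(type(b), f"__r{name}__", None)
    if reflected is not None:
        result = reflected(b, a)
        if result is not NotImplemented:
            return result
    raise TypeError(f"unsupported operand types for ~{name}")


class Prop:
    """Toy truth value that opts in to a made-up `implies` operator."""

    def __init__(self, value):
        self.value = bool(value)

    def __bool__(self):
        return self.value

    def __implies__(self, other):
        # self implies other
        return Prop((not self.value) or bool(other))

    def __rimplies__(self, other):
        # other implies self; used when the left operand lacks __implies__
        return Prop((not bool(other)) or self.value)


print(infix("implies", Prop(True), Prop(False)).value)  # False
print(infix("implies", False, Prop(False)).value)       # True
```

The second call shows the reflected method being used because `bool` does not define `__implies__`; with two plain integers the helper raises `TypeError`, matching the final branch of the pseudo-code.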

Dan Sommers

2:04 p.m.

On Sat, 04 Aug 2018 23:13:34 +1000, Steven D'Aprano wrote:

...

Hold that thought.

Then again, why is it 2018 (or 5778?) and we're still stuck with ASCII? Doesn't Unicode define a metric boatload of mathematical symbols? If Python allows Unicode names,¹ why not Unicode operators?

¹ No, I'm not going to call them variables. :-)

...

Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/

...

And now mental gymnastics to jump from ~foo to ___foo___ or ___rfoo___. If it's too hard to tell = from == (see endless threads on this mailing list for proof), then it's also too hard to tell __xor__ from ___xor___.

If I want to say

    a ~foo b

then why can't I also say

    class A:
        def ~foo(self, b):
            pass # do something more useful here

Steven D'Aprano

5:23 p.m.

On Sat, Aug 04, 2018 at 02:04:01PM +0000, Dan Sommers wrote:

...

On Sat, 04 Aug 2018 23:13:34 +1000, Steven D'Aprano wrote:

...
There are certainly advantages to using binary operators over named functions, and a shortage of good, ASCII punctuation suitable for new operators.

Hold that thought.

Then again, why is it 2018 (or 5778?) and we're still stuck with ASCII? Doesn't Unicode define a metric boatload of mathematical symbols? If Python allows Unicode names,¹ why not Unicode operators?

Some social problems:

- allowing non-ASCII identifiers was controversial, and is still banned for the std lib;
- according to critics of PEP 505, even ASCII operators like ?. are virtually unreadable or unspeakably ugly and "Perlish".

If you think the uproar over PEP 572 was vicious, imagine what would happen if we introduced new operators like ∉ ∥ ∢ ∽ ⊎ etc instead. I'm not touching that hornet's nest with a twenty foot pole.

And some technical problems:

- keyboard support for entering the bulk of Unicode characters is non-existent or poor;
- without keyboard support, editor support for entering Unicode characters is at best clunky, requiring the memorization of obscure names, hex codes, or a GUI palette;
- and font support for the more exotic code points, including most mathematical operators, is generally rubbish.

It may be that these technical problems will *never* be solved. But let other languages, like Julia, blaze this trail. [...]

...

...
I think that before adding more ad hoc binary operators, we ought to consider the possibility of custom operators [...]

a ~foo b

Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/

Indeed. Technically, we don't need *any* operators at all, possibly aside from those that do argument short-circuiting. But for many purposes, we much prefer infix notation to prefix function notation. Which would you rather read and write?

    or(x, 1)
    x or 1

[...]

...

And now mental gymnastics to jump from ~foo to ___foo___ or ___rfoo___.

Just as we do "mental gymnastics" to jump from existing operators like + to __add__ or __radd__. If you don't like operator overloading *at all*, that ship has already sailed.

...

If it's too hard to tell = from == (see endless threads on this mailing list for proof) then it's also too hard to tell __xor__ from ___xor___.

*shrug* I don't think it is, but I'm open to alternative suggestions.

...

If I want to say

a ~foo b

then why can't I also say

class A:
    def ~foo(self, b):
        pass # do something more useful here

Infix operators delegate to a pair of methods. What would you call the second one? ~rfoo will clash with operator rfoo.

We already have a convention that operators delegate to dunder methods, and I see no reason to make changes to that convention. It's a *good* convention.

The smaller the number of changes needed for a proposal, the better its chances of being accepted. My suggestion requires:

- one new piece of syntax, ~op or equivalent, as a binary operator;
- (possibly) one slight extension to an existing naming convention;
- (possibly) one new byte-code;
- no new keywords, no new syntax for methods, no new built-in types, no changes to the execution model of the language, and no changes to the characters allowed in Python code.

If you want to make a counter-proposal that is more extensive, be my guest :-)

-- Steve

David Mertz

6:03 p.m.

On Sat, Aug 4, 2018, 1:24 PM Steven D'Aprano <steve@pearwood.info> wrote:

...

This is the essential problem. I write this as someone who has the vim conceal plugin configured to change my Python code into something with many of those funny characters Steven uses as examples. But while I like looking at those, although recognizing it's quirky, entering any of them is enormously cumbersome. I did it once to configure my substitution macros (which are purely visual... What I type in is just ASCII and that's what is saved in soak, but it appears on screen with some replacements).

...

Stephan Houben

6:15 p.m.

I use these Vim abbreviations, which are derived from LaTeX https://gist.github.com/stephanh42/fc466e62bfb022a890ff2c4643eaf3a5 Stephan Op za 4 aug. 2018 20:03 schreef David Mertz <mertz@gnosis.cx>:

...

David Mertz

6:23 p.m.

That's very nice Stephan. I think I'll add those to my machines. But not everyone uses vim. And although I use vim a lot, I also use other text editors, by choice or compulsion (e.g. editing code in web interface). And even when I have vim, I don't necessarily have the ability to install and test a .vimrc on the machines I use. There are probably analogous ways to add sequence bindings in other text editors. But until or unless such configurations are universal across operating systems and editors, this offers little help for hypothetical Unicode operators in Python. On Sat, Aug 4, 2018, 2:15 PM Stephan Houben <stephanh42@gmail.com> wrote:

...

Benedikt Werner

2:56 p.m.

...

I think it would make sense to instead use a new keyword to define operators. Maybe something like "defop". I don't think that's a very common variable or function name. Example syntax could be:

    class MyInt(int):
        defop combine(self, other):
            return self, other

    # and now we can use it
    x combine y
    # which is equivalent to
    x.combine(y)

Of course it would actually have to check if and where the operator is defined, as is currently done for overloading. Also it might be worth considering support for pre/postfix operators, maybe either with a different keyword (something like defpre/defprefix, don't really like those but something similar maybe) or a symbol as indicator (e.g. "defop <combine(..)" for prefix and "defop >combine(..)" for postfix).

Steven D'Aprano

4:37 p.m.

On Sat, Aug 04, 2018 at 04:56:56PM +0200, Benedikt Werner wrote:

...

Thanks :-)

Unfortunately there's a flaw, in that the ~ symbol already means unary bitwise-not, so we can't use ~op for operators. Throwing some ideas out to be shot down:

    spam !op eggs
    spam :op eggs
    spam @op eggs
    spam ..op eggs

...

Three underscores is 50% longer than two. I don't believe that it is harder to tell the difference between ___ and __ at a glance than it is to tell the difference between __ and _ at a glance, even for those with a mild visual impairment like mine. Unless you're reading and writing code using a proportional font, in which case I have no sympathy. Whatever naming convention we use, it should be a subset of dunders, different from all existing dunders, short enough to avoid being annoying to use, and make up an obviously distinct group. How about this? __o_name__ __o_rname__

...

New keywords should be a last resort, only for functionality which requires syntactic support. This doesn't. What will this "defop" keyword do that def doesn't already do? Probably nothing, the methods will be ordinary methods just like __add__ and other operator dunders. And even if we needed some sort of extra functionality, say, registering the operators, we could use a decorator for that.

...

That would mean giving up the ability to detect a whole lot of syntax errors at compile time. Remember that under Python's execution model, the compiler cannot tell in advance which custom operators have been defined and which have not. It has to resolve them at runtime. So the only way we could allow `x combine y` as valid syntax would be if we also allowed errors like `print len alist` as valid syntax. This is why I think that named operators should require a special prefix. Without the prefix, x combine y is a syntax error. But with the prefix, say, x :combine y, the compiler can use a custom byte-code to resolve the operator at runtime. (Of course it will still be a runtime error if neither x nor y define the combine operator.) All this supposes that there is sufficient benefit to allowing custom infix operators, including overridable or/and/xor, which is not yet shown. -- Steve
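The compile-time point above can be observed directly with the built-in `compile()`. This is a minimal sketch; the `compiles` helper is a hypothetical name, and the names `x`, `y`, and `combine` never need to exist because nothing is executed.

```python
def compiles(source):
    """Return True if `source` is accepted by the compiler as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(compiles("x.combine(y)"))  # True
print(compiles("x @ y"))         # True
print(compiles("x combine y"))   # False
```

The method call and the existing `@` operator are accepted without knowing anything about `x` or `y`, while the bare `x combine y` form is rejected before any code runs. That rejection is the early error detection that would be lost if arbitrary names were allowed in operator position.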

Chris Angelico

5:07 p.m.

On Sun, Aug 5, 2018 at 2:37 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...

Part of the justification for that is that the bitwise operators have different precedence to the logical operators. But custom operators would have to all be grouped at the same precedence level (or maybe a small handful of precedences, chosen by syntax), so that won't truly solve that problem. A valid justification would be: A single object needs to be able to perform both bitwise and logical operations, AND needs to customize the logical ops. I haven't seen any but they could exist. ChrisA

Todd

6:40 p.m.

On Sat, Aug 4, 2018 at 9:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...

Sorry I wasn't clear, I didn't mean overloadable boolean operators are standard, but rather boolean operators in general. I was trying to point out that there is nothing domain-specific about boolean operators.

...

Also symbolic mathematics like in sympy. That is three.

...

I am personally very strongly against custom operators. I just have visions of someone not liking how addition works for some particular class and deciding implementing a "＋" operator would be a great idea.

Chris Angelico

6:48 p.m.

On Sun, Aug 5, 2018 at 4:40 AM, Todd <toddrjen@gmail.com> wrote:

...

You say that Python doesn't have them. What aspect of boolean operators doesn't Python have?

...

Eww. (Before anyone jumps in and says "uhh you already have __add__", that is *not* U+002B PLUS SIGN, it is U+FF0B FULLWIDTH PLUS SIGN, which would indeed be a custom operator.)

But ultimately, there is already nothing stopping people from doing this:

    import sys

    def Ien(obj):
        """Return object size in machine words"""
        return sys.getsizeof(obj) // (sys.maxsize.bit_length() + 1)

and mixing and matching that with the built-in len function.

Give people freedom, and some will abuse it horrifically... but others will use it usefully and safely.

ChrisA

Todd

1:16 a.m.

On Sat, Aug 4, 2018 at 2:48 PM, Chris Angelico <rosuav@gmail.com> wrote:

...

Python's "and" and "or" don't return "True" or "False" per se, they return one of the inputs based on their respective truthiness. So although they are logical operators, they are not strictly boolean operators.
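The behaviour described above is easy to confirm in a few lines. The variable names are illustrative only.

```python
# `or` returns the first truthy operand, or the last operand if none is truthy.
fallback = [] or "default"

# `and` returns the first falsy operand, or the last operand if all are truthy.
stopped = "spam" and 0

print(fallback)        # default
print(stopped)         # 0
print(bool(fallback))  # True
print(bool(stopped))   # False
```

Only an explicit `bool()` call (or a context such as `if`) turns the returned operand into `True` or `False`.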

...

In your example, you are intentionally picking a character purely because it happens to look similar to a completely different character. That isn't the sort of thing that can happen innocently or by accident. By contrast, using a valid mathematical symbol for the corresponding mathematical operation is exactly the sort of thing allowing new operators is meant to support. The fact that this symbol happens to look similar in some fonts to the normal plus operator is something that may not even occur to the person who chose to use the operator. It would likely seem obvious to the developer at the time. So although we can't generally prevent people from being actively malicious, I think we should at least try to avoid making it overly easy to make code an unreadable mess. And allowing custom operators seems to me to make it way too easy to produce an unreadable mess.

Steven D'Aprano

2:28 a.m.

On Sat, Aug 04, 2018 at 09:16:35PM -0400, Todd wrote: [Chris said:]

...

According to Python's rules for truthiness, they are boolean operators. According to Python's rules, True and False aren't the only boolean values. They're merely the canonical true and false values, but otherwise unprivileged. So I don't think this is a difference that makes any real difference. You might as well complain that Python doesn't strictly have ints, because some other languages limit their ints to 32 or 64 bits, and Python doesn't.

But either way, this isn't a really important factor. If we add overridable "boolean operators" like bOR and bAND, the fact that they can be overridden means that they won't be limited to returning True and False either:

- numpy elementwise operators will return arrays;
- sympy will return symbolic expressions;
- ternary logic will return trits (say, true/false/maybe);

etc. So the question of Python truthiness is not really relevant. [...]

...

I see lots of newbies, and experienced coders who ought to know better, using variables like l and sometimes even O. Don't underestimate the power of laziness and thoughtlessness. On the other hand, such poor choices are easily fixed with a gentle or not-so-gentle application of the Clue Bat and a bit of minor refactoring. Changing variable names is easy. Likewise, if somebody chooses an ugly custom operator like O01l it isn't hard to refactor it to something more meaningful.

...

The term "strawman fallacy" gets misused a lot on the internet, mostly by people who use it as a short-hand for: Dammit, you've just found the flaw in my argument I didn't notice, so I'll try to distract attention by falsely accusing you of a fallacy. But your comments about symbols like ＋ truly are a strawman: Substituting a person’s actual position or argument with a distorted, exaggerated, or misrepresented version of the position of the argument. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/169/Strawma... I never proposed supporting arbitrary Unicode symbols like ＋ (full width plus sign), in fact the opposite, I explicitly ruled it out. In response to a question about supporting Unicode operators, I said "I'm not touching that hornet's nest with a twenty foot pole." and listed a number of social and technical reasons for not supporting Unicode operators. I said that the operators would have to be legal identifiers. So no, operators like ∉ ∥ ∢ ∽ ⊎ and ＋ are not an option under my proposal. -- Steve

Grégory Lielens

5:13 a.m.

You should have a look at old PEP 225 "Elementwise operators", which proposed ~ as a modifier for many existing operators, to indicate they do mostly what their "normal" counterparts do, but act on the inner elements of an object instead of on the whole. This was only a memorisation technique, as all tilde operators were defined individually and had an associated magic/dunder method (they also had the same precedence as the non-tilded version). It was mainly for ~* to be used as classic numpy multiply and * used as matrix multiply, but as an aside, ~and, ~or, ~not and ~xor were defined as elementwise logical operators, bitwise when applied on int-like objects. Also xor was proposed as a new short-circuiting classic operator, for orthogonality (the value returned when doing true_a xor true_b was not fixed in the PEP, I can not decide between False and None ;-)

Funny this comes back after all this time

Grégory Lielens

5:36 a.m.

Obviously I'm +1 on this, but a little bit less so than at the time of proposal, let's say +0.8...at PEP 225 time, @ matmul operator did not exist (it was the competing PEP 211, also to address matrix multiply, that proposed @...both were rejected at the time lol). But now that @ exists, there would be either redundancy or lack of orthogonality among the family of multiplication infix operators...

Steven D'Aprano

1:48 a.m.

On Sat, Aug 04, 2018 at 02:40:54PM -0400, Todd wrote:

...

Right -- and Python has such common boolean operators. It isn't clear that there's much need for xor, nand, nor, etc. (There are a grand total of 16 distinct boolean operators which take two operands, but few of them are useful except under very specialised circumstances.) [I asked:]

...

I don't think symbolic mathematics is "highly prominent" (your words). I would consider it in the same category as fuzzy logic: specialised and unusual. To my mind, this basically means there are two important use-cases: - numpy and elementwise boolean operators; - SQL-like queries; and a couple of more specialised uses.

...

I have visions of someone not liking how boolean operators `or` and `and` work for some particular class and deciding that overridable boolean operators would be a great idea. Under my proposal, you couldn't invent new symbolic operators like ＋. Operators would be limited to legal identifiers, so people can do no worse than they can already do for method names, e.g. ugly names like "bOR" or "bAND". Given this proposal, your overridable boolean operators are instantly available, and using the proper names "or" and "and". There's no ambiguity, because the custom operators will always require a prefix (I suggested ~ but that won't work, perhaps ! or @ will work). And the benefit is that you don't have to come back next year with another PEP to introduce bNAND and bNOR operators. -- Steve

Nicholas Chammas

8:02 p.m.

On Fri, Aug 3, 2018 at 1:47 PM Todd <toddrjen@gmail.com> wrote:

The operators would be:

...

These look pretty ugly to me. But that could just be a matter of familiarity.

For what it’s worth, the Apache Spark project offers a popular DataFrame API for querying tabular data, similar to Pandas. The project overloaded the bitwise operators &, |, and ~ since they could not override the boolean operators and, or, and not.

For example:

    non_python_rhode_islanders = (
        person
        .where(~person['is_python_programmer'])
        .where((person['state'] == 'RI') & (person['age'] > 18))
        .select('first_name', 'last_name')
    )
    non_python_rhode_islanders.show(20)

This did lead to confusion among users <https://issues.apache.org/jira/browse/SPARK-8568> since people (myself included) would initially try the boolean operators and wonder why they weren’t working. So the Spark devs added a warning <https://github.com/apache/spark/pull/6961/files> to catch when users were making this mistake. But now it seems quite OK to me to use &, |, and ~ in the context of Spark DataFrames, even though their use doesn’t match their designed meaning. It’s unfortunate, but I think the Spark devs made a practical choice that works well enough for their users.

PEP 335 would have addressed this issue by letting developers overload the common boolean operators directly, but from what I gather of Guido’s rejection <https://mail.python.org/pipermail/python-dev/2012-March/117510.html>, the biggest problem was that it would have had an undue performance impact on non-users of boolean operator overloading. (Not sure if I interpreted his email correctly.)

Chris Barker

9:38 p.m.

On Fri, Aug 3, 2018 at 1:02 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

...

The project overloaded the bitwise operators &, |, and ~ since they could not override the boolean operators and, or, and not.

...

I actually think that is a good solution to this problem -- the fact is that for most data types bitwise operators are useless -- and for even more not-very-useful.

numpy did not do this, because, as it happens, bitwise operators can be useful for numpy arrays of integers (though as I write this, bitwise operations really aren't that common -- maybe requiring a function call for them would be a good way to go -- too late now).

Also, in a common use-case, bitwise-and behaves the same as logical_and, e.g.

    if (arr > x) & (arr2 == y)

This "works" because both arrays being bitwise-anded are boolean arrays. So you really don't need to call:

    np.logical_and

and friends very often.

so -1 on yet another set of operators.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
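The claim that bitwise-and only matches logical "and" when both operands are boolean can be illustrated without numpy, using plain Python values. This is a sketch of the scalar case only; it says nothing about how any particular array library implements its operators.

```python
# For real booleans, & and `and` agree on every combination.
pairs = [(False, False), (False, True), (True, False), (True, True)]
agree = all((left & right) == (left and right) for left, right in pairs)
print(agree)  # True

# For other truthy values they can silently disagree.
print(1 & 2)          # 0
print(1 and 2)        # 2
print(bool(1 & 2))    # False
print(bool(1 and 2))  # True
```

Both `1` and `2` are truthy, so a logical "and" is true, but their bit patterns do not overlap, so the bitwise result is `0`.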

Todd

4:13 a.m.

On Fri, Aug 3, 2018 at 5:38 PM, Chris Barker <chris.barker@noaa.gov> wrote:

...

There are a few problems with using the bitwise operators. First, and most important in my opinion, is that the precedence is significantly off from that of the logical operators. As your own example shows, any non-trivial example requires a lot of parentheses to keep things working. And if you are switching back and forth between, say, array logical operations and "normal" logical operations it is easy to mess up. Second is that it can be restricted to only working on boolean-like data types. Combining how easy it is to get the precedence wrong with the fact that getting it wrong can silently fail is not a good combination in my opinion. Making sure the operator is actually doing what people want and expect it to do seems like a big benefit to me. Third is that it allows both boolean and bitwise operations to be carried out on the same data types. Numpy is a special case where the two basically are equivalent if you are working with boolean arrays. But that is a special case.
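The precedence difference described above can be made visible with the standard `ast` module, which shows how the parser groups an expression without evaluating it. The `top_level` helper is a hypothetical name used only for this illustration.

```python
import ast


def top_level(expr):
    """Return the class name of the outermost AST node of an expression."""
    return type(ast.parse(expr, mode="eval").body).__name__


# `and` binds more loosely than comparisons: two comparisons joined by `and`.
print(top_level("a == 1 and b > 2"))    # BoolOp

# `&` binds more tightly than comparisons: this is the chained comparison
# a == (1 & b) > 2, not a conjunction of two comparisons.
print(top_level("a == 1 & b > 2"))      # Compare

# Parentheses are required to get the intended grouping with `&`.
print(top_level("(a == 1) & (b > 2)"))  # BinOp
```

Swapping `and` for `&` without adding parentheses therefore changes the meaning of the expression rather than producing a syntax error.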

Chris Barker

6:11 p.m.

On Fri, Aug 3, 2018 at 9:13 PM, Todd <toddrjen@gmail.com> wrote:

...

...
Also, in a common use-case, bitwise-and behaves the same as logical_and, e.g.

if (arr > x) & (arr2 == y)

This "works" because both arrays being bitwise-anded are boolean arrays.

...

There are a few problems with using the bitwise operators.

First, and most important in my opinion, is that the precedence is significantly off from that of the logical operators.

yes, that's true, and perhaps too bad, but as they are spelled differently, not a killer.

...

if you are switching back and forth between, say, array logical operations and "normal" logical operations it is easy to mess up.

well, as you generally are working with arrays or not, again, not too bad.

...

Third is that it allows both boolean and bitwise operations to be carried out on the same data types. Numpy is a special case where the two basically are equivalent if you are working with boolean arrays. But that is a special case.

I kind of muddled my point -- the main thrust was that overloading the bitwise operators to do logical operations is a fine idea -- many objects will have no or limited use for bitwise operations. In fact, if I were to re-design the numpy API, I would overload the bitwise operators to do logic, and use the special functions for bitwise operations: np.bitwise_and etc. rather than having to use logical_and and friends the way we do now. So any new class that doesn't already make use of the bitwise operators can do that. (yes, still the precedence issue, but what can you do?) -CHB
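As a rough sketch of what "overload the bitwise operators to do logic" could look like for a new class: `BoolVector` below is a hypothetical toy container, not a numpy type and not code from this thread. It uses the existing `__and__`, `__or__` and `__invert__` hooks for elementwise logic.

```python
class BoolVector:
    """Toy elementwise boolean container (hypothetical example class)."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __and__(self, other):
        # Elementwise logical "and", spelled with the & operator.
        if not isinstance(other, BoolVector):
            return NotImplemented
        return BoolVector(a and b for a, b in zip(self.values, other.values))

    def __or__(self, other):
        # Elementwise logical "or", spelled with the | operator.
        if not isinstance(other, BoolVector):
            return NotImplemented
        return BoolVector(a or b for a, b in zip(self.values, other.values))

    def __invert__(self):
        # Elementwise logical "not", spelled with the ~ operator.
        return BoolVector(not a for a in self.values)

    def __repr__(self):
        return f"BoolVector({self.values})"


a = BoolVector([True, True, False])
b = BoolVector([True, False, False])
combined = (a & ~b) | b
print(combined)
```

The parentheses in the last expression are still needed whenever comparisons are mixed in, which is the precedence issue acknowledged above.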

Chris Barker

6:14 p.m.

On Mon, Aug 6, 2018 at 11:11 AM, Chris Barker <chris.barker@noaa.gov> wrote:

...

So any new class that doesn't already make use of the bitwise operators can do that.

just like set() -- which I think has been mentioned here already. -CHB
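For reference, this is how the built-in set type already reuses the bitwise operators for intersection, union and symmetric difference (the variable names are only illustrative):

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

both = evens & small       # intersection
either = evens | small     # union
only_one = evens ^ small   # symmetric difference

print(both, either, only_one)
```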

Benedikt Werner

10:05 p.m.

...

As I see it this proposal only proposes a different syntax and doesn't solve this problem. The only real solution for this would be a new set of operators but I agree with Chris that overriding the bitwise operators is good enough for most cases and a new set of operators really is a bit over the top just for this. I especially dislike using || and && as they are prominently used in other programming languages and this would be extremely confusing for newcomers from those languages. Also if the syntax isn't clear and concise I feel it doesn't really add any value as the main point of operator overloading is to make code easy to read and understand. This really only would be the case if we could overload the boolean operators. Otherwise I think using a function or overloading the bitwise ops is the best solution.

MRAB

1:17 a.m.

On 2018-08-03 23:05, Benedikt Werner wrote:

...

[snip] I've been re-reading PEP 335 and I think that the __and1__ method isn't needed. The __bool__ method is called anyway, and currently must return either False or True, but what if it could return the special value NeedOtherOperand mentioned in the PEP? The disadvantage would be that if the first operand is a bool, the operator could still short-circuit, and I'm not sure how much of an issue that would be.
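To illustrate the current restriction on `__bool__` mentioned here, a small sketch: the class name and the string stand-in are invented for the example, and `NeedOtherOperand` is not an existing Python object. Today, returning anything other than a bool from `__bool__` raises a TypeError as soon as `and` asks for the truth value.

```python
class Undecided:
    def __bool__(self):
        # Stand-in string; current Python requires a real bool here.
        return "NeedOtherOperand"


try:
    Undecided() and True   # "and" calls __bool__ on the left operand
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```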

Todd

4 a.m.

On Fri, Aug 3, 2018 at 6:05 PM, Benedikt Werner <1benediktwerner@gmail.com> wrote:

...

The proposal is for new operators. The operators would be "bNOT", "bAND", "bOR", and "bXOR". They would be completely independent of the existing "not", "and", and "or" operators, operating purely on boolean values. It would be possible to overload these operators.
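The proposed keywords do not exist, so the dispatch can only be emulated with a helper function. The sketch below is one guess at how `left bAND right` might behave, assuming it follows the same pattern as existing binary operators (try `__bAND__` on the left operand, then `__rbAND__` on the right) and assuming a plain-bool fallback. The names `b_and` and `Mask` are hypothetical and not part of the proposal.

```python
def b_and(left, right):
    """Emulate the proposed ``left bAND right`` (hypothetical semantics)."""
    method = getattr(type(left), "__bAND__", None)
    if method is not None:
        result = method(left, right)
        if result is not NotImplemented:
            return result
    reflected = getattr(type(right), "__rbAND__", None)
    if reflected is not None:
        result = reflected(right, left)
        if result is not NotImplemented:
            return result
    # Assumed default: both operands are always evaluated by the caller,
    # then reduced to plain booleans.
    return bool(left) and bool(right)


class Mask:
    """Toy class opting in to the hypothetical dunder method."""

    def __init__(self, bits):
        self.bits = [bool(b) for b in bits]

    def __bAND__(self, other):
        if not isinstance(other, Mask):
            return NotImplemented
        return Mask(a and b for a, b in zip(self.bits, other.bits))


elementwise = b_and(Mask([True, False]), Mask([True, True]))
plain = b_and(1, 0)
print(elementwise.bits, plain)
```

A function call cannot reproduce the proposed precedence; only the dunder lookup is sketched here.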

Benedikt Werner

5:31 a.m.

...

I guess having overloadable operators with proper precedences would be quite handy for fluent style APIs but I don't think it's worth justifying a new set of operators.

Neil Girdhar

9:44 p.m.

This doesn't work because the logical Boolean operators short circuit in Python. So you could not even define these operators for the regular Python types. Your two examples, numpy and SQLAlchemy, don't want this short-circuiting behavior, so you would never want to write anything like

(some_array or some_other_array)

The reader of this code might imagine that there is some short circuiting or conversion to Boolean.

The essential problem here is that you want a nicer way to create symbolic graphs with Boolean operators. But the general problem is that Python always has wrinkles when creating symbolic graphs. Besides numpy conditions and SQLAlchemy, sympy and tensorflow also build symbolic graphs. They also struggle with succinctness. You will never get the symbolic graph to look just like pseudocode the way pure Python does.

On Friday, August 3, 2018 at 1:48:02 PM UTC-4, Todd Jennings wrote:

...

Coming back to the previous discussion about a new set of overloadable boolean operators [1], I have an idea for overloadable boolean operators that I think might work. The idea would be to define four new operators that take two inputs and return a boolean result based on them. This behavior can be overridden in appropriate dunder methods. These operators would have similar precedence to existing logical operators. The operators would be:

bNOT - boolean "not"
bAND - boolean "and"
bOR - boolean "or"
bXOR - boolean "xor"

With corresponding dunder methods:

__bNOT__ and __rbNOT__ (or __r_bNOT__)
__bAND__ and __rbAND__ (or __r_bAND__)
__bOR__ and __rbOR__ (or __r_bOR__)
__bXOR__ and __rbXOR__ (or __r_bXOR__)

The basic idea is that the "b" is short for "boolean", and we change the rest of the operator to uppercase to avoid confusion with the existing operators. I think these operators would be preferable to the proposals so far (see [1] again) for a few reasons:

1. They are not easy to mistake with existing operators. They are clearly not similar to the existing bitwise operators like & or |, and although they are clearly related to the "not", "and", and "or" I think they are distinct enough that it should not be easy to confuse the two or accidentally use one in place of the other.

2. They are related to the operations they carry out, which is also an advantage over the existing bitwise operators.

3. The corresponding dunder methods (such as __bAND__ and __rbAND__) are obvious and not easily confused with anything else.

4. The unusual capitalization means they are not likely to be used much in existing Python code. It doesn't fall under any standard capitalization scheme I am aware of.

5. At least for English, the capitalization means they are not easy to confuse with existing words. For example, Band is a word, but it is not likely to be capitalized as bAND.

As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operators to define their own boolean operations (for example elementwise "and" in numpy arrays). This has a variety of problems, such as not having appropriate precedence, which makes precedence errors common, and the simple fact that this precludes them from using the bitwise operators for bitwise operations.

There was a proposal to allow overloading boolean operators in PEP 335 [2], but that PEP was rejected for a variety of very good reasons. I think none of those reasons (besides the conversation fizzling out) apply to my proposal.

So the alternative proposal that has been floating around is to instead define new operators specifically for this. Although there seemed to be some support for this in principle, the actual operators proposed so far have not met with much enthusiasm. The main operators proposed so far seem to be:

1. Double bitwise operators, such as && and ||. These have the disadvantage of looking like they should be a type of bitwise operator.

2. The existing operators, with some non-letter character at the front and back, like ".and.". These have the advantage that they are currently not valid syntax in most cases, but I think they are too similar to existing logical operators, too easy to confuse, and it is not immediately obvious in what way they should differ from existing operators. They also mean different things in other languages.

So I think my proposal addresses the main issues raised with existing proposals, but has the downside that it requires new keywords.

Thoughts?

[1] https://mail.python.org/pipermail/python-ideas/2015-November/037207.html
[2] https://www.python.org/dev/peps/pep-0335/

Steven D'Aprano

1:02 a.m.

On Mon, Aug 06, 2018 at 02:44:24PM -0700, Neil Girdhar wrote:

...

Todd is not proposing to add dunder methods for the existing "or" and "and" operators. Todd is proposing four new operators spelled "bAND", "bOR", "bXOR" and "bNOT", which aren't short-circuiting and call dunder methods, just like other operators including "in". You seem to be saying that "this" (defining new operators that call dunder methods) doesn't work because a set of *completely different* existing operators short-circuit. If that's not what you meant, I don't understand what you actually did mean.

...

Fortunately Todd isn't proposing that. -- Steve
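The distinction drawn here, between the short-circuiting `and` and an ordinary operator that always evaluates both operands, can be seen with a small sketch. The `record` helper is invented for the demonstration; `&` stands in for any non-short-circuiting operator.

```python
calls = []

def record(name, value):
    # Note which operand expressions actually get evaluated.
    calls.append(name)
    return value

calls.clear()
short = record("left", False) and record("right", True)
and_calls = list(calls)    # the right operand is never evaluated

calls.clear()
eager = record("left", False) & record("right", True)
amp_calls = list(calls)    # both operands are evaluated before & runs

print(and_calls, amp_calls)
```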

Neil Girdhar

1:14 a.m.

Oh, I see, I thought he wanted to override the original logical operators. I don't like adding more operators just to make symbolic equation generation simpler. I think keeping the language simple and using the "numpy.logical_and" function is better than making the language more complicated for a small fraction of users. There will always be a gap between Python and symbolic equation generation.

On Mon, Aug 6, 2018 at 9:03 PM Steven D'Aprano <steve@pearwood.info> wrote:

...

Grégory Lielens

4:38 a.m.

A small remark for Todd's proposal: I think you should treat the new not (bNOT in the original proposal) differently: it's not binary, so it should not have 2 dunders, the right one is not needed (or there is only the right one, in a way, but other unary ops use the classic dunder iirc...) Also, not having xor is made more painful by this proposal (or by any proposal for new Boolean operators using variants of and/or/not)... I have been bitten a few times writing xor in my code (not often, because xor is done less often); it already feels like it's missing from Python. With additional duplicated operators, including bXOR, the missing xor is annoying like a missing tooth: even if you don't use it so much, you think of it all the time ;-) Greg.
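Since Python has no `xor` keyword, a logical xor is usually spelled by comparing truth values. A minimal sketch, where the helper name `logical_xor` is just an illustration:

```python
def logical_xor(a, b):
    # True when exactly one of the two operands is truthy.
    return bool(a) != bool(b)


print(logical_xor(1, 0), logical_xor("x", [1]), logical_xor(0, ""))
```

Like `^` on bools, this always evaluates both operands, which is inherent to xor: neither operand alone decides the result.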

Michel Desmoulin

5:04 p.m.

Adding one operator is hard in Python. Adding 4 operators, just for the sake of a bit of syntactic sugar for DSL based projects is never going to fly. And I say that as a long time SQLA user.

On 03/08/2018 at 19:46, Todd wrote:

...

Jonathan Fine

6:26 p.m.

Hi Todd

Thank you for your contribution! I've got a couple of comments. The experts, I hope, will have more to say.

You wrote:

...

As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operator

...

There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons.

...

...
...
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

You may want to extend the syntax and semantics so that

...

...
...
S = (H @beta - r).T @ inv(H @ V @ H.T) @ (H @beta - r)

invokes double-under methods, whose name might be something like __at_beta__

I'm impressed by

...

https://en.wikipedia.org/wiki/Fluent_interface https://martinfowler.com/bliki/FluentInterface.html and encourage work on tools for creating such in Python.

...

EXP_1 or EXP_2

...
OR(lambda: EXP_1, lambda: EXP_2) do pretty much the same thing (except refactoring the expressions into the lambdas).

In fact, I think OR has to be

...

...
...
def OR(fn_1, fn_2):
    return fn_1() or fn_2()

I hope this helps you solve the underlying problems, and have a better time with Python. -- Jonathan
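A runnable version of the `OR` helper above, with a small check that wrapping the operands in lambdas keeps the short-circuit behaviour. The `expensive` function and the `evaluated` list are made up for the demonstration.

```python
def OR(fn_1, fn_2):
    # Each operand is a zero-argument callable, so the second one is
    # only evaluated if the first result is falsy.
    return fn_1() or fn_2()


evaluated = []

def expensive():
    evaluated.append("expensive")
    return "fallback"


first = OR(lambda: "first", expensive)   # expensive() is never called
after_first = list(evaluated)

second = OR(lambda: 0, expensive)        # falsy left side, so it runs
after_second = list(evaluated)

print(first, second, after_second)
```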

Todd

7:17 p.m.

On Fri, Aug 3, 2018 at 2:26 PM, Jonathan Fine <jfine2358@gmail.com> wrote:

...

Hi Todd

Thank you for your contribution! I've got a couple of comments. The experts, I hope, will have more to say.

Thanks for your reply, Jonathan.

...

You wrote:

...
As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operator

...
There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons.

The key thing is, I think, the wish for a domain specific language. I find this to be a wholesome wish. But I'd rather create a broad solution, than something that works just for special cases. And if at all possible, implement domain specific languages without extending the syntax and semantics of the language.

...

and 'or' operators (and elsewhere in Python). This has to be a syntax and semantics feature. It can't be controlled by the objects.

Steven D'Aprano

1:13 p.m.

On Fri, Aug 03, 2018 at 03:17:42PM -0400, Todd wrote:

...

Dan Sommers

2:04 p.m.

On Sat, 04 Aug 2018 23:13:34 +1000, Steven D'Aprano wrote:

...

Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/

...

Steven D'Aprano

5:23 p.m.

On Sat, Aug 04, 2018 at 02:04:01PM +0000, Dan Sommers wrote:

...

On Sat, 04 Aug 2018 23:13:34 +1000, Steven D'Aprano wrote:

...
There are certainly advantages to using binary operators over named functions, and a shortage of good, ASCII punctuation suitable for new operators.

Hold that thought.

Then again, why is it 2018 (or 5778?) and we're still stuck with ASCII? Doesn't Unicode define a metric boatload of mathematical symbols? If Python allows Unicode names,¹ why not Unicode operators?

...

...
I think that before adding more ad hoc binary operators, we ought to consider the possibility of custom operators [...]

a ~foo b

Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/

...

And now mental gymnastics to jump from ~foo to ___foo___ or ___rfoo___.

Just as we do "mental gymnastics" to jump from existing operators like + to __add__ or __radd__. If you don't like operator overloading *at all*, that ship has already sailed.
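For readers less familiar with the existing mapping from `+` to `__add__` and `__radd__`, here is a short sketch. The `Metres` class is a hypothetical example, not something from this thread.

```python
class Metres:
    """Toy numeric wrapper showing regular and reflected addition."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Handles  Metres + Metres  and  Metres + number.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        # Called for  number + Metres  after int.__add__ returns
        # NotImplemented for the unknown right-hand type.
        return self.__add__(other)


left_hand = Metres(3) + 2
right_hand = 2 + Metres(3)
print(left_hand.value, right_hand.value)
```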

...

If it's too hard to tell = from == (see endless threads on this mailing list for proof) then it's also too hard to tell __xor__ from ___xor___.

*shrug* I don't think it is, but I'm open to alternative suggestions.

...

If I want to say

a ~foo b

then why can't I also say

class A:
    def ~foo(self, b):
        pass # do something more useful here

David Mertz

6:03 p.m.

On Sat, Aug 4, 2018, 1:24 PM Steven D'Aprano <steve@pearwood.info> wrote:

...

Stephan Houben

6:15 p.m.

I use these Vim abbreviations, which are derived from LaTeX: https://gist.github.com/stephanh42/fc466e62bfb022a890ff2c4643eaf3a5

Stephan

On Sat, 4 Aug 2018 at 20:03, David Mertz <mertz@gnosis.cx> wrote:

...

David Mertz

6:23 p.m.

...

Benedikt Werner

2:56 p.m.

...

Steven D'Aprano

4:37 p.m.

On Sat, Aug 04, 2018 at 04:56:56PM +0200, Benedikt Werner wrote:

...

Chris Angelico

5:07 p.m.

On Sun, Aug 5, 2018 at 2:37 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...

Todd

6:40 p.m.

On Sat, Aug 4, 2018 at 9:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:

...

Also symbolic mathematics like in sympy. That is three.

...

Chris Angelico

6:48 p.m.

On Sun, Aug 5, 2018 at 4:40 AM, Todd <toddrjen@gmail.com> wrote:

...

You say that Python doesn't have them. What aspect of boolean operators doesn't Python have?

...

Todd

1:16 a.m.

On Sat, Aug 4, 2018 at 2:48 PM, Chris Angelico <rosuav@gmail.com> wrote:

...

Steven D'Aprano

2:28 a.m.

On Sat, Aug 04, 2018 at 09:16:35PM -0400, Todd wrote: [Chris said:]

...

Grégory Lielens

5:13 a.m.

Grégory Lielens

5:36 a.m.

Steven D'Aprano

1:48 a.m.

On Sat, Aug 04, 2018 at 02:40:54PM -0400, Todd wrote:

...

Nicholas Chammas

8:02 p.m.

On Fri, Aug 3, 2018 at 1:47 PM Todd <toddrjen@gmail.com> wrote:

The operators would be:

...

Chris Barker

9:38 p.m.

On Fri, Aug 3, 2018 at 1:02 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

...

The project overloaded the bitwise operators &, |, and ~ since they could not override the boolean operators and, or, and not.

...

I actually think that is a good solution to this problem -- the fact is



32 comments

14 participants

participants (14)

Benedikt Werner
Chris Angelico
Chris Barker
Dan Sommers
David Mertz
Grégory Lielens
Jonathan Fine
Michel Desmoulin
MRAB
Neil Girdhar
Nicholas Chammas
Stephan Houben
Steven D'Aprano
Todd
