[Python-ideos] Dedicated overloadable boolean operators

Hi,
After reading PEP 465 <https://www.python.org/dev/peps/pep-0465/> about the dedicated matrix multiplication operator, I started wondering whether the same solution could be applied to boolean operators as well. A lot of high-profile libraries currently have their own functions for boolean operators, like Numpy, Pandas or SQLAlchemy. They do this because the current boolean operators can't be overloaded. PEP 335 <https://www.python.org/dev/peps/pep-0335/> was created to solve this problem (and makes the problem clearer), but was rejected because it needed changes to the byte code for the boolean operators, which would make them slower.
Currently some of these libraries resort to the bitwise operators (at least Pandas), but those bind more tightly than the comparison operators, which means you have to parenthesise comparisons like this:
    (series1 == 2) & (series2 == 3)
That is why I propose to create new operators, just like for matrix multiplication, which can be used in libraries that need one. I'm not sure what the operators should look like, but my first guess would be &&, || and ! for and, or and not respectively. Is this an idea that sounds reasonable?
Jelte
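To make the precedence point concrete, here is a minimal sketch that only inspects how Python parses the two spellings; nothing is evaluated, so no Pandas is needed. The names series1 and series2 are just the placeholders from the message above.

```python
import ast

# Parse the unparenthesised form and look at how Python groups it.
node = ast.parse("series1 == 2 & series2 == 3", mode="eval").body

# The top-level node is one chained comparison with two "==" operators:
#   series1 == (2 & series2) == 3
is_chained = isinstance(node, ast.Compare) and len(node.ops) == 2
middle_is_bitand = (
    isinstance(node.comparators[0], ast.BinOp)
    and isinstance(node.comparators[0].op, ast.BitAnd)
)

# With explicit parentheses the top-level node is the "&" itself.
grouped = ast.parse("(series1 == 2) & (series2 == 3)", mode="eval").body
top_is_bitand = isinstance(grouped, ast.BinOp) and isinstance(grouped.op, ast.BitAnd)

print(is_chained, middle_is_bitand, top_is_bitand)
```

So without the parentheses the & is applied to 2 and series2 first, which is almost never what was intended.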

Could you provide some links to where this could be useful, and how code could be rewritten? I can see the desire for such a feature; I myself would have liked such an operator or keyword. If you get general approval on this list, you can then move on to write a PEP (as that's what's needed if you wish to add a new keyword and/or operator to the language). I'm +0 for now, and may change once you provide us with use cases in the wild.
-Emanuel
From: me@jeltef.nl Date: Mon, 23 Nov 2015 20:09:17 +0100 To: python-ideas@python.org Subject: [Python-ideas] [Python-ideos] Dedicated overloadable boolean operators
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

Some examples of current practice for SQLAlchemy (an SQL ORM) can be found here: http://docs.sqlalchemy.org/en/rel_1_0/orm/tutorial.html#common-filter-operat... A slightly adapted example is this:
    from sqlalchemy import or_, and_
    query.filter(or_(User.name == 'ed', and_(User.name == 'wendy', User.age > 20)))
With the new operators this could simply be rewritten to:
    query.filter(User.name == 'ed' || (User.name == 'wendy' && User.age > 20))
This is much clearer in my opinion.
Pandas overloads the bitwise and and or operators, which causes the small issue that it needs an extra pair of parentheses around expressions, see https://stackoverflow.com/questions/24775648/element-wise-logcial-or-in-pand... or http://stackoverflow.com/a/19581644/2570866 This means that this (which selects the rows that either are lower than three or equal to five):
    df[(df < 3) | (df == 5)]
Can be rewritten to this:
    df[df < 3 || df == 5]
This is clearly only a small advantage, but it also means that there is no need to override the bitwise or and and. That way they can be used for their original purpose.
For Numpy the case is again like with SQLAlchemy, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_and.html and https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_or.html
I hope this made the use for the overloadable logical and and or operators clear. The not operator might be a bit less useful and I'm not sure it's needed as much. Currently SQLAlchemy and Pandas overload the "~" (invert) operator and Numpy has a function again: https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_not.html
Lastly, for SQLAlchemy an in operator that does not return a boolean could also be useful. I can't think of use cases for the others though and I also can't directly think of an operator that would be as clear as the &&, || and ! operators.
As for the PEP, I have no problem writing one if this is accepted as a useful addition.
Also any suggestions and critiques are very welcome of course. On Monday, 23 November 2015 21:24:23 UTC+1, Emanuel Barry wrote:

On Tue, Nov 24, 2015 at 10:49 AM, Jelte Fennema <me@jeltef.nl> wrote:
I think it's reasonable, except for the potential confusion of having *three* "and" operators. The one with the word is never going to change - its semantics demand that it not be overridable. When should you use & and when &&? Judging by how @ has gone, I think the answer will be simple: "Always use &, unless the docs for some third-party library say to use &&", in which case I think it should be okay. ChrisA

I honestly think the added confusion makes it a non-starter. It's also confusing that in other languages that have && and ||, they are shortcut operators, but the proposed operators here won't be. And the real question isn't "when to use & vs. &&", it's "when to use 'and' vs. &&". On Mon, Nov 23, 2015 at 4:08 PM, Chris Angelico <rosuav@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

This confusion could quite simply be solved by just not implementing the operations on the standard types, even though they would be trivial to implement. This just leaves the possibility for library developers to do something useful with the operators, like with the new @ operator. On 24 November 2015 at 01:13, Guido van Rossum <guido@python.org> wrote:

I honestly think the added confusion makes it a non-starter.
Coming from my experience in the Numpy world, the fact that you get "rich comparisons" for most of what seem like Boolean operators, but not for and and or, is very confusing to newbies. Much of the time, you can use the bitwise operators, as you often have done a comparison first:
    (A < x) & (A > y)
But it's kind of a coincidence that it works, so it only makes things more confusing for newbies. Bitwise operators really are kind of obscure these days. Explaining that you use ".and." instead of "and" would be a much lighter lift than getting into the whole explanation for why we can't overload "and". -CHB
I think first we decide it is or isn't a good idea, and then decide how to spell it, but .and. and .or. kind of appeal to me.

On 24 November 2015 at 11:38, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
I think first we decide it is or isn't a good idea, and then decide how to spell it, but .and. and .or. kind of appeal to me.
I think it's reasonably clear that rich logical operators with appropriate precedence and non-short-circuiting behaviour would be a nice feature to have; the question is whether or not they can be introduced without making things even more confusing than they already are. Using placeholder syntax, let's consider the three operations:
    A rich_and B
    A rich_or B
    rich_not A
To explain this fully will require explaining how they differ from:
    A and B -> A if not bool(A) else B
    A or B -> A if bool(A) else B
    not A -> not bool(A)
and:
    A & B -> operator.and_(A, B)
    A | B -> operator.or_(A, B)
    ~A -> operator.invert(A)
Depending on a user's background, it will also potentially require explaining how they differ from these operations in C and other languages:
    A && B
    A || B
    !A
Casting any new forms specifically as "matrix" operators (like the new matmul operator Jelte referenced in the opening message of the thread) also runs into problems, since "@" is a genuinely distinct operation only applicable to matrices, while the goal here is instead to broadcast an existing operation over the array elements, which matrix objects are already able to do implicitly for most binary operators.
Something that *could* potentially be comprehensible is the idea of allowing "elementwise" logical operators, with a suitable syntactic spelling. There'd still be a slight niggle to explain why and/or/not have explicitly elementwise variants when other binary operations don't, but that would likely still be less confusing than explaining the use of the bitwise operators.
Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
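The difference between the two existing families can be seen in a small sketch. The Traced class is a hypothetical helper made up for this illustration: it reports which special method was dispatched, and is falsy so that the short-circuiting branch is visible.

```python
import operator

class Traced:
    """Hypothetical helper that reports which special method was called."""
    def __init__(self, name):
        self.name = name
    def __and__(self, other):
        return f"{self.name}.__and__"
    def __or__(self, other):
        return f"{self.name}.__or__"
    def __invert__(self):
        return f"{self.name}.__invert__"
    def __bool__(self):
        return False  # falsy, so "a and b" stops at a

a, b = Traced("a"), Traced("b")

# Bitwise operators dispatch to overridable special methods.
bitwise_and = a & b   # same as operator.and_(a, b)
bitwise_or = a | b    # same as operator.or_(a, b)
bitwise_not = ~a      # same as operator.invert(a)

# The boolean keywords only ever consult bool() and return an operand.
logical_and = a and b  # bool(a) is False, so the result is a itself
logical_or = a or b    # bool(a) is False, so the result is b
logical_not = not a    # always a plain bool
```

Only the first group can be customised by a library, which is why libraries reach for it even when the operation they mean is a logical one.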

On 24 November 2015 at 16:05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
While I'm still largely of the view that introducing additional operators would make things more confusing rather than less, I'm also convinced that if anything like this is going to be pursued without being incredibly confusing for beginners there needs to be a fairly concise answer to "What are these operators for?". Take the "bitwise operators", for example. The notion of a "bitwise operator" is conceptually dense for folks that have never worked with binary numbers before. Despite that, if someone asks "What do the & and | operators do in Python?" the semantics can still be conveyed relatively quickly using some truth table examples like:
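The truth-table example itself is not preserved in this copy of the message. As a stand-in (a sketch, not Nick's original example), a table for single bits can be generated like this:

```python
from itertools import product

# Collect one row per combination of single-bit inputs.
rows = []
print(" a  b | a&b  a|b")
for a, b in product((0, 1), repeat=2):
    rows.append((a, b, a & b, a | b))
    print(f" {a}  {b} |  {a & b}    {a | b}")
```

Applied to larger integers, & and | do the same thing independently for every bit position.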
Explaining "~" fully is a bit trickier (since you would need to explain why two's complement representations of binary numbers are useful), but it's possible to avoid that explanation by using the alternative arithmetic formulation for "~" given in https://wiki.python.org/moin/BitwiseOperators : "~x == -x - 1"
Matrix multiplication is another example of something that isn't particularly easy to explain to folks that aren't already familiar with the relevant domain, but also conveys clearly that you can ignore it if you're not working with matrices.
It's then also useful to remember that the answers to "What is this for?" and "How is this used?" for a language construct can diverge over time. The original "What is this for?" use cases are the ones that guide the design decisions towards concrete answers that define how the construct works, and provide the underlying rationale for the way the construct behaves. The "How is this used?" cases then arise later when folks say "Yes, those existing semantics are suitable for my current use case, so I can reuse the syntax". Some specific examples:
* "+" is used not only for addition, but also sequence concatenation.
* "&" is not only "bitwise and", but also set intersection.
* "/" is not only division, but also pathlib path joining.
* NumPy repurposes most of the binary operators (including the bitwise ones) as element-wise matrix operations.
* SQLAlchemy repurposes a number of them for SQL query operations.
* SymPy changes them from arithmetic operations to symbolic ones.
Those use cases don't change the answers to "What are these operators for?" from a language design perspective, they only change the answers to "How are these operators used?" from a practical perspective.
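The arithmetic formulation of "~" and a few of the standard-library reuses listed above can be checked directly. This is only an illustrative sketch; the values and path segments are arbitrary.

```python
from pathlib import PurePosixPath

# "~x == -x - 1" holds for Python ints, here checked on a small range.
inversion_holds = all(~x == -x - 1 for x in range(-8, 9))

concatenated = [1, 2] + [3]             # "+" as sequence concatenation
common = {1, 2, 3} & {2, 3, 4}          # "&" as set intersection
joined = PurePosixPath("usr") / "lib"   # "/" as path joining
```

In each case the operator's special method is simply given a meaning that suits the type, without changing what the operator was originally designed for.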
Getting back to the specific topic of this thread, this could actually make an interesting usability study for a language design theorist, by looking at the kinds of mistakes folks make trying to learn elementwise logic operations in NumPy, and then seeing whether the introduction of overridable elementwise logical operators reduces the learning curve. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

While I'm still largely of the view that introducing additional operators would make things more confusing rather than less,
Well, almost by definition, more stuff to understand is more confusing for beginners.
there needs to be a fairly concise answer to "What are these operators for?".
I don't think "they are for doing logical operations on each of the elements in a sequence, rather than the sequence as a whole", along with an example or two, is particularly challenging. In fact, much less so than the bitwise operators, or matrix multiplication, which require a bit of domain knowledge, as you say. But those aren't a big problem either: "if you don't know what it means, you probably don't need it"
But as for general element-wise operators: IIRC, this was discussed a lot back in the day -- and was driven by experience with e.g. Matlab, where the regular math operators do linear algebra by default, and there are alternative "element wise" operators. Numpy, on the other hand, does element-wise by default, so we wanted another set for linear algebra. However, we came to realize that the only one really needed was matrix multiply -- and thus the new @ operator.
This all worked because Numpy could overload the math operators to be element wise, and once rich comparisons were implemented, that covered almost everything. So all that's left is and-or. Add the fact that we can use the bitwise & and | in their place in most cases, and we've done fine so far.
All that being said -- two more operators for "rich and" and "rich or" would nicely complete the picture. I was just introducing my intro Python class to the magic methods last night -- there are a LOT of them! Two more is a pretty trivial addition of complexity. As long as we can find a way to spell them that is not too confusing or ugly -- I think it's a win-win.
Note: having worked with array-oriented languages/libraries for a long time, I'd like element-wise operators that worked with all the built-in types. But I suspect that Python is never going to go there. So we only need these two. -CHB

On 26 November 2015 at 09:58, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
Right, that's why I think "elementwise logical operators (which may potentially be useful for other things)" is an idea that has some hope of avoiding creating new barriers to learning:
* if "elementwise" doesn't mean anything to you, and no library you're using mentions them in its documentation, you can ignore them
* for folks that do know what it means, "elementwise" is evocative of the relevant semantics for at least the data analysis use case
But as for general element-wise operators:
I wasn't suggesting those - just element-wise logical operators. In terms of scoping the use case: bitwise and/or/not work fine for manipulating existing boolean masks, and converting a single matrix to a boolean mask involves applying an elementwise function, not elementwise logical operations. So elementwise logical operators would presumably be aimed at *data merging* problems - creating a combined matrix where some values are taken from matrix A and others from matrix B, based on the truthiness of those values. I'm not enough of a data analyst to know how common that problem is, or whether it might be better served by a higher level "replace_elements" operation that accepts a base array, a replacement array, and a boolean mask saying which values to replace. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
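To illustrate the two alternatives mentioned here, the sketch below uses plain lists instead of arrays. The replace_elements function is hypothetical: it is one possible reading of the higher-level operation described above, not an existing library API.

```python
def replace_elements(base, replacement, mask):
    """Hypothetical helper: take replacement[i] where mask[i] is true, else base[i]."""
    if not (len(base) == len(replacement) == len(mask)):
        raise ValueError("all three sequences must have the same length")
    return [r if m else b for b, r, m in zip(base, replacement, mask)]

a = [0, 5, 0, 7]
b = [1, 2, 3, 4]

# Elementwise "a or b": keep a's value where it is truthy, otherwise take b's.
elementwise_or = [x or y for x, y in zip(a, b)]

# The same merge spelled with an explicit mask of the positions where a is falsy.
mask = [not x for x in a]
merged = replace_elements(a, b, mask)
```

Both spellings give the same merged result; the mask version just makes the selection rule explicit.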

On Nov 25, 2015, at 21:44, Nick Coghlan <ncoghlan@gmail.com> wrote:
But "elementwise" isn't what people doing symbolic computation or most other uses of DSL/expression-tree libraries are doing. Even for ORMs and other query-based libraries like AppScript, where arguably it is what they're doing, they probably aren't thinking of it that way, and wouldn't recognize that it should mean something to them, much less that it's what they're looking for. So I think this is effectively less general/useless than a solution that just allows overloading boolean operators somehow, without adding a distinction between elementwise (and overloadable) and objectwise (and not).
When I use NumPy, sometimes I'm doing GPU-ish stream operations, which need things like compacting select, which aren't obviously expressible in boolean terms, so I end up looking for methods for everything rather than operators even when they might make sense. But otherwise, when I'm doing more typical NumPy stuff (or at least what I think is more typical, but I could easily be wrong), I look for elementwise operators all over the place, including abusing the bitwise operators when it makes sense, so I probably would use real boolean operators if it were more obvious/readable.

Andrew Barnert via Python-ideas wrote:
But "elementwise" isn't what people doing symbolic computation or most other uses of DSL/expression-tree libraries are doing.
Right. I think just describing them as "overloadable versions of the boolean operators" would be best. What they mean is up to the types concerned, just as with all the other operators. -- Greg

On 27 November 2015 at 06:54, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Except that *isn't* what we do with the other operators. "&" is the bitwise and operator, for example - that's why the special method is called "__and__". The fact you can use it for an elementwise bitwise and operation on NumPy arrays, or for set intersection, isn't part of the core design. If there isn't *at least one* specific motivating use case, then "it might be useful for something" isn't a good reason to add new syntax. However, looking again at PEP 335, I'm not sure I see any reason it needs to be noticeably slower in the standard case than the status quo (I'm not saying it would be *easy* to retain the speed, but the complexity would be in the eval loop implementation and the code generation process, not user code). We also have the richer benchmark suite these days to actually quantify the impact of checking for the new __and1__/__or1__ slots before falling back to __bool__, and tracing JITs would still be able to generate appropriate code for the fast path at runtime. So perhaps it might be worth dusting off that original idea and seeing what the impact is on the performance benchmarks? Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Nov 26, 2015, at 18:11, Nick Coghlan <ncoghlan@gmail.com> wrote:
Sure. But the SQLAlchemy case was one of the very first motivating uses the OP mentioned, not something that may come up in the future that we haven't imagined yet. Also, the distinguishing thing about this new magic method vs. the existing __and__ isn't that it's elementwise, but that it's boolean/logical rather than bitwise/arithmetic, even for NumPy users. So, calling it "elementwise and", or giving it a name that implies elementwise, will confuse anyone who hasn't read this whole thread. And finally, NumPy is one of the uses that doesn't require short circuiting, and the same is almost certainly true for other elementwise uses, and yet we seem to all be agreed that the new overload has to be short-circuitable.

On 2015-11-27 09:28, Andrew Barnert via Python-ideas wrote:
Do we? I don't. I agree that if we add a way to overload the existing and/or to support these new usages, then that has to be short-circuitable, because and/or currently are short-circuitable and we can't get rid of that. But to me one of the attractive aspects of this new proposal is that the new operators need not be short-circuitable, which would avoid the various contortions required in a scheme like PEP 335 and thus greatly simplify the overloading. In other words the whole point of these new operators would be to do and-like and/or or-like operations that definitely do want both of their arguments all the time (such as elementwise operations or combining abstract query objects like in these SQL cases). -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
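A tiny expression-tree sketch shows why query builders always want both operands. All class names here (Expr, Column, Compare, BoolOp) are made up for this illustration and are not SQLAlchemy's API; the existing & and | stand in for the proposed non-short-circuiting operators.

```python
class Expr:
    """Hypothetical base class for a tiny query-expression tree."""
    def __and__(self, other):
        return BoolOp("AND", self, other)
    def __or__(self, other):
        return BoolOp("OR", self, other)

class Column(Expr):
    def __init__(self, name):
        self.name = name
    def __eq__(self, value):
        return Compare(self.name, "=", value)
    def __gt__(self, value):
        return Compare(self.name, ">", value)

class Compare(Expr):
    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value
    def __str__(self):
        return f"{self.column} {self.op} {self.value!r}"

class BoolOp(Expr):
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def __str__(self):
        return f"({self.left} {self.op} {self.right})"

name, age = Column("name"), Column("age")

# Both operands end up in the tree, so nothing can be skipped.
query = (name == "ed") | ((name == "wendy") & (age > 20))
sql = str(query)

# The keyword "and" cannot be overridden: the left Compare object is truthy,
# so the expression just evaluates to the right operand and the left is lost.
lost = (name == "ed") and (age > 20)
```

The last line is the failure mode that makes the keyword operators unusable for this kind of library.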

Nick Coghlan writes:
Well, for the social science data analysis I do, that would be inappropriate. Variables from different sources are different variables; you wouldn't just "or" them into a single column of a data frame. You would want your data model to account for the fact that even if they purport to measure the same factor, they're actually different indicators. But for a completely different kind of data, images, that sounds a lot like (Duff's?) compositing operations. But those are a lot more flexible than just "and" and "or": color images are "fuzzy" logic, and so admit many more logical operations (eg, "clamped sum", "proportional combination", etc). I'm not sure how that would fit here, since Python has only a limited number of operator symbols, fewer than there are compositing operations IIRC. Steve

Regarding the short-circuiting `and` and `or`, I think there's a way we can have our overloading-cake and only eat the pieces we need too. Even though it might not be totally intuitive, it should be possible to make the overloaded method get a 0-argument function instead of a value. In that case, `a and b` would become `a.__land__(lambda: b)`. As far as performance goes, there probably would have to be some special casing that first checks if `a` has an overloaded `and`, and if not use the default behaviour. The default implementation:
    class object:
        def __land__(self, other_f):
            if not self:
                return self
            else:
                return other_f()
        def __lor__(self, other_f):
            if self:
                return self
            else:
                return other_f()
And for some python-expressions-to-AST node class:
    class BaseExpression:
        def __land__(self, other_f):
            return AndExpression(self, other_f())
        def __lor__(self, other_f):
            return OrExpression(self, other_f())
Again, the fact that the second argument to `__land__`/`__lor__` is a function might be a bit confusing, but that's probably going to be the only way to make short-circuiting work for an overloaded and/or without going the route of lazy evaluation. On Mon, Nov 23, 2015 at 04:13:42PM -0800, Guido van Rossum wrote:
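Since `__land__` is only a proposal, the idea can be tried out today with an ordinary function standing in for the operator. In this sketch, logical_and and the Sym class are hypothetical names invented for the illustration; logical_and plays the role the interpreter would play for `a and b`.

```python
def logical_and(a, b_thunk):
    """Hypothetical stand-in for "a and b" under the thunk proposal."""
    hook = getattr(type(a), "__land__", None)
    if hook is None:
        # No overload: ordinary short-circuiting behaviour.
        return b_thunk() if a else a
    return hook(a, b_thunk)

class Sym:
    """Hypothetical expression builder that overloads the logical and."""
    def __init__(self, text):
        self.text = text
    def __land__(self, other_f):
        # An expression builder always needs the right operand, so it calls the thunk.
        return Sym(f"({self.text} AND {other_f().text})")

calls = []
def right():
    calls.append("evaluated")
    return 42

short = logical_and(0, right)   # 0 is falsy: the thunk is never called
calls_after_short = list(calls)
full = logical_and(1, right)    # 1 is truthy: the thunk is called once
tree = logical_and(Sym("x"), lambda: Sym("y"))
```

Types without the hook keep today's short-circuiting, while a type that defines it decides for itself whether and when to evaluate the right-hand side.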

On Mon, Nov 23, 2015 at 4:08 PM, Chris Angelico <rosuav@gmail.com> wrote:
I think it's reasonable, except for the potential confusion of having *three* "and" operators.
I think using && and || would be an attractive nuisance for people switching from another programming language. Right now if I accidentally write && in Python or "and" in another language, I get an immediate syntax error. With this proposal, I get unexpected results. If this idea were to fly, a better name would be something that doesn't have that problem, e.g., .and. .or. .not. I don't want to bikeshed the exact syntax**, but I think it should be clear that with something like this: (1) no one is going to accidentally type them and (2) they are pretty clearly some variation of the standard and/or/not. **Lots of other possibilities that are syntax errors right now: @and, (and), etc. I like .and. because it's less visual clutter and it's easy to type. --- Bruce Check out my puzzle book and get it free here: http://J.mp/ingToConclusionsFree (available on iOS)

On 24 November 2015 at 01:26, Bruce Leban <bruce@leban.us> wrote:
I think a naming scheme like that is indeed a good way to solve the confusion issues, since it is also immediately clear that these would be a special version of the normal operators. Another advantage is that it could also be extended to the in operator, if that one is to be included. I'm not sure I like the dots version very much though, but like you said there are lots of syntax error options to choose from. On 24 November 2015 at 01:26, Bruce Leban <bruce@leban.us> wrote:

Why hasn't SQLAlchemy gone the route of NumPy with overloaded operators? Perhaps whatever reason it is would prevent using any new operators as well. With NumPy I make that mistake constantly: A == a & B == b rather than (A == a) & (B == b) I'd put that in the category of parentheses tax along with the print function and old style % string interpolation. Annoying, but it's inappropriate to use a gun to swat a fly. (In case my metaphor is unclear, creating a new operator is the gun -- risking collateral damage and all that) As Guido said, the real usability problem is that the ``and`` operator is a new Python programmer's first instinct. Adding yet another operator would make Python harder to learn and read. Even if you advertise a new operator, many libraries will be slow to change and we'll have 3 different techniques to teach. Let's weigh the benefits against the negative consequences. On Mon, Nov 23, 2015 at 8:12 PM Jelte Fennema <me@jeltef.nl> wrote:

On 24 November 2015 at 02:44, Michael Selik <mike@selik.org> wrote:
Why hasn't SQLAlchemy gone the route of NumPy with overloaded operators?
It seems I was wrong about that, they apparently do: http://stackoverflow.com/a/14185275/2570866
I'd put that in the category of parentheses tax along with the print function and old style % string interpolation.
This seems like a bit of a weird argument, since the parentheses for the print function are put there for a reason (see PEP 3105) and the old style % string interpolation will be replaced by the new format string literal.
Adding yet another operator would make Python harder to learn and read. Even if you advertise a new operator, many libraries will be slow to change and we'll have 3 different techniques to teach.
I don't think that much confusion will arise, since the normal way is to use the short `and` version. The new operator would only be used with libraries that explicitly tell you to use it. It would also do away with the confusion about why and cannot be overridden and why the precedence of the & operator is "wrong". On 24 November 2015 at 02:44, Michael Selik <mike@selik.org> wrote:

As I think of people's reactions to seeing & and | for the first time, the typical response is, "What do you mean by bitwise?" Not, "Why are you using bitwise operators for non-bitwise operations?" Interestingly, no one in this thread seems to have a problem with ``&`` and ``|`` for set intersection and union. The primary complaint is that NumPy users instinctively reach for ``and``/``or`` and then forget the operator precedence of ``&``/``|``. On Mon, Nov 23, 2015 at 9:09 PM Jelte Fennema <me@jeltef.nl> wrote:
Using pipe and ampersand looks readable. The use of bitwise operators for overloading and/or/not seems standard for many objects, in standard library and major projects. In fact, I use those operators as logical far more often than as bitwise. I'd bet a great number of NumPy users in the science community are completely unaware of their bitwise effects.
I'd put that in the category of parentheses tax along with the print
I picked ``print`` as an example to show that requiring parens is fine. I picked string interpolation as an example because the ``.format`` solution also accepted parens as necessary. The rationale for f-string syntax (as written in the PEP) does not complain about parens but instead mentions that they will be necessary inside the string in several circumstances, like lambdas.
Adding yet another operator would make Python harder to learn and read.
That precedence issue doesn't cause problems when teaching, in my experience. People accept readily that there are issues with operator precedence. On the other hand, they have trouble reading code that uses an older style (perhaps using ``&``) instead of the newer style that they learned (the hypothetical ``&&``). Even the switch between ``%`` interpolation and ``.format`` is still causing problems in large organizations.

On 24 November 2015 at 10:14, Michael Selik <mike@selik.org> wrote:
Using pipe and ampersand looks readable. The use of bitwise operators for
that people accept the wrong operator precedence readily, even though before you said you do it wrong yourself constantly. This is also why I think the set intersection and union are not a problem: the operator precedence is correct there.
Your point that the problem is mostly that the old code will still be using the ``&`` operator, which could confuse people, is true. But I also think that would eventually disappear, which would make Python better in the future.
Another thing is that it seems some people are worried about the form of the new operators. Some options (which don't use the C style &&, because that would indeed cause confusion): .and. *and* +and+ %and% @and@ ?and? ^and^ <and> {and} (and) [and] |and| :and: _and_ (could currently be a variable name) en (the Dutch version)
Some of these could also be used with just one special character, like @and, but I think the surrounded ones look more visually pleasing. These are just a couple of examples and some of them seem fine to me. I do think that it is important though to not focus on the form already. It seems better to first figure out if the new operators would not confuse the newcomers too much in whatever form they come. On 24 November 2015 at 10:14, Michael Selik <mike@selik.org> wrote:

On Tue, Nov 24, 2015 at 5:00 AM Jelte Fennema <me@jeltef.nl> wrote:
I make that mistake in an interactive environment and fix it moments later, so it's not a big thing for me. I also occasionally forget to put a colon at the end of my for-loops, etc. ;-)
What's the half-life of deprecated code? That stuff is like nuclear waste. Or more like a bacteria you're spot-treating with antibiotic. It keeps replicating while you apply ointment and might evolve a resistance. Ok, ok, I'm getting a little too colorful there.

agreed -- but unfortunately, numpy may be one place where people really do want bitwise operations sometimes -- and it's way too late now to re-define them anyway.
The primary complaint is that NumPy users instinctively reach for them, or will see them in code and think they ARE another way to spell and and or. It also doesn't allow element-wise short circuiting behavior. you get:
    In [2]: np.array([0,1,2,3]) & np.array([3,2,1,0])
    Out[2]: array([0, 0, 0, 0])
when you want:
    In [3]: [ a and b for a,b in zip([0,1,2,3], [3,2,1,0])]
    Out[3]: [0, 2, 1, 0]
oh, and this:
    In [4]: np.array([0,1,2,3]) | np.array([3,2,1,0])
    Out[4]: array([3, 3, 3, 3])
would surely confuse people that don't "get" what bitwise means, even though it results in the same thing in a boolean context:
    In [5]: (np.array([0,1,2,3]) | np.array([3,2,1,0])).astype(np.bool)
    Out[5]: array([ True, True, True, True], dtype=bool)
Honestly, I'm not sure element-wise short circuiting is desired, but it would be nice to be consistent about that. As for precedence, it's annoying, but at least in my experience, the very first test case fails, and then I remember to add the parens. As for the more and more operators issue -- for the most part, this would be used for special purposes -- most people wouldn't even notice they were there, and numpy users might well think it's a special numpy thing (same for SQLAlchemy folks, or...) Unless folks wanted to add support for element-wise and and or to the built-in sequence types. I don't think anyone has proposed that.
Your point that the problem is mostly that the old code will still be using the ``&`` operator, which could confuse people is true.
well, yes - but code using & would continue to work the same way -- and anyone confused would, in fact, have been mis-interpreting what & meant already :-) As for element-wise everything -- THAT discussion has happened multiple times in the past, and been rejected. Time to go dig into the archives -- not sure if there is a rejected PEP for it. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
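The bitwise-versus-logical difference from the NumPy session above can be reproduced without NumPy, using list comprehensions over the same values. This is a plain-Python sketch of the elementwise behaviour, not what NumPy does internally.

```python
left = [0, 1, 2, 3]
right = [3, 2, 1, 0]

# Elementwise bitwise operations, as & and | give on integer arrays.
bitwise_and = [a & b for a, b in zip(left, right)]
bitwise_or = [a | b for a, b in zip(left, right)]

# Elementwise logical operations, as a "rich and"/"rich or" might give.
logical_and = [a and b for a, b in zip(left, right)]
logical_or = [a or b for a, b in zip(left, right)]

# For "and" the two families disagree even after conversion to booleans.
and_truth_differs = (
    [bool(x) for x in bitwise_and] != [bool(x) for x in logical_and]
)
```

Here bitwise_and is all zeros while logical_and is [0, 2, 1, 0], so the bitwise spelling only matches the logical one when the operands are already booleans.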

On Mon, Nov 23, 2015 at 5:44 PM, Michael Selik <mike@selik.org> wrote:
It's not just the operator precedence that's an issue here -- this also means something different than what people expect. As it happens a bitwise and works "just like" and when the values are boolean (like above), but that isn't always the case -- confusion waiting to happen. -CHB

(Top-posting because this is really a reply to a combination of three earlier posts, not to Greg Ewing's post, except at the very end.)

-1 on "&&" and "||". To anyone familiar with C and friends, it seems like they ought to be like C (short-circuiting, and more generally the same thing we already spell "and" and "or"); to anyone else, it would make no sense for them to have different precedence than "&" and "|".

-0.5 on ".and." and ".or." for multiple reasons, most of which apply just as well to anything similar:

* That looks like a general custom-infix syntax. Anyone coming to Python (including Python 2 users) will expect that they can just as easily define ".spam." (and then be disappointed that they can't...), or will be worried that 3.7 will add a bunch of new dot operators they'll have to learn. Not a _huge_ negative, but already enough to turn me off the idea.
* What would you call the dunder methods (and operator module functions), and how would you deal with the fact that any novice/transplant is going to assume "__and__" means ".and." rather than "&"?
* Operators starting with "." are ambiguous with float literals and/or attribute access ("spam.and.eggs" looks like a member of a member of spam), at least to humans, if not to the parser.
* Operators starting and/or ending with "." are hard to talk about because of natural-language punctuation (and even harder on a mobile keyboard, or an overly clever text editor or word processor).

-0 on actually adding a general custom-infix syntax, come to think of it. Any solution to the problems above should work just as well here. And it means that in the future, libraries don't have to cram things into inappropriate symbols or be stuck with prefix or dot-method notation. And I don't think it would be any harder to learn than a few special cases. And I think ".between." would be just as useful in an ORM as ".in.".
More generally, the whole point of "@" was that everyone agreed that it was the only new operator anyone would need (except maybe "@@") for a decade or two; if that's not true, infinity is a better number than 4. +1 on the idea if someone can come up with a good spelling that avoids all the above problems and reads as naturally as "@".
Is that last one supposed to be "niet"? When I try "nlet", Google assumes it's a typo for "net", which I guess could be a unary boolean identity operator, but I don't think we need that. :) Anyway, that pattern is a bit hard to extend to an overloadable "in" operator, because Dutch for "in" is "in".

On 24 November 2015 at 06:35, Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
This seems like an interesting idea, but I think it would be hard finding a notation that does not conflict with current operations. For the and/or almost any symbol would work, since they are keywords and the operators have no meaning on them, but if any name can be used operations on variables will become unclear. So another symbol would probably have to be used that isn't used yet, like the $ symbol. Also, I think there are probably some other issues that I haven't thought of, since this would add a pretty big language feature. PS. I didn't include the dollar convention in my last options list for the new and operator, but it is of course also a possibility: $and$, or maybe $and.

On Nov 24, 2015, at 02:10, Jelte Fennema <me@jeltef.nl> wrote:
The obvious notation is "a `spam` b", as used in Haskell and various other languages. It doesn't conflict with current operations. It doesn't look like a pair of operators, but like a matched bracketing or quoting, which is exactly what we'd want. It has just about the right screen density (. can be easy to miss, while $ is so heavy that it makes the "in" in $in$ harder to read). It doesn't conflict with a different meaning elsewhere, like $(in), which can look like part of string template to a human reader. Also, experience with those other languages shows that it works. In particular, in Haskell, I could always just define any new string of symbols to call spam on its operands, and with any precedence and associativity I want--but it's often much more readable to just use `spam` instead. Its biggest problem is that Guido hates backticks, and is on record as promising that they will never be reused in Python now that they no longer mean repr. Since I don't think he'd accept the idea anyway (IIRC, last time it came up, he said it wasn't even worth writing a PEP to categorically dismiss), that makes this the perfect syntax for it. :) More seriously, my point that infinity is a better number than 4 is more an argument against adding custom and, or, not , and in operators than an argument for adding custom arbitrary operators, which means the spelling isn't that important anyway. I don't think you're going to do much better than `and`, and I don't think that's good enough for Python.
For the and/or almost any symbol would work, since they are keywords and the operators have no meaning on them,
No, because code has to be readable and parseable by humans, not just by compilers. Even if the compiler can tell that "spam.and.eggs" or "2.in.e" aren't using dots for member access or float purposes, that isn't clear to a human reader until you look at it carefully. (Of course you can always use `and` or some other syntax that wouldn't be confusing even if and weren't a keyword, but at that point you're back to "no better than custom infixes".) Also, bear in mind that what you're competing with is a.spam(b), so anything that looks too heavyweight or too weird, like a $(spam) b, isn't going to be much improvement except in very long expressions where closing the parens gets troublesome (which often aren't going to be readable anyway, and maybe it's not so terrible to encourage people to break them up and name subparts).
but if any name can be used operations on variables will become unclear. So another symbol would probably have to be used that isn't used yet, like the $ symbol. Also, I think there are probably some other issues that I haven't thought of, since this would add a pretty big language feature.
Actually, last time I looked into it (around 3.4), it didn't have any other big issues or complex interactions with other features (and I don't think static typing, await, or any other changes will affect that, but I'd have to look more carefully). And it's pretty simple to implement, and pretty easy to describe. If you're interested, I may have written up a blog post, and if I didn't, I could write one now. (I think I also have a hack that implements it with a retokenizing import hook, if you want to play with it.)

While we're at it, every time custom infix operators come up, someone points out that you can already fake it pretty well. For example, "a &And& b" just needs a global named "And" whose __rand__(self, a) returns an object whose __and__(self, b) does what you want. And this allows you to use whatever precedence you want by picking the surrounding operators (much like Swift's custom infix operators get their precedence from the first symbol in the name), and lets you implement operator sectioning (you can make __rand__(self, a).__call__ the same as its __and__, and then you can pass around "a &And" as a partial function), and so on. There are a few glitches, but it works well with the kind of code you'd write with an ORM.

And again, this is just as true for a proposal to add four custom operators as for a proposal to add a general feature. So, you have to think about why SQLAlchemy, numpy, etc. have decided not to use this trick so far, when it's been well known since around the days of Python 2.4, and why they'd be better off with custom operators that don't look that different from the fake ones.
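The "a &And& b" trick described above can be sketched in a few lines. This is an illustration under stated assumptions, not code from any library: the class names `Infix` and `_Bound` and the element-wise `And` function are hypothetical.

```python
class Infix:
    """Fake infix operator: `a &op& b` ends up calling func(a, b)."""

    def __init__(self, func):
        self.func = func

    def __rand__(self, left):
        # Handles `left & op`. Only reached when left's own __and__ is
        # missing or returns NotImplemented for this operand.
        return _Bound(self.func, left)


class _Bound:
    """Result of `a &op`: remembers the left operand."""

    def __init__(self, func, left):
        self.func = func
        self.left = left

    def __and__(self, right):
        # Handles `(a &op) & b`.
        return self.func(self.left, right)

    # Operator sectioning: `(a &op)(b)` behaves like `a &op& b`.
    __call__ = __and__


# Hypothetical element-wise logical "and" over two sequences.
And = Infix(lambda a, b: [bool(x and y) for x, y in zip(a, b)])

result = [1, 0, 2] &And& [1, 1, 0]
print(result)  # [True, False, False]

section = [1, 0, 2] &And
print(section([0, 0, 5]))  # [False, False, True]
```

One of the glitches is visible in the comment on `__rand__`: a left operand whose own `__and__` accepts arbitrary objects never gives `Infix.__rand__` a chance to run.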

On 2015-11-24, Chris Angelico <rosuav@gmail.com> wrote:
How? Are you referring to the short-circuit? C# allows overloading the short-circuit operators by doing it in two parts - in pseudocode:

    a and b: a if a.__false__() else a & b
    a or b: a if a.__true__() else a | b

There's no fundamental reason short-circuiting "demands" that it not be overridable, just because C++ can't do it.
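Python does not look up `__false__` or `__true__` itself, so the two-part scheme can only be emulated with explicit helper functions. The sketch below assumes a hypothetical `Mask` class and passes the right operand as a callable to preserve laziness; none of these names come from the thread or from a real library.

```python
class Mask:
    """Hypothetical element-wise boolean container (not a real library type)."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __false__(self):
        # Short-circuit `and` only when no element could become True.
        return not any(self.values)

    def __true__(self):
        # Short-circuit `or` only when every element is already True.
        return all(self.values)

    def __and__(self, other):
        return Mask(a and b for a, b in zip(self.values, other.values))

    def __or__(self, other):
        return Mask(a or b for a, b in zip(self.values, other.values))


def and_(a, make_b):
    """Emulates `a and b` as: a if a.__false__() else a & b."""
    return a if a.__false__() else a & make_b()


def or_(a, make_b):
    """Emulates `a or b` as: a if a.__true__() else a | b."""
    return a if a.__true__() else a | make_b()


calls = []

def right_operand():
    calls.append("evaluated")
    return Mask([True, True])

mixed = and_(Mask([True, False]), right_operand)       # evaluates the right side
all_false = and_(Mask([False, False]), right_operand)  # short-circuits
print(mixed.values, all_false.values, len(calls))
# [True, False] [False, False] 1
```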

On 2015-11-23 22:34, Random832 wrote:
The problem is that this kind of overriding doesn't handle the main use case, which is elementwise and-ing/or-ing. If a is some numpy-like array of [1, 1], then it may be boolean true, but you still want "a magic_or b" to do the elementwise operation, not just return "a" by itself. (I say "numpy-like" because a real numpy array will fail here precisely because its boolean truth value is undefined, so it raises an error.) The problem is that, with the existing and/or, the issue of the boolean truth values of the operands is entangled with the actual and/or operation. For an elementwise operation, you *always* want to do it elementwise, regardless of whether the operands "count" as boolean false in some other situation. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

Brendan Barnwell wrote:
Maybe we can hook into __bool__ somehow, though? Suppose 'a and b' were treated as:

    try:
        result = bool(a)
    except IDoNotShortCircuit:
        result = a.__logical_and__(b)
    else:
        result = b if result else a

Since a __bool__ call is required anyway, this shouldn't slow down the case where there is no overriding. -- Greg
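The idea of signalling "do not short-circuit" from `__bool__` can be emulated today with a helper function. This is a sketch only: `logical_and`, `Elementwise` and the `__logical_and__` method are hypothetical names, and the right operand is passed as a callable so that it stays unevaluated when the left operand is falsy.

```python
class IDoNotShortCircuit(Exception):
    """Raised by __bool__ to request the overloaded, non-short-circuit path."""


def logical_and(a, make_b):
    """Emulates the proposed treatment of `a and b`."""
    try:
        truth = bool(a)
    except IDoNotShortCircuit:
        return a.__logical_and__(make_b())
    # Ordinary objects keep today's semantics: a if falsy, else b.
    return make_b() if truth else a


class Elementwise:
    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        raise IDoNotShortCircuit

    def __logical_and__(self, other):
        return Elementwise(bool(x and y) for x, y in zip(self.values, other.values))


print(logical_and(0, lambda: 1 / 0))  # 0 (right side never evaluated)
print(logical_and(2, lambda: 5))      # 5
both = logical_and(Elementwise([1, 0, 1]), lambda: Elementwise([1, 1, 0]))
print(both.values)                    # [True, False, False]
```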

MRAB wrote:
That's true. I guess the only solution that really works properly is to have a second set of operators, or some way of flagging that you're using non-short-circuiting semantics. There's one language I've seen -- I think it was Eiffel, or maybe Ada -- that had two sets of boolean operators. But it was kind of the other way around: 'and' and 'or' were non-short-circuiting, and to get short-circuiting you had to say 'and then' or 'or else'. -- Greg

On November 24, 2015 4:45:45 PM CST, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That was a lot of languages:

- Ada: and then, or else
- Algol: ANDTHEN, ORELSE
- Erlang: andalso, orelse
- Extended Pascal: and_then, or_else
- GNU Pascal: and then, or else
- Oz: andthen, orelse
- Standard ML: andalso, orelse
- Visual Basic: AndAlso, OrElse

Interestingly, Visual Basic allows you to overload non-short-circuiting And. To overload AndAlso, you need to overload And and IsFalse. It seems that AndAlso was roughly equivalent to And + IsFalse:

    A AndAlso B = If Not (IsFalse A) Then A And B Else A Endif

On Tue, Nov 24, 2015 at 5:34 PM, Random832 <random832@fastmail.com> wrote:
The Python semantics are defined more tightly than that, though - the "else" clause in each case would simply be "b". Changing that is not something you can do with operator overloading, so it would mean a fundamental change in the operator's semantics. And once you do that, you end up with a completely different operator. So, yes, the semantics of Python's short-circuiting 'and' and 'or' operators precludes any form of overriding. Yes, it's possible to have overridable short-circuiting operators, but Python's ones are not those. (Also, I think I prefer the simpler semantics. But that's a matter of personal choice.) ChrisA

On 2015-11-24, Chris Angelico wrote:
I don't see that as "fundamental". Certainly it can't _actually_ be a & b, but it could certainly be "a __foo__ b" where the default implementation of the __foo__ method (on object) simply returns b. If that's "fundamental" then adding overloading to _any_ operator that didn't support it in Python 1.0 violates those operators' "fundamental" behavior of raising a TypeError when applied to types they did not support in Python 1.0. This would also allow b.__rfoo__, to address the objection Nathaniel Smith raised, of the right-hand-side not being able to override. It can't override "does it short-circuit", since that'd defeat the point of short-circuiting, but it can override the value in the case where it doesn't, and a well-behaved 'elementwise-"and"able' type would work either way, only short-circuiting if every element of the left side is false (or true for "or").

On Nov 23, 2015, at 22:34, Random832 <random832@fastmail.com> wrote:
Actually, it _is_ possible in C++. You can pretty much do anything in C++, as long as you're an expert, you don't care about anyone reading your code, and you don't mind 1970s-style build times, and this is no exception. You just write an expression template library that uses non-short-circuiting-at-compile-time operators to generate functions that are short-circuiting at runtime, and then add the boilerplate to wrap any constants or normal variables up into your expression template types, and you're set.

A simpler solution is to use lambda lifting, like Swift: "a and b" just means "a.__and__(lambda: b)".

You could also split the operator into two method calls, without combining with bool conversion a la C#. So "a and b" becomes "tmp = a.__and1__()", then "tmp.__and2__(b) if tmp else tmp". That way, "and" can return a non-boolean value, just as it does when not overloaded. I can't think of any language that does this off the top of my head, but I'm sure they exist.

Or, in languages with macros or equivalent, you just use a macro instead of a function. Or, in languages with lazy evaluation by default, it's even simpler--if you don't use the result of "b" anywhere, it never gets evaluated.
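The lambda-lifting variant can be sketched in Python by passing the right operand as a zero-argument callable. Since `__and__` already means `&`, the sketch uses a hypothetical method name, `lazy_and`, on a hypothetical `Flag` class.

```python
class Flag:
    """Hypothetical type whose "and" receives the right operand as a thunk."""

    def __init__(self, value):
        self.value = value

    def lazy_and(self, make_rhs):
        # Stand-in for the suggested a.__and__(lambda: b): the receiver
        # decides whether the right-hand side is evaluated at all.
        return make_rhs() if self.value else self


evaluated = []

def rhs():
    evaluated.append(True)
    return Flag("right")

kept = Flag(0).lazy_and(rhs)   # falsy receiver: rhs is never called
taken = Flag(1).lazy_and(rhs)  # truthy receiver: rhs is called once

print(kept.value, taken.value, len(evaluated))  # 0 right 1
```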

On Nov 23, 2015 22:34, "Random832" <random832@fastmail.com> wrote:
No, I disagree -- for us short circuiting makes overriding extremely difficult, probably impossible, because in python in general and in the use cases being discussed here in particular, the right-hand side argument should have a chance to overload. It's going to be brutal on newbies and general code comprehensibility if True and array_of_bools or True and sqlalchemy_expression act totally differently than the mirrored versions, but this is unavoidable if you have both short circuiting and overloading.

(Example of where you'd see code like this:

    if restrict_user_id is not None:
        user_id_constraint = (MyClass.user_id == restrict_user_id)
    else:
        user_id_constraint = True
    if restrict_month is not None:
        month_constraint = (MyClass.month == restrict_month)
    else:
        month_constraint = True
    ...
    return query(MyClass).filter(user_id_constraint and month_constraint and ...)

Yes, this could be written differently, but this pattern comes up fairly often IME. And it's otherwise very reliable; in numpy in particular 'True op array' and 'array(True) op array' act identically for every overloaded binary operator.) -n
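The asymmetry is already observable with today's `and`: the left operand alone decides what happens, and the right operand is never consulted. A minimal sketch, assuming a hypothetical `Constraint` class that refuses truth testing:

```python
class Constraint:
    """Hypothetical query-expression object that refuses truth testing."""

    def __init__(self, text):
        self.text = text

    def __bool__(self):
        raise TypeError("truth value of a Constraint is ambiguous")


c = Constraint("user_id = 7")

# A literal on the left decides everything; c is never asked.
left_true = True and c    # evaluates to c itself
left_false = False and c  # evaluates to False, silently dropping c

# The mirrored form asks c for its truth value first, and fails.
try:
    c and True
    mirrored_error = None
except TypeError as exc:
    mirrored_error = str(exc)

print(left_true is c, left_false, mirrored_error)
# True False truth value of a Constraint is ambiguous
```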

To everyone claiming that you can't overload and/or because they are shortcut operators, please re-read PEP 335. It provides a clean solution -- it was rejected because it adds an extra byte code to all code using those operators (the majority of which don't need it). -- --Guido van Rossum (python.org/~guido)

On Nov 24, 2015 10:21 AM, "Guido van Rossum" <guido@python.org> wrote:
The semantic objection that I raised -- short circuiting means that you can't correctly overload 'True and numpy_array', because unlike all other binops the overload must be defined on the left hand argument -- does apply to PEP 335 AFAICT. This problem is IMHO serious enough that even if PEP 335 were accepted today I'm not entirely sure that numpy would actually implement the overloads due to the headaches it would cause for teaching and code review -- we'd have to have some debate about it at least. (Possibly useful analogy: Having to always double check that the array argument appears on the left rather than the right is kinda like how the old 'a and b or c' idiom forces you to constantly keep an extra special rule in the back of your head and always double check that b cannot be falsey whenever you read or write it, which IIUC was one of the major reasons why it was considered an insufficient substitute for a real ternary operator.) -n

On Nov 24, 2015 11:53 AM, "Guido van Rossum" <guido@python.org> wrote:
We'd certainly love it if there were a better alternative, but -- speaking just for myself here -- I've hesitated to wade in because I don't have any brilliant ideas to contribute :-). The right-hand-side overload problem seems like an inevitable consequence of short-circuiting, and we certainly aren't going to switch 'and'/'or' to become eagerly evaluating. OTOH none of the alternative proposals mooted so far have struck me as very compelling or pythonic, if only because all the proposed spellings are ugly, but, who knows, sometimes something awesome appears deep in these threads.

Two thoughts on places where it might be easier to make some progress...

- the 'not' overloading proposed in PEP 335 doesn't seem to create any horrible problems - it'd be a minor thing, but maybe it's worth pulling out as a standalone change?

- the worst code expansion created by lack of overloading isn't

      a == 1 and b == 2

  becoming

      (a == 1) & (b == 2)

  but rather

      0 < complex expression < 1

  becoming

      tmp = complex expression
      (0 < tmp) & (tmp < 1)

  That is, the implicit 'and' inside chained comparisons is the biggest pain point. And this issue is orthogonal to the proposals that involve adding new operators, b/c they only help with explicit 'and'/'or', not implicit 'and'. Which is why I was sounding you out about making chained comparisons eagerly evaluated at the bar at pycon this year ;-). I have the suspicion that the short-circuiting semantics of chained comparisons are more surprising and confusing than they are useful and we should just make them eagerly evaluated, and I know you have the opposite intuition, so I think the next step here would be to collect some data (run a survey, scan some code, ...?) to figure out which of us is right :-).

The latter idea in particular has been on my todo list for at least 6 months and still has not bubbled up near the top, so if anyone is interested in pushing it forward then please feel free :-). -n
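The implicit 'and' in a chained comparison can be made visible with a small array-like class. This is a sketch: `Vec` is a hypothetical stand-in that, like the NumPy behaviour described in this thread, refuses to produce a single truth value.

```python
class Vec:
    """Hypothetical element-wise container with an ambiguous truth value."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, scalar):   # vec < scalar
        return Vec(v < scalar for v in self.values)

    def __gt__(self, scalar):   # vec > scalar, also used for scalar < vec
        return Vec(v > scalar for v in self.values)

    def __and__(self, other):
        return Vec(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        raise ValueError("truth value of a Vec is ambiguous")


vec = Vec([-1, 0.5, 2])

# `0 < vec < 1` means `(0 < vec) and (vec < 1)`, so bool() is called on
# the intermediate Vec and the comparison fails.
try:
    0 < vec < 1
    chained_error = None
except ValueError as exc:
    chained_error = str(exc)

# The expansion from the message: bind the expression once, then use `&`.
tmp = vec
in_range = (0 < tmp) & (tmp < 1)

print(chained_error)    # truth value of a Vec is ambiguous
print(in_range.values)  # [False, True, False]
```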

On 26 November 2015 at 06:42, Nathaniel Smith <njs@pobox.com> wrote:
Regardless of which is more useful it would be a very subtle backwards compatibility break. I imagine that code that relies on the short-circuit here is rare. I'm confident that it is much rarer than numpy-ish code that works around this in the way you showed above or just by evaluating the complex expression twice as in

    (0 < complex expression) & (complex expression < 1)

which is what I normally write when the expression isn't too long. But there's guaranteed to be some breakage and a good migration path to mitigate that is unclear. You could add a __future__ import and a warning mode to detect when short-circuiting happens in chained comparisons. Unfortunately the naive implementation of the warning mode would simply trigger on 50% of chained comparisons (every time the left hand relation is False).

I would definitely use chained comparisons for numpy arrays if it were possible but at the same time I don't think the status quo on this is that bad. It's a bit of a gotcha for new numpy users to learn but numpy is good at giving the appropriate error messages:

    >>> from numpy import array
    >>> 1 < array([1, 2, 3]) < 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    >>> 1 < array([1, 2, 3]) and 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

If you google that error message there's lots of SO posts etc. that can explain in more detail.

It's also worth noting since we're comparing with Matlab that Matlab actually has both short-circuit logical && and element-wise logical & equivalent to Python's and/& respectively. And Matlab doesn't have chained comparisons or rather they don't do anything nearly as useful as Python's chained comparisons i.e. in Matlab -2 < -1 < 0 is evaluated as:

    -2 < -1 < 0
    (-2 < -1) < 0
    1 < 0
    0 (i.e. False)

which is basically useless and doesn't even give a decent error message like numpy does. -- Oscar

On Wed, Nov 25, 2015 at 10:42 PM, Nathaniel Smith <njs@pobox.com> wrote:
This seems pretty harmless, and mostly orthogonal to the rest -- except that overloadable 'not' is not very attractive or useful by itself if we decide not to address the others.
What to do with chaining comparisons is a very good question, but changing them to be non-short-circuiting would cause a world of backwards incompatible pain. For example, I've definitely written code where I carefully arranged the comparisons so that the most expensive one comes last (in hopes of sometimes avoiding the time spent on it if the outcome is already determined). This is particularly easy with chained ==, but you can sometimes also change a < b < c into c > b > a, if a happens to be the expensive one. However, note that PEP 335 *does* address the problem of chained comparisons head-on -- in a < b < c, if a < b returns a numpy array (or some other special object that's not falsey), it will then proceed to compute b < c and combine the two using the overloading of the default 'and'; because the first result is not a simple bool, the problem you described with `True and numpy_array` does not apply. So, maybe you and the numpy community can ponder PEP 335 some more? Honestly, from the general language design POV (i.e., mine :-), PEP 335 feels more acceptable than introducing new non-short-circuit and/or operators. How common would the `True and numpy_array` problem really be? I suppose any real occurrences would not use the literal True; `True and x` is just a wordy way to spell x, and doing this element-wise would just return the array x unchanged. (Or would it cast the elements to bool? That still feels like a unary operator to me that deserves a more direct spelling.) I don't see a use case for a literal left operator in symbolic algebra or SQL either. But let's assume we have some scalar expression that evaluates to a bool. Even then, `x and numpy_array` feels like a clumsy way to spell `numpy_array if x else <an array of the same shape filled with False>`. But I suppose you've thought about this more than I have. :-) -- --Guido van Rossum (python.org/~guido)

On 2015-11-24, Nathaniel Smith wrote:
The semantic objection that I raised -- short circuiting means that you can't correctly overload 'True and numpy_array',
Well, you mean "True or" or "False and", but anyway... I've got to confess, I don't really understand the semantic paradigm under which it's appropriate to do this at all, but it's not appropriate for it to return True. If you can pass in True here, that implies that the True you'll get out of this operation is an appropriate value for using in further operations on similarly-shaped arrays. What does it _matter_ that you get True (or 1, etc) instead of [True, True, True, True, True, True]?

What does it _matter_ that you get True (or 1, etc) instead of [True, True, True, True, True, True]?
Sometimes you want to know that everything in an array is True (which is common enough that numpy.alltrue() exists). But other times you want to know which items in an array are true, and which are not -- most often to use as a Boolean mask -- in which case, an array that happens to have all true values is a different beast altogether from a single True. And remember that there could be arrays of any number of dimensions. Also: Python truthiness rules define an empty container as False, and any non-empty container as True -- it's probably better not to make arrays unique in that regard. -CHB

On 2015-11-24, Brendan Barnwell wrote:
Why? I am asking specifically because *all* elements of a are true, so all elements of the result will also be true, and taken from a. Clearly [0, 1] should *not* return true for a.__true__(), which is why I didn't use __bool__. But the elementwise operation [1, 1] magic_or [0, 2] returns [1, 1].

On 2015-11-24 11:07, Random832 wrote:
I guess I'm not understanding what you mean by __true__ then. What is this __true__ for which ([0, 1]).__true__() is not true? -- Brendan Barnwell

On 2015-11-24, Brendan Barnwell wrote:
I guess I'm not understanding what you mean by __true__ then. What is this __true__ for which ([0, 1]).__true__() is not true?
It'd be a new operator / magic-function, so whatever the person writing the class wants it to be. I specifically didn't use bool().

    def __true__(self):
        # return True only if you want "or" to short-circuit
        pass

Some examples of current practice for SQLAlchemy (an SQL ORM) can be found here: http://docs.sqlalchemy.org/en/rel_1_0/orm/tutorial.html#common-filter-operat... A slightly adapted example is this:

    from sqlalchemy import or_, and_
    query.filter(or_(User.name == 'ed', and_(User.name == 'wendy', User.age > 20)))

With the new operators this could simply be rewritten to:

    query.filter(User.name == 'ed' || (User.name == 'wendy' && User.age > 20))

This is much clearer in my opinion.

Pandas overloads the bitwise and and or operators (& and |), which causes the small issue that it needs an extra pair of parentheses around expressions, see https://stackoverflow.com/questions/24775648/element-wise-logcial-or-in-pand... or http://stackoverflow.com/a/19581644/2570866 This means that this (which selects the entries that either are lower than three or equal to five):

    df[(df < 3) | (df == 5)]

can be rewritten to this:

    df[df < 3 || df == 5]

This is clearly only a small advantage, but it also means that there is no need to override the bitwise or and and. That way they can be used for their original purpose.

For Numpy the case is again like with SQLAlchemy, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_and.html and https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_or.html

I hope this made the use for the overloadable logical and and or operators clear. The not operator might be a bit less useful and I'm not sure it's needed as much. Currently SQLAlchemy and Pandas overload the "~" (invert) operator and Numpy has a function again: https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_not.html

Lastly, for SQLAlchemy an in operator that does not return a boolean could also be useful. I can't think of use cases for the others though and I also can't directly think of an operator that would be as clear as the &&, || and ! operators.

As for the PEP, I have no problem writing one if this is accepted as a useful addition.
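Why the bitwise spelling needs the extra parentheses can be shown with a toy expression builder. This is a sketch only: `Column` and `Expr` are hypothetical classes that produce SQL-like strings, not SQLAlchemy's or Pandas's actual API.

```python
class Expr:
    """Hypothetical SQL-like expression node."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        return Expr(f"({self.sql} AND {other.sql})")

    def __or__(self, other):
        return Expr(f"({self.sql} OR {other.sql})")


class Column:
    """Hypothetical column whose comparisons build Expr objects."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Expr(f"{self.name} = {value!r}")

    def __gt__(self, value):
        return Expr(f"{self.name} > {value!r}")


name, age = Column("name"), Column("age")

# With parentheses, `|` and `&` combine the comparison results.
query = (name == "ed") | ((name == "wendy") & (age > 20))
print(query.sql)  # (name = 'ed' OR (name = 'wendy' AND age > 20))

# Without them, `|` binds first and is applied to "ed" and the column.
try:
    name == "ed" | name == "wendy"
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```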
Also any suggestions and critiques are very welcome of course. On Monday, 23 November 2015 21:24:23 UTC+1, Emanuel Barry wrote:

On Tue, Nov 24, 2015 at 10:49 AM, Jelte Fennema <me@jeltef.nl> wrote:
I think it's reasonable, except for the potential confusion of having *three* "and" operators. The one with the word is never going to change - its semantics demand that it not be overridable. When should you use & and when &&? Judging by how @ has gone, I think the answer will be simple: "Always use &, unless the docs for some third-party library say to use &&", in which case I think it should be okay. ChrisA

I honestly think the added confusion makes it a non-starter. It's also confusing that in other languages that have && and ||, they are shortcut operators, but the proposed operators here won't be. And the real question isn't "when to use & vs. &&", it's "when to use 'and' vs. &&". On Mon, Nov 23, 2015 at 4:08 PM, Chris Angelico <rosuav@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

This confusion could quite simply be solved by just not implementing the operations on the standard types, even though they would be trivial to implement. This just leaves the possibility for library developers to do something useful with the operators, like with the new @ operator. On 24 November 2015 at 01:13, Guido van Rossum <guido@python.org> wrote:

I honestly think the added confusion makes it a non-starter.
Coming from my experience from the Numpy world, the fact that you get "rich comparisons" for most of what seem like Boolean operators, but not for and and or, is very confusing to newbies. Much of the time, you can use the bitwise operators, as you often have done a comparison first: (A < x) & (A > y). But it's kind of a coincidence that it works, so it only makes things more confusing for newbies. Bitwise operators really are kind of obscure these days. Explaining that you use ".and." instead of "and" would be a much lighter lift than getting into the whole explanation for why we can't overload "and". -CHB I think first we decide it is or isn't a good idea, and then decide how to spell it, but .and. and .or. kind of appeal to me.

On 24 November 2015 at 11:38, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
I think first we decide it is or isn't a good idea, and then decide how to spell it, but .and. and .or. kind of appeal to me.
I think it's reasonably clear that rich logical operators with appropriate precedence and non-shortcircuiting behaviour would be a nice feature to have, the question is whether or not they can be introduced without making things even more confusing than they already are. Using placeholder syntax, let's consider the three operations:

    A rich_and B
    A rich_or B
    rich_not A

To explain this fully will require explaining how they differ from:

    A and B -> A if not bool(A) else B
    A or B -> A if bool(A) else B
    not A -> not bool(A)

and:

    A & B -> operator.and_(A, B)
    A | B -> operator.or_(A, B)
    ~A -> operator.invert(A)

Depending on a user's background, it will also potentially require explaining how they differ from these operations in C and other languages:

    A && B
    A || B
    !A

Casting any new forms specifically as "matrix" operators (like the new matmul operator Jelte referenced in the opening message of the thread) also runs into problems, since "@" is a genuinely distinct operation only applicable to matrices, while the goal here is instead to broadcast an existing operation over the array elements, which matrix objects are already able to do implicitly for most binary operators.

Something that *could* potentially be comprehensible is the idea of allowing "elementwise" logical operators, with a suitable syntactic spelling. There'd still be a slight niggle to explain why and/or/not have explicitly elementwise variants when other binary operations don't, but that would likely still be less confusing than explaining the use of the bitwise operators. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
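The two existing operator families can be compared directly using the standard `operator` module, in which `~` corresponds to `operator.invert` and `operator.not_` is the function form of `not`. A short sketch with arbitrary example values:

```python
import operator

a, b = 6, 3

# Short-circuit forms return one of the operands (or a bool for `not`).
print(a and b, 0 and b)  # 3 0
print(a or b, 0 or b)    # 6 3
print(not a)             # False

# Bitwise forms dispatch to overloadable methods, mirrored in `operator`.
print(a & b, operator.and_(a, b))  # 2 2
print(a | b, operator.or_(a, b))   # 7 7
print(~a, operator.invert(a))      # -7 -7

# operator.not_ is the function form of `not`, not of `~`.
print(operator.not_(a))            # False
```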

On 24 November 2015 at 16:05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
While I'm still largely of the view that introducing additional operators would make things more confusing rather than less, I'm also convinced that if anything like this is going to be pursued without being incredibly confusing for beginners there needs to be a fairly concise answer to "What are these operators for?". Take the "bitwise operators", for example. The notion of a "bitwise operator" is conceptually dense for folks that have never worked with binary numbers before. Despite that, if someone asks "What do the & and | operators do in Python?" the semantics can still be conveyed relatively quickly using some truth table examples like:
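A sketch of the kind of truth-table example meant here (an illustration written for this point, not the table from the original message):

```python
# Build a truth table for & and | on single bits.
rows = [(a, b, a & b, a | b) for a in (0, 1) for b in (0, 1)]

print("a b | a&b a|b")
for a, b, and_result, or_result in rows:
    print(f"{a} {b} |  {and_result}   {or_result}")

# The same rule is applied independently to every bit position.
wide_and = 0b1100 & 0b1010
wide_or = 0b1100 | 0b1010
```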
Explaining "~" fully is a bit trickier (since you would need to explain why two's complement representations of binary numbers are useful), but it's possible to avoid that explanation by using the alternative arithmetic formulation for "~" given in https://wiki.python.org/moin/BitwiseOperators : "~x == -x - 1" Matrix multiplication is another example of something that isn't particularly easy to explain to folks that aren't already familiar with the relevant domain, but also conveys clearly that you can ignore it if you're not working with matrices. It's then also useful to remember that the answers to "What is this for?" and "How is this used?" for a language construct can diverge over time. The original "What is this for?" use cases are the ones that guide the design decisions towards concrete answers that define how the construct works, and provide the underlying rationale for the way the construct behaves. The "How is this used?" cases then arise later when folks say "Yes, those existing semantics are suitable for my current use case, so I can reuse the syntax". Some specific examples:
* "+" is used not only for addition, but also sequence concatenation.
* "&" is not only "bitwise and", but also set intersection.
* "/" is not only division, but also pathlib path joining.
* NumPy repurposes most of the binary operators (including the bitwise ones) as element-wise matrix operations.
* SQLAlchemy repurposes a number of them for SQL query operations.
* SymPy changes them from arithmetic operations to symbolic ones.
Those use cases don't change the answers to "What are these operators for?" from a language design perspective, they only change the answers to "How are these operators used?" from a practical perspective.
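The arithmetic formulation of "~" and the standard-library examples of operator reuse listed above can be checked directly. This is a small sketch using only built-in types and `pathlib`; the path components are arbitrary:

```python
from pathlib import PurePosixPath

# "~x == -x - 1" holds for Python ints, positive or negative.
invert_matches = all(~x == -x - 1 for x in range(-5, 6))

concatenated = [1, 2] + [3]               # "+" as sequence concatenation
common = {1, 2, 3} & {2, 3, 4}            # "&" as set intersection
joined = PurePosixPath("/usr") / "lib"    # "/" as path joining
```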
Getting back to the specific topic of this thread, this could actually make an interesting usability study for a language design theorist, by looking at the kinds of mistakes folks make trying to learn elementwise logic operations in NumPy, and then seeing whether the introduction of overridable elementwise logical operators reduces the learning curve. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

While I'm still largely of the view that introducing additional operators would make things more confusing rather than less,
Well, almost by definition, more stuff to understand is more confusing for beginners.
there needs to be a fairly concise answer to "What are these operators for?".
I don't think "they are for doing logical operations on each of the elements in a sequence, rather than the sequence as a whole", along with an example or two, is particularly challenging. In fact, much less so than the bitwise operators, or matrix multiplication, which require a bit of domain knowledge, as you say. But those aren't a big problem either: "if you don't know what it means, you probably don't need it". But as for general element-wise operators: IIRC, this was discussed a lot back in the day -- and was driven by experience with e.g. Matlab, where the regular math operators do linear algebra by default, and there are alternative "element wise" operators. Numpy, on the other hand, does element-wise by default, so we wanted another set for linear algebra. However, we came to realize that the only one really needed was matrix multiply -- and thus the new @ operator. This all worked because Numpy could overload the math operators to be element wise, and once rich comparisons were implemented, that covered almost everything. So all that's left is and-or. Add the fact that we can use the bitwise & and | in their place in most cases, and we've done fine so far. All that being said -- two more operators for "rich and" and "rich or" would nicely complete the picture. I was just introducing my intro Python class to the magic methods last night -- there are a LOT of them! Two more is a pretty trivial addition of complexity. As long as we can find a way to spell them that is not too confusing or ugly -- I think it's a win-win. Note: having worked with array-oriented languages/libraries for a long time, I'd like element-wise operators that worked with all the built-in types. But I suspect that Python is never going to go there. So we only need these two. -CHB

On 26 November 2015 at 09:58, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
Right, that's why I think "elementwise logical operators (which may potentially be useful for other things)" is an idea that has some hope of avoiding creating new barriers to learning:
* if "elementwise" doesn't mean anything to you, and no library you're using mentions them in its documentation, you can ignore them
* for folks that do know what it means, "elementwise" is evocative of the relevant semantics for at least the data analysis use case
But as for general element-wise operators:
I wasn't suggesting those - just element-wise logical operators. In terms of scoping the use case: bitwise and/or/not work fine for manipulating existing boolean masks, and converting a single matrix to a boolean mask involves applying an elementwise function, not elementwise logical operations. So elementwise logical operators would presumably be aimed at *data merging* problems - creating a combined matrix where some values are taken from matrix A and others from matrix B, based on the truthiness of those values. I'm not enough of a data analyst to know how common that problem is, or whether it might be better served by a higher level "replace_elements" operation that accepts a base array, a replacement array, and a boolean mask saying which values to replace. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
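A rough sketch of what such a "replace_elements" operation could look like on plain Python lists. The function name comes from the message above, but the signature and behaviour here are assumptions made for illustration, not an existing API:

```python
def replace_elements(base, replacement, mask):
    """Return a new list with replacement[i] where mask[i] is true, else base[i].

    Hypothetical helper: the name follows the message above, the rest is assumed.
    """
    if not (len(base) == len(replacement) == len(mask)):
        raise ValueError("all three sequences must have the same length")
    return [r if m else b for b, r, m in zip(base, replacement, mask)]


a = [0, 5, 0, 7]
b = [1, 2, 3, 4]

# An elementwise "a or b" expressed as a merge: keep a's value where it is
# truthy, otherwise take the value from b.
mask = [not x for x in a]
merged = replace_elements(a, b, mask)
```

For real arrays, `numpy.where(mask, replacement, base)` covers a similar need, which may be part of why a dedicated operator is not strictly required for this case.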

On Nov 25, 2015, at 21:44, Nick Coghlan <ncoghlan@gmail.com> wrote:
But "elementwise" isn't what people doing symbolic computation or most other uses of DSL/expression-tree libraries are doing. Even for ORMs and other query-based libraries like AppScript, where arguably it is what they're doing, they probably aren't thinking of it that way, and wouldn't recognize that it should mean something to them, much less that it's what they're looking for. So I think this is effectively less general/useful than a solution that just allows overloading boolean operators somehow, without adding a distinction between elementwise (and overloadable) and objectwise (and not).
When I use NumPy, sometimes I'm doing GPU-ish stream operations, which need things like compacting select, which aren't obviously expressible in boolean terms, so I end up looking for methods for everything rather than operators even when they might make sense. But otherwise, when I'm doing more typical NumPy stuff (or at least what I think is more typical, but I could easily be wrong), I look for elementwise operators all over the place, including abusing the bitwise operators when it makes sense, so I probably would use real boolean operators if it were more obvious/readable.

Andrew Barnert via Python-ideas wrote:
But "elementwise" isn't what people doing symbolic computation or most other uses of DSL/expression-tree libraries are doing.
Right. I think just describing them as "overloadable versions of the boolean operators" would be best. What they mean is up to the types concerned, just as with all the other operators. -- Greg

On 27 November 2015 at 06:54, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Except that *isn't* what we do with the other operators. "&" is the bitwise and operator, for example - that's why the special method is called "__and__". The fact you can use it for an elementwise bitwise and operation on NumPy arrays, or for set intersection, isn't part of the core design. If there isn't *at least one* specific motivating use case, then "it might be useful for something" isn't a good reason to add new syntax. However, looking again at PEP 335, I'm not sure I see any reason it needs to be noticeably slower in the standard case than the status quo (I'm not saying it would be *easy* to retain the speed, but the complexity would be in the eval loop implementation and the code generation process, not user code). We also have the richer benchmark suite these days to actually quantify the impact of checking for the new __and1__/__or1__ slots before falling back to __bool__, and tracing JITs would still be able to generate appropriate code for the fast path at runtime. So perhaps it might be worth dusting off that original idea and seeing what the impact is on the performance benchmarks? Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Nov 26, 2015, at 18:11, Nick Coghlan <ncoghlan@gmail.com> wrote:
Sure. But the SQLAlchemy case was the very first motivating use the OP mentioned, before NumPy, not something that may come up in the future that we haven't imagined yet. Also, the distinguishing thing about this new magic method vs. the existing __and__ isn't that it's elementwise, but that it's boolean/logical rather than bitwise/arithmetic, even for NumPy users. So, calling it "elementwise and", or giving it a name that implies elementwise, will confuse anyone who hasn't read this whole thread. And finally, NumPy is one of the uses that doesn't require short circuiting, and the same is almost certainly true for other elementwise uses, and yet we seem to all be agreed that the new overload has to be short-circuitable.

On 2015-11-27 09:28, Andrew Barnert via Python-ideas wrote:
Do we? I don't. I agree that if we add a way to overload the existing and/or to support these new usages, then that has to be short-circuitable, because and/or currently are short-circuitable and we can't get rid of that. But to me one of the attractive aspects of this new proposal is that the new operators need not be short-circuitable, which would avoid the various contortions required in a scheme like PEP 335 and thus greatly simplify the overloading. In other words the whole point of these new operators would be to do and-like and/or or-like operations that definitely do want both of their arguments all the time (such as elementwise operations or combining abstract query objects like in these SQL cases). -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

Nick Coghlan writes:
Well, for the social science data analysis I do, that would be inappropriate. Variables from different sources are different variables, you wouldn't just "or" them into a single column of a data frame. You would want your data model to account for the fact that even if they purport to measure the same factor, they're actually different indicators. But for a completely different kind of data, images, that sounds a lot like (Duff's?) compositing operations. But those are a lot more flexible than just "and" and "or": color images are "fuzzy" logic, and so admit many more logical operations (eg, "clamped sum", "proportional combination", etc.). I'm not sure how that would fit here, since Python has only a limited number of operator symbols, fewer than there are compositing operations IIRC. Steve

Regarding the short-circuiting `and` and `or`, I think there's a way we can have our overloading-cake and only eat the pieces we need too. Even though it might not be totally intuitive, it should be possible to make the overloaded method get a 0-argument function instead of a value. In that case, `a and b` would become `a.__land__(lambda: b)`. As far as performance goes, there probably would have to be some special casing that first checks if `a` has an overloaded `and`, and if not use the default behaviour. The default implementation:
    class object:
        def __land__(self, other_f):
            if not self:
                return self
            else:
                return other_f()
        def __lor__(self, other_f):
            if self:
                return self
            else:
                return other_f()
And for some python-expressions-to-AST node class:
    class BaseExpression:
        def __land__(self, other_f):
            return AndExpression(self, other_f())
        def __lor__(self, other_f):
            return OrExpression(self, other_f())
Again, the fact that the second argument to `__land__`/`__lor__` is a function might be a bit confusing, but that's probably going to be the only way to make short-circuiting work for an overloaded and/or without going the route of lazy evaluation.
On Mon, Nov 23, 2015 at 04:13:42PM -0800, Guido van Rossum wrote:
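The thunk idea can be tried out today without touching the compiler by writing the dispatch as an ordinary function. In this sketch `__land__` is the hypothetical hook from the message above, while `logical_and`, `Expr`, `AndExpr` and `noisy` are names invented for the illustration:

```python
def logical_and(a, b_thunk):
    """Emulate the proposed `a and b`, where b_thunk stands for `lambda: b`."""
    hook = getattr(type(a), "__land__", None)
    if hook is None:
        # Default behaviour: ordinary short-circuiting.
        return a if not a else b_thunk()
    return hook(a, b_thunk)


class Expr:
    """Tiny expression-tree node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name

    def __land__(self, other_f):
        # An expression builder always wants the right operand.
        return AndExpr(self, other_f())

    def __repr__(self):
        return self.name


class AndExpr(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return f"({self.left!r} AND {self.right!r})"


calls = []

def noisy(value):
    """Record that the right operand was actually evaluated."""
    calls.append(value)
    return value


short = logical_and(0, lambda: noisy(1))                  # thunk never called
tree = logical_and(Expr("x"), lambda: noisy(Expr("y")))   # thunk called once
```

Plain values keep their short-circuiting, and the overloading type decides for itself whether to call the thunk. The cost is an extra function object per expression, which is the performance concern the special-casing mentioned above is meant to address.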

On Mon, Nov 23, 2015 at 4:08 PM, Chris Angelico <rosuav@gmail.com> wrote:
I think it's reasonable, except for the potential confusion of having *three* "and" operators.
I think using && and || would be an attractive nuisance for people switching from another programming language. Right now if I accidentally write && in Python or "and" in another language, I get an immediate syntax error. With this proposal, I get unexpected results. If this idea were to fly, a better name would be something that doesn't have that problem, e.g., .and. .or. .not. I don't want to bikeshed the exact syntax**, but I think it should be clear that with something like this: (1) no one is going to accidentally type them and (2) they are pretty clearly some variation of the standard and/or/not.
**Lots of other possibilities that are syntax errors right now: @and, (and), etc. I like .and. because it's less visual clutter and it's easy to type.
--- Bruce Check out my puzzle book and get it free here: http://J.mp/ingToConclusionsFree (available on iOS)

On 24 November 2015 at 01:26, Bruce Leban <bruce@leban.us> wrote:
I think a naming scheme like that is indeed a good way to solve the confusion issues, since it is also immediately clear that these would be a special version of the normal operators. Another advantage is that it could also be extended to the in operator, if that one is to be included. I'm not sure I like the dots version very much though, but like you said there are lots of syntax error options to choose from. On 24 November 2015 at 01:26, Bruce Leban <bruce@leban.us> wrote:

Why hasn't SQLAlchemy gone the route of NumPy with overloaded operators? Perhaps whatever reason it is would prevent using any new operators as well. With NumPy I make that mistake constantly: A == a & B == b rather than (A == a) & (B == b) I'd put that in the category of parentheses tax along with the print function and old style % string interpolation. Annoying, but it's inappropriate to use a gun to swat a fly. (In case my metaphor is unclear, creating a new operator is the gun -- risking collateral damage and all that) As Guido said, the real usability problem is that the ``and`` operator is a new Python programmer's first instinct. Adding yet another operator would make Python harder to learn and read. Even if you advertise a new operator, many libraries will be slow to change and we'll have 3 different techniques to teach. Let's weigh the benefits against the negative consequences. On Mon, Nov 23, 2015 at 8:12 PM Jelte Fennema <me@jeltef.nl> wrote:
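The precedence mistake described above can be reproduced with plain integers. This is a small sketch with arbitrary values; the variable names mirror the expression in the message:

```python
A, a, B, b = 2, 2, 3, 3

# & binds tighter than ==, so without parentheses this is the chained
# comparison A == (a & B) == b, i.e. (A == (a & B)) and ((a & B) == b).
unparenthesized = A == a & B == b

# With parentheses, each comparison is evaluated first and the two
# boolean results are then combined.
parenthesized = (A == a) & (B == b)
```

With integers the unparenthesized form silently gives a different answer. With NumPy arrays the chained comparison also calls `bool()` on an intermediate array, which raises `ValueError` for arrays with more than one element, so there the mistake at least fails loudly.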

On 24 November 2015 at 02:44, Michael Selik <mike@selik.org> wrote:
Why hasn't SQLAlchemy gone the route of NumPy with overloaded operators?
It seems I was wrong about that, they apparently do: http://stackoverflow.com/a/14185275/2570866
I'd put that in the category of parentheses tax along with the print function and old style % string interpolation.
This seems like a bit of a weird argument since the parentheses for the print function are put there for a reason (see PEP3105) and the old style % string interpolation will be replaced by the new format string literal.
Adding yet another operator would make Python harder to learn and read. Even if you advertise a new operator, many libraries will be slow to change and we'll have 3 different techniques to teach.
I don't think that much confusion will arise, since the normal way is to use the short `and` version. The new operator would be used only in libraries that explicitly tell you to use it. It would also do away with the confusion about why and cannot be overridden and why the precedence of the & operator is "wrong".
On 24 November 2015 at 02:44, Michael Selik <mike@selik.org> wrote:

As I think of people's reactions to seeing & and | for the first time, the typical response is, "What do you mean by bitwise?" Not, "Why are you using bitwise operators for non-bitwise operations?" Interestingly, no one in this thread seems to have a problem with ``&`` and ``|`` for set intersection and union. The primary complaint is that NumPy users instinctively reach for ``and``/``or`` and then forget the operator precedence of ``&``/``|``. On Mon, Nov 23, 2015 at 9:09 PM Jelte Fennema <me@jeltef.nl> wrote:
Using pipe and ampersand looks readable. The use of bitwise operators for overloading and/or/not seems standard for many objects, in standard library and major projects. In fact, I use those operators as logical far more often than as bitwise. I'd bet a great number of NumPy users in the science community are completely unaware of their bitwise effects.
I'd put that in the category of parentheses tax along with the print
I picked ``print`` as an example to show that requiring parens is fine. I picked string interpolation as an example because the ``.format`` solution also accepted parens as necessary. The rationale for f-string syntax (as written in the PEP) does not complain about parens but instead mentions that they will be necessary inside the string in several circumstances, like lambdas.
Adding yet another operator would make Python harder to learn and read.
That precedence issue doesn't cause problems when teaching, in my experience. People accept readily that there's issues with operator precedence. On the other hand, they have trouble reading code that uses an older style (perhaps using ``&``) instead of the newer style that they learned (the hypothetical ``&&``). Even the switch between ``%`` interpolation and ``.format`` is still causing problems in large organizations.

On 24 November 2015 at 10:14, Michael Selik <mike@selik.org> wrote:
Using pipe and ampersand looks readable. The use of bitwise operators for
that people accept the wrong operator precedence readily, even though before you said you do it wrong yourself constantly. This is also why I think the set intersection and union are not a problem: the operator precedence is correct there. Your point that the problem is mostly that the old code will still be using the ``&`` operator, which could confuse people, is true. But I also think that would eventually disappear, which would make Python better in the future. Another thing is that it seems some people are worried about the form of the new operators. Some options (which don't use the C style &&, because that would indeed cause confusion):
    .and.
    *and*
    +and+
    %and%
    @and@
    ?and?
    ^and^
    <and>
    {and}
    (and)
    [and]
    |and|
    :and:
    _and_ (could currently be a variable name)
    en (the Dutch version)
Some of these could also be used with just one special character, like @and, but I think the surrounded ones look more visually pleasing. These are just a couple of examples and some of them seem fine to me. I do think that it is important though to not focus on the form already. It seems better to first figure out if the new operators would not confuse the newcomers too much in whatever form they come.
On 24 November 2015 at 10:14, Michael Selik <mike@selik.org> wrote:

On Tue, Nov 24, 2015 at 5:00 AM Jelte Fennema <me@jeltef.nl> wrote:
I make that mistake in an interactive environment and fix it moments later, so it's not a big thing for me. I also occasionally forget to put a colon at the end of my for-loops, etc. ;-)
What's the half-life of deprecated code? That stuff is like nuclear waste. Or more like a bacteria you're spot-treating with antibiotic. It keeps replicating while you apply ointment and might evolve a resistance. Ok, ok, I'm getting a little too colorful there.

agreed -- but unfortunately, numpy may be one place where people really do want bitwise operations sometimes -- and it's way too late now to re-define them anyway.
The primary complaint is that NumPy users instinctively reach for them, or will see them in code and think they ARE another way to spell and and or. It also doesn't allow element-wise short circuiting behavior. You get:
    In [2]: np.array([0,1,2,3]) & np.array([3,2,1,0])
    Out[2]: array([0, 0, 0, 0])
when you want:
    In [3]: [ a and b for a,b in zip([0,1,2,3], [3,2,1,0])]
    Out[3]: [0, 2, 1, 0]
oh, and this:
    In [4]: np.array([0,1,2,3]) | np.array([3,2,1,0])
    Out[4]: array([3, 3, 3, 3])
would surely confuse people that don't "get" what bitwise means, even though it results in the same thing in a boolean context:
    In [5]: (np.array([0,1,2,3]) | np.array([3,2,1,0])).astype(np.bool)
    Out[5]: array([ True, True, True, True], dtype=bool)
Honestly, I'm not sure element-wise short circuiting is desired, but it would be nice to be consistent about that. As for precedence, it's annoying, but at least in my experience, the very first test case fails, and then I remember to add the parens. As for the more and more operators issue -- for the most part, this would be used for special purposes -- most people wouldn't even notice they were there, and numpy users might well think it's a special numpy thing (same for SQLAlchemy folks, or...) Unless folks wanted to add support for element-wise and and or to the built-in sequence types. I don't think anyone has proposed that.
Your point that the problem is mostly that the old code will still be using the ``&`` operator, which could confuse people is true.
well, yes - but code using & would continue to work the same way -- and anyone confused would, in fact, have been mis-interpreting what & meant already :-) As for element-wise everything -- THAT discussion has happened multiple times in the past, and been rejected. Time to go dig into the archives -- not sure if there is a rejected PEP for it. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
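The difference between bitwise and logical element-wise results shown in the NumPy session earlier in this message can be reproduced with plain lists, so it does not depend on any particular NumPy version. This is a sketch using the same input values:

```python
xs = [0, 1, 2, 3]
ys = [3, 2, 1, 0]

bitwise_and = [a & b for a, b in zip(xs, ys)]
logical_and = [a and b for a, b in zip(xs, ys)]
bitwise_or = [a | b for a, b in zip(xs, ys)]
logical_or = [a or b for a, b in zip(xs, ys)]

# In a boolean context the bitwise "or" agrees with the logical one here,
# but the bitwise "and" does not: 1 & 2 == 0 even though both are truthy.
or_agrees = [bool(v) for v in bitwise_or] == [bool(v) for v in logical_or]
and_agrees = [bool(v) for v in bitwise_and] == [bool(v) for v in logical_and]
```

So `&` only stands in for a logical "and" when the operands are already booleans, such as the results of comparisons.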

On Mon, Nov 23, 2015 at 5:44 PM, Michael Selik <mike@selik.org> wrote:
It's not just the operator precedence that's an issue here -- this also means something different than what people expect. As it happens a bitwise and works "just like" and when the values are boolean (like above), but that isn't always the case -- confusion waiting to happen. -CHB

(Top-posting because this is really a reply to a combination of three earlier posts, not to Greg Ewing's post, except at the very end.)
-1 on "&&" and "||". To anyone familiar with C and friends, it seems like they ought to be like C (short-circuiting, and more generally the same thing we already spell "and" and "or"); to anyone else, it would make no sense for them to have different precedence than "&" and "|".
-0.5 on ".and." and ".or." for multiple reasons, most of which apply just as well to anything similar:
* That looks like a general custom-infix syntax. Anyone coming to Python (including Python 2 users) will expect that they can just as easily define ".spam." (and then be disappointed that they can't...), or will be worried that 3.7 will add a bunch of new dot operators they'll have to learn. Not a _huge_ negative, but already enough to turn me off the idea.
* What would you call the dunder methods (and operator module functions), and how would you deal with the fact that any novice/transplant is going to assume "__and__" means ".and." rather than "&"?
* Operators starting with "." are ambiguous with float literals and/or attribute access ("spam.and.eggs" looks like a member of a member of spam), at least to humans, if not to the parser.
* Operators starting and/or ending with "." are hard to talk about because of natural-language punctuation (and even harder on a mobile keyboard, or an overly clever text editor or word processor).
-0 on actually adding a general custom-infix syntax, come to think of it. Any solution to the problems above should work just as well here. And it means that in the future, libraries don't have to cram things into inappropriate symbols or be stuck with prefix or dot-method notation. And I don't think it would be any harder to learn than a few special cases. And I think ".between." would be just as useful in an ORM as ".in.".
More generally, the whole point of "@" was that everyone agreed that it was the only new operator anyone would need (except maybe "@@") for a decade or two; if that's not true, infinity is a better number than 4. +1 on the idea if someone can come up with a good spelling that avoids all the above problems and reads as naturally as "@".
Is that last one supposed to be "niet"? When I try "nlet", Google assumes it's a typo for "net", which I guess could be a unary boolean identity operator, but I don't think we need that. :) Anyway, that pattern is a bit hard to extend to an overloadable "in" operator, because Dutch for "in" is "in".

On 24 November 2015 at 06:35, Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
This seems like an interesting idea, but I think it would be hard finding a notation that does not conflict with current operations. For the and/or almost any symbol would work, since they are keywords and the operators have no meaning on them, but if any name can be used, operations on variables will become unclear. So another symbol would probably have to be used that isn't used yet, like the $ symbol. Also, I think there are probably some other issues that I haven't thought of, since this would add a pretty big language feature. PS. I didn't include the dollar convention in my last options list for the new and operator, but it is of course also a possibility: $and$, or maybe $and. On 24 November 2015 at 06:35, Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:

On Nov 24, 2015, at 02:10, Jelte Fennema <me@jeltef.nl> wrote:
The obvious notation is "a `spam` b", as used in Haskell and various other languages. It doesn't conflict with current operations. It doesn't look like a pair of operators, but like a matched bracketing or quoting, which is exactly what we'd want. It has just about the right screen density (. can be easy to miss, while $ is so heavy that it makes the "in" in $in$ harder to read). It doesn't conflict with a different meaning elsewhere, like $(in), which can look like part of string template to a human reader. Also, experience with those other languages shows that it works. In particular, in Haskell, I could always just define any new string of symbols to call spam on its operands, and with any precedence and associativity I want--but it's often much more readable to just use `spam` instead. Its biggest problem is that Guido hates backticks, and is on record as promising that they will never be reused in Python now that they no longer mean repr. Since I don't think he'd accept the idea anyway (IIRC, last time it came up, he said it wasn't even worth writing a PEP to categorically dismiss), that makes this the perfect syntax for it. :) More seriously, my point that infinity is a better number than 4 is more an argument against adding custom and, or, not , and in operators than an argument for adding custom arbitrary operators, which means the spelling isn't that important anyway. I don't think you're going to do much better than `and`, and I don't think that's good enough for Python.
For the and/or almost any symbol would work, since they are keywords and the operators have no meaning on them,
No, because code has to be readable and parseable by humans, not just by compilers. Even if the compiler can tell that "spam.and.eggs" or "2.in.e" aren't using dots for member access or float purposes, that isn't clear to a human reader until you look at it carefully. (Of course you can always use `and` or some other syntax that wouldn't be confusing even if and weren't a keyword, but at that point you're back to "no better than custom infixes".) Also, bear in mind that what you're competing with is a.spam(b), so anything that looks too heavyweight or too weird, like a $(spam) b, isn't going to be much improvement except in very long expressions where closing the parens gets troublesome (which often aren't going to be readable anyway, and maybe it's not so terrible to encourage people to break them up and name subparts).
but if any name can be used operations on variables will become unclear. So another symbol would probably have to be used that isn't used yet, like the $ symbol. Also, I think there are probably some other issues that I haven't thought of, since this would add a pretty big language feature.
Actually, last time I looked into it (around 3.4), it didn't have any other big issues or complex interactions with other features (and I don't think static typing, await, or any other changes will affect that, but I'd have to look more carefully). And it's pretty simple to implement, and pretty easy to describe. If you're interested, I may have written up a blog post, and if I didn't, I could write one now. (I think I also have a hack that implements it with a retokenizing import hook, if you want to play with it.) While we're at it, every time custom infix operators come up, someone points out that you can already fake it pretty well. For example, "a &And& b" just needs a global named "And" whose __rand__(self, a) returns an object whose __and__(self, b) does what you want. And this allows you to use whatever precedence you want by picking the surrounding operators (much like Swift's custom infix operators get their precedence from the first symbol in the name), and lets you implement operator sectioning (you can make __rand__(self, a).__call__ the same as its __and__, and then you can pass around "a &And" as a partial function), and so on. There are a few glitches, but it works well with the kind of code you'd write with an ORM. And again, this is just as true for a proposal to add four custom operators as for a proposal to add a general feature. So, you have to think about why SQLAlchemy, numpy, etc. have decided not to use this trick so far, when it's been well known since around the days of Python 2.4, and why they'd be better off with custom operators that don't look that different from the fake ones.

On 2015-11-24, Chris Angelico <rosuav@gmail.com> wrote:
How? Are you referring to the short-circuit? C# allows overloading the short-circuit operators by doing it in two parts - in pseudocode:
    a and b: a if a.__false__() else a & b
    a or b: a if a.__true__() else a | b
There's no fundamental reason short-circuiting "demands" that it not be overridable, just because C++ can't do it.
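The two-part scheme in the pseudocode can be emulated in Python with helper functions. In this sketch `__false__` and `__true__` are the hypothetical hooks from the pseudocode, the three-valued `Tri` class is invented as an example type, and the right operand is passed as a zero-argument function so that it is only evaluated when needed:

```python
_ORDER = {"no": 0, "maybe": 1, "yes": 2}


class Tri:
    """Hypothetical three-valued logic value: 'no', 'maybe' or 'yes'."""

    def __init__(self, state):
        self.state = state

    def __false__(self):          # hypothetical hook: definitely false?
        return self.state == "no"

    def __true__(self):           # hypothetical hook: definitely true?
        return self.state == "yes"

    def __and__(self, other):
        return Tri(min(self.state, other.state, key=_ORDER.get))

    def __or__(self, other):
        return Tri(max(self.state, other.state, key=_ORDER.get))


def cs_and(a, b_thunk):
    """a and b  ->  a if a.__false__() else a & b"""
    return a if a.__false__() else a & b_thunk()


def cs_or(a, b_thunk):
    """a or b  ->  a if a.__true__() else a | b"""
    return a if a.__true__() else a | b_thunk()


evaluated = []

def operand(state):
    """Record that the right operand was evaluated."""
    evaluated.append(state)
    return Tri(state)


r1 = cs_and(Tri("no"), lambda: operand("yes"))      # short-circuits
r2 = cs_and(Tri("maybe"), lambda: operand("yes"))   # needs both operands
r3 = cs_or(Tri("maybe"), lambda: operand("no"))     # needs both operands
```

A value that is neither definitely true nor definitely false always falls through to the overloaded `&` or `|`.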

On 2015-11-23 22:34, Random832 wrote:
The problem is that this kind of overriding doesn't handle the main use case, which is elementwise and-ing/or-ing. If a is some numpy-like array of [1, 1], then it may be boolean true, but you still want "a magic_or b" to do the elementwise operation, not just return "a" by itself. (I say "numpy-like" because a real numpy array will fail here precisely because its boolean truth value is undefined, so it raises an error.) The problem is that, with the existing and/or, the issue of the boolean truth values of the operands is entangled with the actual and/or operation. For an elementwise operation, you *always* want to do it elementwise, regardless of whether the operands "count" as boolean false in some other situation. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

Brendan Barnwell wrote:
Maybe we can hook into __bool__ somehow, though? Suppose 'a and b' were treated as:

    try:
        result = bool(a)
    except IDoNotShortCircuit:
        result = a.__logical_and__(b)
    else:
        if not result:
            result = b

Since a __bool__ call is required anyway, this shouldn't slow down the case where there is no overriding.

-- Greg
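A runnable sketch of this idea, written as a helper function because today's `and` cannot be changed. The fall-through branch here follows the ordinary semantics of `and` (return `a` when it is falsy, otherwise `b`). The `emulated_and` helper, the `Elementwise` class and passing `b` as a callable are assumptions made for the example; `IDoNotShortCircuit` and `__logical_and__` are the names from the pseudocode.

```python
class IDoNotShortCircuit(Exception):
    """Hypothetical signal raised from __bool__ to request the overload."""


def emulated_and(a, b_thunk):
    """Emulate the proposed treatment of `a and b`; b is passed as a callable."""
    try:
        truth = bool(a)
    except IDoNotShortCircuit:
        return a.__logical_and__(b_thunk())
    # Ordinary semantics: a falsy left operand is the result, otherwise b.
    return b_thunk() if truth else a


class Elementwise:
    """Hypothetical array-like type that opts out of short-circuiting."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        raise IDoNotShortCircuit

    def __logical_and__(self, other):
        return Elementwise(bool(x and y) for x, y in zip(self.values, other.values))


left = Elementwise([1, 1, 0])
right = Elementwise([1, 0, 1])
both = emulated_and(left, lambda: right)
plain = emulated_and(0, lambda: right)
print(both.values)   # [True, False, False]
print(plain)         # 0
```

When the left operand is an ordinary object such as `0` or `True`, `__bool__` succeeds and the right operand's type is never consulted.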

On 2015-11-24 21:24, Greg Ewing wrote:
That still doesn't deal with the issue of what should happen if the order is reversed, e.g. "numpy_array and simple_bool" vs "simple_bool and numpy_array", where "numpy_array" has the non-shortcircuiting behaviour but "simple_bool" hasn't.

MRAB wrote:
That's true. I guess the only solution that really works properly is to have a second set of operators, or some way of flagging that you're using non-short-circuiting semantics. There's one language I've seen -- I think it was Eiffel, or maybe Ada -- that had two sets of boolean operators. But it was kind of the other way around: 'and' and 'or' were non-short-circuiting, and to get short-circuiting you had to say 'and then' or 'or else'. -- Greg

On November 24, 2015 4:45:45 PM CST, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That was a lot of languages:

- Ada: and then, or else
- Algol: ANDTHEN, ORELSE
- Erlang: andalso, orelse
- Extended Pascal: and_then, or_else
- GNU Pascal: and then, or else
- Oz: andthen, orelse
- Standard ML: andalso, orelse
- Visual Basic: AndAlso, OrElse

Interestingly, Visual Basic allows you to overload non-short-circuiting And. To overload AndAlso, you need to overload And and IsFalse. It seems that AndAlso was roughly equivalent to And + IsFalse:

    A AndAlso B = If Not (IsFalse A) Then A And B Else A Endif

On Tue, Nov 24, 2015 at 5:34 PM, Random832 <random832@fastmail.com> wrote:
The Python semantics are defined more tightly than that, though - the "else" clause in each case would simply be "b". Changing that is not something you can do with operator overloading, so it would mean a fundamental change in the operator's semantics. And once you do that, you end up with a completely different operator. So, yes, the semantics of Python's short-circuiting 'and' and 'or' operators precludes any form of overriding. Yes, it's possible to have overridable short-circuiting operators, but Python's ones are not those. (Also, I think I prefer the simpler semantics. But that's a matter of personal choice.) ChrisA

On 2015-11-24, Chris Angelico wrote:
I don't see that as "fundamental". Certainly it can't _actually_ be a & b, but it could certainly be "a __foo__ b" where the default implementation of the __foo__ method (on object) simply returns b. If that's "fundamental" then adding overloading to _any_ operator that didn't support it in Python 1.0 violates those operators' "fundamental" behavior of raising a TypeError when applied to types they did not support in Python 1.0. This would also allow b.__rfoo__, to address the objection Nathaniel Smith raised, of the right-hand-side not being able to override. It can't override "does it short-circuit", since that'd defeat the point of short-circuiting, but it can override the value in the case where it doesn't, and a well-behaved 'elementwise-"and"able' type would work either way, only short-circuiting if every element of the left side is false (or true for "or").
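A sketch of this dispatch, modelled on how Python already tries `__and__` and then `__rand__` for `&`. The function name `emulated_and`, the `Mask` class and passing the right operand as a callable are assumptions made for the example; `__false__`, `__foo__` and `__rfoo__` are the placeholder names used in this thread.

```python
def emulated_and(a, b_thunk):
    # Step 1: only the left operand decides whether to short-circuit.
    is_false = getattr(type(a), "__false__", None)
    if is_false is not None:
        short_circuit = is_false(a)
    else:
        short_circuit = not a
    if short_circuit:
        return a

    b = b_thunk()
    # Step 2: pick the result, trying a.__foo__(b) and then b.__rfoo__(a).
    for obj, name, arg in ((a, "__foo__", b), (b, "__rfoo__", a)):
        method = getattr(type(obj), name, None)
        if method is not None:
            result = method(obj, arg)
            if result is not NotImplemented:
                return result
    return b  # default behaviour of `and` once the left side is true


class Mask:
    """Hypothetical elementwise-"and"able type."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __false__(self):
        # Short-circuit only if every element is false.
        return not any(self.values)

    def _combine(self, other):
        if isinstance(other, Mask):
            return Mask(x and y for x, y in zip(self.values, other.values))
        return Mask(x and bool(other) for x in self.values)

    __foo__ = _combine
    __rfoo__ = _combine


mask = Mask([1, 0, 1])
print(emulated_and(True, lambda: mask).values)   # [True, False, True]
print(emulated_and(mask, lambda: True).values)   # [True, False, True]
print(emulated_and(False, lambda: mask))         # False
```

The last line shows the limit described above: once the left operand short-circuits, the right-hand type has no say in the result.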

On Nov 23, 2015, at 22:34, Random832 <random832@fastmail.com> wrote:
Actually, it _is_ possible in C++. You can pretty much do anything in C++, as long as you're an expert, you don't care about anyone reading your code, and you don't mind 1970s-style build times, and this is no exception. You just write an expression template library that uses non-short-circuiting-at-compile-time operators to generate functions that are short-circuiting at runtime, and then add the boilerplate to wrap any constants or normal variables up into your expression template types, and you're set.

A simpler solution is to use lambda lifting, like Swift: "a and b" just means "a.__and__(lambda: b)".

You could also split the operator into two method calls, without combining with bool conversion a la C#. So "a and b" becomes "tmp = a.__and1__()", then "tmp.__and2__(b) if tmp else tmp". That way, "and" can return a non-boolean value, just as it does when not overloaded. I can't think of any language that does this off the top of my head, but I'm sure they exist.

Or, in languages with macros or equivalent, you just use a macro instead of a function. Or, in languages with lazy evaluation by default, it's even simpler--if you don't use the result of "b" anywhere, it never gets evaluated.
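The lambda-lifting variant can be sketched in today's Python by doing the lifting by hand. Everything here is illustrative: the `__lazy_and__` method name, the `lifted_and` helper and the `Expr` class are assumptions, chosen so as not to collide with the real `__and__` used by `&`.

```python
def lifted_and(a, b_thunk):
    """Stand-in for a compiler that rewrites `a and b` as a call with `lambda: b`."""
    method = getattr(type(a), "__lazy_and__", None)
    if method is None:
        # Default: ordinary short-circuit behaviour.
        return b_thunk() if a else a
    return method(a, b_thunk)


class Expr:
    """Hypothetical query-builder node that always wants both operands."""

    def __init__(self, text):
        self.text = text

    def __lazy_and__(self, b_thunk):
        rhs = b_thunk()  # the receiver chooses to evaluate the right side
        return Expr(f"({self.text} AND {rhs.text})")


query = lifted_and(Expr("user_id = 7"), lambda: Expr("month = 11"))
print(query.text)                      # (user_id = 7 AND month = 11)
print(lifted_and(0, lambda: 1 / 0))    # 0, the thunk is never called
```

The design point is that the receiving object, not the language, decides whether the right-hand side is ever evaluated.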

On Nov 23, 2015 22:34, "Random832" <random832@fastmail.com> wrote:
No, I disagree -- for us short circuiting makes overriding extremely difficult, probably impossible, because in python in general and in the use cases being discussed here in particular, the right-hand side argument should have a chance to overload. It's going to be brutal on newbies and general code comprehensibility if 'True and array_of_bools' or 'True and sqlalchemy_expression' act totally differently than the mirrored versions, but this is unavoidable if you have both short circuiting and overloading.

(Example of where you'd see code like this:

    if restrict_user_id is not None:
        user_id_constraint = (MyClass.user_id == restrict_user_id)
    else:
        user_id_constraint = True
    if restrict_month is not None:
        month_constraint = (MyClass.month == restrict_month)
    else:
        month_constraint = True
    ...
    return query(MyClass).filter(user_id_constraint and month_constraint and ...)

Yes, this could be written differently, but this pattern comes up fairly often IME. And it's otherwise very reliable; in numpy in particular 'True op array' and 'array(True) op array' act identically for every overloaded binary operator.)

-n
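The asymmetry is already visible with today's operators. In the sketch below, `Constraint` is a hypothetical stand-in for a query expression (it is not SQLAlchemy's API); it overloads `&` on both sides, while `and` falls back to ordinary truthiness.

```python
class Constraint:
    """Hypothetical query constraint; True is treated as 'no constraint'."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        if other is True:
            return self
        if isinstance(other, Constraint):
            return Constraint(f"({self.text} AND {other.text})")
        return NotImplemented

    def __rand__(self, other):
        # Called for `True & constraint`, because bool cannot handle it.
        return self if other is True else NotImplemented


c = Constraint("month = 11")

# `&` gives the right-hand operand a chance, so both orders agree.
print((True & c) is c, (c & True) is c)    # True True

# `and` only consults the left operand, so the orders disagree.
print((True and c) is c)                   # True
print((c and True) is True)                # True: the constraint is silently lost
```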

To everyone claiming that you can't overload and/or because they are shortcut operators, please re-read PEP 335. It provides a clean solution -- it was rejected because it adds an extra byte code to all code using those operators (the majority of which don't need it). -- --Guido van Rossum (python.org/~guido)

On Nov 24, 2015 10:21 AM, "Guido van Rossum" <guido@python.org> wrote:
The semantic objection that I raised -- short circuiting means that you can't correctly overload 'True and numpy_array', because unlike all other binops the overload must be defined on the left hand argument -- does apply to PEP 335 AFAICT. This problem is IMHO serious enough that even if PEP 335 were accepted today I'm not entirely sure that numpy would actually implement the overloads due to the headaches it would cause for teaching and code review -- we'd have to have some debate about it at least. (Possibly useful analogy: Having to always double check that the array argument appears on the left rather than the right is kinda like how the old 'a and b or c' idiom forces you to constantly keep an extra special rule in the back of your head and always double check that b cannot be falsey whenever you read or write it, which IIUC was one of the major reasons why it was considered an insufficient substitute for a real ternary operator.) -n

On Nov 24, 2015 11:53 AM, "Guido van Rossum" <guido@python.org> wrote:
We'd certainly love it if there were a better alternative, but -- speaking just for myself here -- I've hesitated to wade in because I don't have any brilliant ideas to contribute :-). The right-hand-side overload problem seems like an inevitable consequence of short-circuiting, and we certainly aren't going to switch 'and'/'or' to become eagerly evaluating. OTOH none of the alternative proposals mooted so far have struck me as very compelling or pythonic, if only because all the proposed spellings are ugly, but, who knows, sometimes something awesome appears deep in these threads.

Two thoughts on places where it might be easier to make some progress...

- the 'not' overloading proposed in PEP 335 doesn't seem to create any horrible problems - it'd be a minor thing, but maybe it's worth pulling out as a standalone change?

- the worst code expansion created by lack of overloading isn't

      a == 1 and b == 2

  becoming

      (a == 1) & (b == 2)

  but rather

      0 < complex expression < 1

  becoming

      tmp = complex expression
      (0 < tmp) & (tmp < 1)

  That is, the implicit 'and' inside chained comparisons is the biggest pain point. And this issue is orthogonal to the proposals that involve adding new operators, b/c they only help with explicit 'and'/'or', not implicit 'and'. Which is why I was sounding you out about making chained comparisons eagerly evaluated at the bar at pycon this year ;-). I have the suspicion that the short-circuiting semantics of chained comparisons are more surprising and confusing than they are useful and we should just make them eagerly evaluated, and I know you have the opposite intuition, so I think the next step here would be to collect some data (run a survey, scan some code, ...?) to figure out which of us is right :-).

The latter idea in particular has been on my todo list for at least 6 months and still has not bubbled up near the top, so if anyone is interested in pushing it forward then please feel free :-).

-n
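The chained-comparison expansion can be reproduced without NumPy. `Vec` below is a hypothetical stand-in for an array type: its comparisons return elementwise results and, like a NumPy array with more than one element, it refuses to be reduced to a single truth value.

```python
class Vec:
    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return Vec(v < other for v in self.values)

    def __gt__(self, other):
        return Vec(v > other for v in self.values)

    def __and__(self, other):
        return Vec(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        raise ValueError("the truth value of a Vec is ambiguous")


tmp = Vec([-0.5, 0.5, 1.5])

# Explicit form: both comparisons run, then `&` combines them elementwise.
mask = (0 < tmp) & (tmp < 1)
print(mask.values)    # [False, True, False]

# Chained form: Python evaluates `(0 < tmp) and (tmp < 1)`, which calls bool().
try:
    0 < tmp < 1
except ValueError as exc:
    print("chained comparison failed:", exc)
```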

On 26 November 2015 at 06:42, Nathaniel Smith <njs@pobox.com> wrote:
Regardless of which is more useful it would be a very subtle backwards compatibility break. I imagine that code that relies on the short-circuit here is rare. I'm confident that it is much rarer than numpy-ish code that works around this in the way you showed above or just by evaluating the complex expression twice as in

    (0 < complex expression) & (complex expression < 1)

which is what I normally write when the expression isn't too long. But there's guaranteed to be some breakage and a good migration path to mitigate that is unclear. You could add a __future__ import and a warning mode to detect when short-circuiting happens in chained comparisons. Unfortunately the naive implementation of the warning mode would simply trigger on 50% of chained comparisons (every time the left hand relation is False).

I would definitely use chained comparisons for numpy arrays if it were possible but at the same time I don't think the status quo on this is that bad. It's a bit of a gotcha for new numpy users to learn but numpy is good at giving the appropriate error messages:

    >>> from numpy import array
    >>> 1 < array([1, 2, 3]) < 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    >>> 1 < array([1, 2, 3]) and 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

If you google that error message there's lots of SO posts etc. that can explain in more detail.

It's also worth noting since we're comparing with Matlab that Matlab actually has both short-circuit logical && and element-wise logical & equivalent to Python's and/& respectively. And Matlab doesn't have chained comparisons or rather they don't do anything nearly as useful as Python's chained comparisons i.e. in Matlab -2 < -1 < 0 is evaluated as:

    -2 < -1 < 0
    (-2 < -1) < 0
    1 < 0
    0

(i.e. False) which is basically useless and doesn't even give a decent error message like numpy does.

-- Oscar

On Wed, Nov 25, 2015 at 10:42 PM, Nathaniel Smith <njs@pobox.com> wrote:
This seems pretty harmless, and mostly orthogonal to the rest -- except that overloadable 'not' is not very attractive or useful by itself if we decide not to address the others.
What to do with chaining comparisons is a very good question, but changing them to be non-short-circuiting would cause a world of backwards incompatible pain. For example, I've definitely written code where I carefully arranged the comparisons so that the most expensive one comes last (in hopes of sometimes avoiding the time spent on it if the outcome is already determined). This is particularly easy with chained ==, but you can sometimes also change a < b < c into c > b > a, if a happens to be the expensive one. However, note that PEP 335 *does* address the problem of chained comparisons head-on -- in a < b < c, if a < b returns a numpy array (or some other special object that's not falsey), it will then proceed to compute b < c and combine the two using the overloading of the default 'and'; because the first result is not a simple bool, the problem you described with `True and numpy_array` does not apply. So, maybe you and the numpy community can ponder PEP 335 some more? Honestly, from the general language design POV (i.e., mine :-), PEP 335 feels more acceptable than introducing new non-short-circuit and/or operators. How common would the `True and numpy_array` problem really be? I suppose any real occurrences would not use the literal True; `True and x` is just a wordy way to spell x, and doing this element-wise would just return the array x unchanged. (Or would it cast the elements to bool? That still feels like a unary operator to me that deserves a more direct spelling.) I don't see a use case for a literal left operator in symbolic algebra or SQL either. But let's assume we have some scalar expression that evaluates to a bool. Even then, `x and numpy_array` feels like a clumsy way to spell `numpy_array if x else <an array of the same shape filled with False>`. But I suppose you've thought about this more than I have. :-) -- --Guido van Rossum (python.org/~guido)
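The short-circuiting that this kind of code relies on is easy to observe. In this small sketch the `expensive` function and its recorded calls are purely illustrative.

```python
calls = []


def expensive():
    calls.append("expensive")
    return 10


first = 5 < 3 < expensive()    # 5 < 3 is False, so expensive() is skipped
calls_after_first = list(calls)
second = 1 < 3 < expensive()   # 1 < 3 is True, so expensive() must run

print(first, calls_after_first)    # False []
print(second, calls)               # True ['expensive']
```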

On 2015-11-24, Nathaniel Smith wrote:
The semantic objection that I raised -- short circuiting means that you can't correctly overload 'True and numpy_array',
Well, you mean "True or" or "False and", but anyway... I've got to confess, I don't really understand the semantic paradigm under which it's appropriate to do this at all, but it's not appropriate for it to return True. If you can pass in True here, that implies that the True you'll get out of this operation is an appropriate value for using in further operations on similarly-shaped arrays. What does it _matter_ that you get True (or 1, etc) instead of [True, True, True, True, True, True]?

What does it _matter_ that you get True (or 1, etc) instead of [True, True, True, True, True, True]?
Sometimes you want to know that everything in an array is True (which is common enough that numpy.alltrue() exists). But other times you want to know which items in an array are true, and which are not -- most often to use as a Boolean mask -- in which case, an array that happens to have all true values is a different beast altogether from a single True. And remember that there could be arrays of any number of dimensions. Also: Python truthiness rules define an empty container as False, and any non-empty container as True -- it's probably better not to make arrays unique in that regard. -CHB
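The difference between a mask and a single truth value shows up even with plain lists. This sketch uses only the standard library; `itertools.compress` plays the role of boolean-mask indexing here, as a rough analogy rather than a description of how NumPy works.

```python
from itertools import compress

data = [10, 20, 30]
mask = [True, False, True]

print(list(compress(data, mask)))   # [10, 30]: the mask selects per element
print(all(mask))                    # False: a single summary value

# Container truthiness ignores the contents entirely.
print(bool([]), bool([False]))      # False True
```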

On 2015-11-24, Brendan Barnwell wrote:
Why? I am asking specifically because *all* elements of a are true, so all elements of the result will also be true, and taken from a. Clearly [0, 1] should *not* return true for a.__true__(), which is why I didn't use __bool__. But the elementwise operation [1, 1] magic_or [0, 2] returns [1, 1].

On 2015-11-24 11:07, Random832 wrote:
I guess I'm not understanding what you mean by __true__ then. What is this __true__ for which ([0, 1]).__true__() is not true? -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On 2015-11-24, Brendan Barnwell wrote:
I guess I'm not understanding what you mean by __true__ then. What is this __true__ for which ([0, 1]).__true__() is not true?
It'd be a new operator / magic-function, so whatever the person writing the class wants it to be. I specifically didn't use bool().

    def __true__(self):
        # return True only if you want "or" to short-circuit
        pass
participants (19)

- Andrew Barnert
- Brendan Barnwell
- Bruce Leban
- Chris Angelico
- Chris Barker
- Chris Barker - NOAA Federal
- Emanuel Barry
- Greg Ewing
- Guido van Rossum
- Jelte Fennema
- Michael Selik
- MRAB
- Nathaniel Smith
- Nick Coghlan
- Oscar Benjamin
- Random832
- Ryan Gonzalez
- Sjoerd Job Postmus
- Stephen J. Turnbull