PEP 335 (overloading boolean operations) and chained comparisons

I had a chance to speak to Travis Oliphant (NumPy core dev) at PyCodeConf and asked him his opinion of PEP 335. His answer was that he didn't really care about overloading boolean operations in general (the bitwise operation overloads with the appropriate objects in the arrays were adequate for most purposes), but the fact that chained comparisons don't work for NumPy arrays was genuinely annoying.

That is, if you have a NumPy array, you cannot write:

    x = A < B < C

Since, under the covers, that translates to:

    x = A < B and B < C

and the result of the first operation will be an array and hence always true, so 'x' receives the value 'True' rather than an array with the broadcast chained comparison.

Instead, you have to write out the chained comparison explicitly, including the repetition of the middle expression and the extra parentheses to avoid the precedence problems with the bitwise operators:

    x = (A < B) & (B < C)

PEP 335 would allow NumPy to fix that by overriding the logical 'and' operation that is implicit in chained comparisons to force evaluation of the RHS and return the rich result.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
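The desugaring Nick describes can be reproduced without NumPy. The sketch below uses a hypothetical `Vec` class (not part of NumPy or of this thread) whose `<` and `&` work element by element and which, like any plain object, is always truthy. In this toy the chain yields the result of the second comparison only, because `and` returns its right operand when the left one is truthy; either way the elementwise combination is lost. As far as I know, real NumPy arrays with more than one element refuse truth testing with a `ValueError` instead, so there the chain fails loudly rather than silently.

```python
class Vec:
    """Hypothetical stand-in for an array: < and & work element by element."""

    def __init__(self, *items):
        self.items = list(items)

    def __lt__(self, other):
        return Vec(*(x < y for x, y in zip(self.items, other.items)))

    def __and__(self, other):
        return Vec(*(x and y for x, y in zip(self.items, other.items)))

    def __repr__(self):
        return f"Vec{tuple(self.items)}"


A = Vec(1, 5, 3)
B = Vec(2, 4, 6)
C = Vec(3, 9, 1)

# Equivalent to "(A < B) and (B < C)". The first result is an ordinary
# (truthy) object, so "and" simply hands back the second comparison.
chained = A < B < C

# The explicit spelling combines both comparisons element by element.
explicit = (A < B) & (B < C)

print(chained)   # only reflects B < C
print(explicit)  # reflects both comparisons
```

The parentheses in the explicit form matter: `&` binds more tightly than `<`, so `A < B & B < C` would be parsed as the chain `A < (B & B) < C`.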

On Oct 12, 2011, at 5:39 PM, Nick Coghlan wrote:
Have you considered that what-is-good-for-numpy isn't necessarily good for Python as a whole? Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users. In contrast, rich comparisons have burdened everyone (we've paid a price in many ways). The numeric world really needs more operators than Python provides (a matrix multiplication operator for example), but I don't think Python is better-off by letting those needs leak back into the core language one-at-a-time. Raymond

On Thu, Oct 13, 2011 at 2:43 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Yeah, I'm still almost entirely negative on PEP 335 (the discussion of it only started up again because I asked Guido if we could kill it off officially rather than leaving it lingering in Open status indefinitely). I just thought the chained comparisons case was worth bringing up, since the PEP doesn't currently mention it. It's quite a subtle distinction: you can overload the binary operators to create a rich comparison operation, but this overloading isn't effective in the chained comparison case due to the implicit 'and' underlying that syntax. Overall, PEP 335 still seems to be trying to swat a gnat with a sledgehammer, and that's the perspective of someone who has a long history of trying to take out language gnats with sledgehammers of his own ;)
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Oct 13, 2011 at 12:43 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
> Have you considered that what-is-good-for-numpy isn't necessarily good for Python as a whole? Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users. In contrast, rich comparisons have burdened everyone (we've paid a price in many ways).
Rich comparisons added complication to Python, but were a very worthwhile feature. In addition to numpy, they are used by packages like sqlalchemy and sympy to give a more natural syntax to some operations. The introduction of rich comparisons also included the introduction of NotImplemented (IIRC), which adds even more complication but makes it possible to write more powerful code. __cmp__ also had a somewhat odd (though not unique) API, which I many times saw confuse learners. In any event, I don't think rich comparisons affect most users, who very seldom have an excuse to write a set of comparison operators (using __cmp__ or rich comparisons).
> The numeric world really needs more operators than Python provides (a matrix multiplication operator for example), but I don't think Python is better-off by letting those needs leak back into the core language one-at-a-time.
Having used numpy fairly extensively, I disagree. I don't mind having to call a normal function/method like solve, dot, conj, or transpose in circumstances where a language like Matlab would have a dedicated operator. In fact, I could argue that for these things Python's way is superior to Matlab's in particular, as most of these operators have or are related to problematic features or syntax. I do, however, regularly write "(a < b) & (b < c)" and hate it; a little observation reveals it is quite terrible. That being said, I think the fault might be as much numpy's as anything. An API like b.isbetween(a, c) or even (a < b).logicaland(b < c) would probably be nicer than the current typical solution. Though these fall short of being able to write a < b < c, which would be consistent and obvious, they would perhaps be enough to weaken the idea that a semantic change in Python could be beneficial. I'm still not seeing the great harm this will have on normal Python programmers who don't wish to overload boolean operators. Unlike rich comparisons, which deprecated the standard way to do things, in this case the developer using Python can do the exact same thing she was doing all along and get the same results.
Mike
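Mike's remark about NotImplemented can be made concrete with a small sketch. The `Celsius` and `Kelvin` classes below are hypothetical and exist only for illustration. When a rich comparison method returns `NotImplemented`, Python tries the reflected method on the other operand before giving up with a `TypeError`, which is what lets one type add support for comparisons against another without the other type knowing about it.

```python
class Celsius:
    """Hypothetical type that only knows how to compare with itself."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __lt__(self, other):
        if isinstance(other, Celsius):
            return self.degrees < other.degrees
        return NotImplemented  # let Python try the other operand


class Kelvin:
    """Hypothetical type that also knows how to compare with Celsius."""

    def __init__(self, kelvin):
        self.kelvin = kelvin

    def __gt__(self, other):
        if isinstance(other, Kelvin):
            return self.kelvin > other.kelvin
        if isinstance(other, Celsius):
            return self.kelvin - 273.15 > other.degrees
        return NotImplemented


# Celsius.__lt__ declines, so Python falls back to Kelvin.__gt__ with the
# operands swapped.
warmer = Celsius(20) < Kelvin(300)

# When both sides decline, the comparison raises TypeError.
try:
    Celsius(20) < "hot"
    unsupported = False
except TypeError:
    unsupported = True
```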

Not to mention that support of overloadable boolean operations would be very beneficial for some ORMs too. As of now, the only options are to write something like "ops.And(cond1, ops.Or(cond2, cond3))" or to use the bitwise operators, which have a different precedence.
- Yury
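The ORM situation Yury describes might look roughly like the sketch below. `Column` and `Cond` are hypothetical names invented for this illustration and do not correspond to any particular ORM. It shows the bitwise workaround, the parentheses it requires, and what happens when `and` is used by mistake.

```python
class Cond:
    """Hypothetical query condition rendered as SQL-like text."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Cond(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Cond(f"({self.text} OR {other.text})")


class Column:
    """Hypothetical column whose comparisons build Cond objects."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Cond(f"{self.name} = {value!r}")

    def __gt__(self, value):
        return Cond(f"{self.name} > {value!r}")


age = Column("age")
name = Column("name")

# Bitwise operators work, but every comparison needs its own parentheses.
query = (age > 18) & ((name == "ann") | (name == "bob"))

# "and" cannot be overloaded: the first condition is truthy, so it is
# silently dropped and only the second one survives.
dropped = (age > 18) and (name == "ann")

# Without parentheses, & binds first and "18 & name" is attempted.
precedence_error = None
try:
    age > 18 & name == "ann"
except TypeError as exc:
    precedence_error = exc
```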

On Oct 19, 2011, at 12:41 PM, Mike Graham wrote:
> I'm still not seeing the great harm this will have on normal Python programmers who don't wish to overload boolean operators.
It is harmful. The and/or operators are not currently dependent on the underlying objects. They can be compiled and explained simply in terms of if's. They are control flow, not just logic operators. We explain short circuiting once and everybody gets it. But that changes if short-circuiting only happens with certain inputs. It makes it much harder to look at code and know what it does. I'm reminded of the effort to make "is" over-loadable. It finally got shot down because it so profoundly messed with people's understanding of identity and because it would no longer be possible to easily reason about code (i.e. it becomes difficult to assure that simple container code is correct without knowing what kind of objects were going to be stored in the container). In a way, the and/or/not overloading suggestion is worse than rich comparisons because even the simplest "a and b" expression would have to be compiled in a profoundly different way (and the related peephole optimizations would no longer be valid). Everyone would pay the price.
Raymond
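Raymond's point that `and` is control flow "explained in terms of if's" can be shown in a few lines. The helper names `probe` and `and_as_if` are made up for this sketch; the behaviour shown is that of the current, non-overloadable operator.

```python
calls = []


def probe(label, value):
    """Record that this operand was evaluated, then return it."""
    calls.append(label)
    return value


# The right operand is never evaluated because the left one is falsy.
result = probe("left", 0) and probe("right", 1)


def and_as_if(left_thunk, right_thunk):
    """Spell out what "left and right" does, using only an if statement."""
    left = left_thunk()
    if not left:
        return left
    return right_thunk()


# The second thunk would raise ZeroDivisionError, but it is never called.
same = and_as_if(lambda: 0, lambda: 1 / 0)
```

Because the decision to evaluate the right operand depends only on the truth value of the left one, a reader can reason about `a and b` without knowing the types involved. That is the property the overloading proposal would weaken.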

On Wed, Oct 19, 2011 at 4:25 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote: ..
I would even mention the whitespace overloading "idea": http://www2.research.att.com/~bs/whitespace98.pdf Even in C++, not everything can be overloaded. Some applications are best served by a special purpose language rather than by adding features to a general purpose one.

On Oct 19, 2011, at 3:25 PM, Greg Ewing wrote:
> What peephole optimisations are currently applied to boolean expressions?
Here's the comment from Python/peephole.c:

    /* Simplify conditional jump to conditional jump where the
       result of the first test implies the success of a similar
       test or the failure of the opposite test.
       Arises in code like:
       "if a and b:"
       "if a or b:"
       "a and b or c"
       "(a and b) and c"
       x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_FALSE_OR_POP z
          -->  x:JUMP_IF_FALSE_OR_POP z
       x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_TRUE_OR_POP z
          -->  x:POP_JUMP_IF_FALSE y+3
       where y+3 is the instruction following the second test.
    */

Raymond
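One way to see that boolean expressions compile to jumps rather than to method calls is to inspect the bytecode with the standard `dis` module. This is only a sketch: the exact opcode names differ between CPython versions (the ones in the comment above are from the 2011 era), so the code below looks for any jump instruction instead of a specific one.

```python
import dis


def both(a, b):
    return a and b


# Collect the opcode names CPython generated for the function body.
opnames = [ins.opname for ins in dis.get_instructions(both)]

# "a and b" is implemented with a conditional jump over the right operand.
jump_ops = [name for name in opnames if "JUMP" in name]

# No call instruction is involved in evaluating the operator itself.
call_ops = [name for name in opnames if name.startswith("CALL")]

print(opnames)
```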

On 20/10/11 13:16, Raymond Hettinger wrote:
While the existing peephole optimisations wouldn't work as-is, there's no reason that similarly efficient code couldn't be generated, either by peephole or using a different compilation strategy to begin with. There are some comments about this in the draft PEP update that I'll try to get submitted soon. -- Greg

On 19 October 2011 21:25, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
An interesting point is that while the proposal is about overloading the logical operators, many of the arguments in favour are referring to chained comparisons. If the rich comparison mechanisms were somehow extended to cover chained comparisons, would that satisfy people's requirements without needing logical operator overloading? I'm not saying I agree with the idea of overloading chained comparisons either, just wondering if a less ambitious proposal would be of any value. Personally, I have no use for any of this... Paul.

Paul Moore wrote:
Not really -- the matter of chained comparisons was only brought up recently. There's much more behind it than that.
It wouldn't satisfy any of the use cases I had in mind when I wrote the PEP. -- Greg

On Thu, Oct 20, 2011 at 6:25 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Indeed. I actually think adding '&&' and '||' for the binary logical operator purposes described in PEP 335 would be a preferable alternative to messing with the meaning of 'and' and 'or' as flow control expressions. The meaning of chained comparisons could then also be updated accordingly so that "a < b < c" translated to "a < b && b < c" if the result of "a < b" overloaded the logical and operation, but would still short circuit otherwise. I'm not saying I think that's necessarily a *good* idea - I'm just saying I dislike it less than the approach currently proposed by the PEP.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
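The dispatch rule Nick sketches for chained comparisons could be emulated along the following lines. Everything here is an assumption made for illustration: the hook name `__logical_and__`, the `chained_lt` helper, and the `Vec` and `Mask` classes are invented, and no such protocol exists in Python.

```python
def chained_lt(a, b, c):
    """Emulate "a < b < c" under the hypothetical '&&' rule."""
    first = a < b
    hook = getattr(type(first), "__logical_and__", None)  # hypothetical hook
    if hook is None:
        # Ordinary objects keep today's short-circuit behaviour.
        return first and (b < c)
    # Overloading types get both comparisons evaluated and combined.
    return hook(first, b < c)


class Mask:
    """Hypothetical elementwise result that overloads the logical and."""

    def __init__(self, flags):
        self.flags = list(flags)

    def __logical_and__(self, other):
        return Mask(x and y for x, y in zip(self.flags, other.flags))


class Vec:
    """Hypothetical array-like whose < returns a Mask."""

    def __init__(self, *items):
        self.items = list(items)

    def __lt__(self, other):
        return Mask(x < y for x, y in zip(self.items, other.items))


# Plain ints still short-circuit: "2 < None" is never evaluated.
plain = chained_lt(3, 2, None)

# Vec results are combined element by element.
rich = chained_lt(Vec(1, 5, 3), Vec(2, 4, 6), Vec(3, 9, 1))
```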

Mike Graham wrote:
> I do, however, regularly write "(a < b) & (b < c)" and hate it; a little observation reveals it is quite terrible.
It might not be the nicest syntax ever, but I still find this quite readable. Of course 'a < b < c' looks nicer, but it's not that big a deal.
Just for the record, NumPy already allows the syntax

    logical_and(a < b, b < c)

Cheers, Sven

Raymond Hettinger wrote:
> Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users.
Don't forget complex numbers, added simultaneously, meshing very well, and not deserving the name of "trick" imo.
-- BB

participants (9)
- Alexander Belopolsky
- Boris Borcic
- Greg Ewing
- Mike Graham
- Nick Coghlan
- Paul Moore
- Raymond Hettinger
- Sven Marnach
- Yury Selivanov