Wither PEP 335 (Overloadable Boolean Operators)?
While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean Operators) by Greg Ewing. I am of two minds about this -- on the one hand, it's been a long time without any working code or anything. OTOH it might be quite useful to e.g. numpy folks. It is time to either reject it due to lack of interest, or revive it! -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean Operators) by Greg Ewing. I am of two minds about this -- on the one hand, it's been a long time without any working code or anything. OTOH it might be quite useful to e.g. numpy folks.
This kind of feature would also be useful for ORMs, to be able to map boolean operators to SQL. Johan
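Johan's ORM point is exactly the workaround that exists today: since `and`/`or` cannot be overloaded, SQL-generating libraries overload the bitwise operators `&`, `|`, and `~` instead. A minimal sketch of that pattern (the `Cond` class here is hypothetical, not from any particular ORM):

```python
class Cond:
    """A tiny SQL condition tree. Because 'and'/'or'/'not' cannot be
    overloaded without PEP 335, ORMs map the bitwise operators to
    SQL's AND/OR/NOT instead."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        return Cond(f"({self.sql} AND {other.sql})")

    def __or__(self, other):
        return Cond(f"({self.sql} OR {other.sql})")

    def __invert__(self):
        return Cond(f"(NOT {self.sql})")

c = Cond("age > 18") & (Cond("name = 'Bob'") | ~Cond("active"))
print(c.sql)  # (age > 18 AND (name = 'Bob' OR (NOT active)))
```

The catch, and the motivation for the PEP, is that users must remember the different precedence of `&`/`|` versus `and`/`or` and parenthesize accordingly.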
"Guido van Rossum" <guido@python.org> wrote in message news:ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com...
| While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean
| Operators) by Greg Ewing. I am of two minds about this -- on the one
| hand, it's been a long time without any working code or anything. OTOH
| it might be quite useful to e.g. numpy folks.
|
| It is time to either reject it due to lack of interest, or revive it!

Rejection would currently be fine with me, so I do not intend this to indicate 'revived interest'. But having looked at the PEP now, I have some comments in case anyone else has such.

Rationale: if the only extra time without the overloads is the check of Py_TPFLAGS_HAVE_BOOLEAN_OVERLOAD, then I suppose it might be 'not appreciable', but in my ignorance, I would be more persuaded by timing data ;-)

Motivation: another 'workaround' is the one used in symbolic logic, dating back to Boole himself: '+' for 'or' and '*' for 'and'. But these are also needed for arithmetic in your motivating cases. A counter-motivation is that this proposal makes it even harder to reason about the behavior of Python programs. It adds one more caveat to stick in the back of one's mind. Other overloads do the same, but to me, overloading logic cuts a bit deeper.

Special Methods: the proposal strikes me as clever but excessively baroque. In the absence of justification for the complexities, here is a *much* simpler version. Delete the special casing of NotImplemented. __not__: substitute for the default semantics when present. Delete NeedOtherOperand (where would it even live?). Its current spelling is True for 'and' and False for 'or', as with the standard semantics. A class that does not want short-circuiting, as in your motivating cases, simply defines __and1__ or __or1__ to return True or False. If the return value of __xxx1__ is not True/False, then it is the result. (I believe this matches your spec.) So the LOGICAL_XXX_1 bytecodes check for True/False identity without calling bool() on the result.

Delete the reverse methods. They are only needed for mixed-type operations, like scalar*matrix, but such seems senseless here. In any case, they are not needed for any of your motivating applications, which would define both methods without mixing. If the second arg is evaluated, the result is __x2__(arg1, arg2) if defined, else arg2 as usual. Delete the 'As a special case' sentence.

Type Slots: someone else can decide whether a new flag and 5 new slots are a significant price.

Python/C API: 5 more functions to learn or ignore, but that does not affect me. I do not understand what they would do or how they relate to the bytecodes.

Other Implementations: any possible problems? (I have no idea.)

Terry Jan Reedy
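One possible reading of Terry's simplified protocol can be sketched in pure Python. The `__and1__`/`__and2__` names are from the PEP; `logical_and` is a hypothetical helper standing in for the proposed LOGICAL_AND_1/LOGICAL_AND_2 bytecodes, and its semantics follow Terry's description rather than the PEP's full spec:

```python
def logical_and(a, make_b):
    """Hypothetical emulation of Terry's simplified 'and' protocol.
    `make_b` is a thunk so the second operand stays unevaluated
    unless needed, preserving short-circuiting by default."""
    hook = getattr(type(a), "__and1__", None)
    if hook is None:
        # No overload: plain short-circuiting 'and'.
        return a if not a else make_b()
    r = hook(a)
    if r is not True:
        # Per Terry: True means "evaluate the other operand";
        # anything else (identity check, no bool() call) is the result.
        return r
    b = make_b()
    hook2 = getattr(type(a), "__and2__", None)
    return hook2(a, b) if hook2 is not None else b

class Flags:
    """Example class that never wants short-circuiting."""
    def __init__(self, bits):
        self.bits = bits
    def __and1__(self):
        return True          # always ask for the second operand
    def __and2__(self, other):
        return Flags(self.bits & other.bits)

x = logical_and(Flags(0b1100), lambda: Flags(0b1010))
print(bin(x.bits))  # 0b1000
```

Ordinary values still short-circuit (`logical_and(0, lambda: 5)` returns `0`), while `Flags` always combines both operands, which is the behavior numpy-style types want.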
On 5/18/07, Guido van Rossum <guido@python.org> wrote:
While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean Operators) by Greg Ewing.
-1. "and" and "or" affect the flow of control. It's a matter of taste, but I feel the benefit is too small here to add another flow-control quirk. I like that part of the language to be simple. Anyway, if this *is* done, logically it should cover "(... if ... else ...)" as well. Same use cases. -j
On 5/18/07, Guido van Rossum <guido@python.org> wrote:
While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean Operators) by Greg Ewing. I am of two minds about this -- on the one hand, it's been a long time without any working code or anything. OTOH it might be quite useful to e.g. numpy folks.
It is time to either reject it due to lack of interest, or revive it!
Last call for discussion! I'm tempted to reject this -- the ability to generate optimized code based on the shortcut semantics of and/or is pretty important to me. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Last call for discussion! I'm tempted to reject this -- the ability to generate optimized code based on the shortcut semantics of and/or is pretty important to me.
Please don't be hasty. I've had to think about this issue a bit.

The conclusion I've come to is that there may be a small loss in the theoretical amount of optimization opportunity available, but not much. Furthermore, if you take into account some other improvements that can be made (which I'll explain below), the result is actually *better* than what 2.5 currently generates.

For example, Python 2.5 currently compiles

    if a and b:
        <stats>

into

    <evaluate a>
    JUMP_IF_FALSE L1
    POP_TOP
    <evaluate b>
    JUMP_IF_FALSE L1
    POP_TOP
    <stats>
    JUMP_FORWARD L2
    L1: POP_TOP
    L2:

Under my PEP, without any other changes, this would become

    <evaluate a>
    LOGICAL_AND_1 L1
    <evaluate b>
    LOGICAL_AND_2
    L1: JUMP_IF_FALSE L2
    POP_TOP
    <stats>
    JUMP_FORWARD L3
    L2: POP_TOP
    L3:

The fastest path through this involves executing one extra bytecode. However, since we're not using JUMP_IF_FALSE to do the short-circuiting any more, there's no need for it to leave its operand on the stack. So let's redefine it and change its name to POP_JUMP_IF_FALSE. This allows us to get rid of all the POP_TOPs, plus the jump at the end of the statement body. Now we have

    <evaluate a>
    LOGICAL_AND_1 L1
    <evaluate b>
    LOGICAL_AND_2
    L1: POP_JUMP_IF_FALSE L2
    <stats>
    L2:

The fastest path through this executes one *less* bytecode than in the current 2.5-generated code. Also, any path that ends up executing the body benefits from the lack of a jump at the end.

The same benefits also result when the boolean expression is more complex, e.g.

    if a or b and c:
        <stats>

becomes

    <evaluate a>
    LOGICAL_OR_1 L1
    <evaluate b>
    LOGICAL_AND_1 L2
    <evaluate c>
    LOGICAL_AND_2
    L2: LOGICAL_OR_2
    L1: POP_JUMP_IF_FALSE L3
    <stats>
    L3:

which contains 3 fewer instructions overall than the corresponding 2.5-generated code.

So I contend that optimization is not an argument for rejecting this PEP, and may even be one for accepting it.

-- Greg
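Greg's listings are hand-written for 2.5, but the short-circuit jump structure he describes is easy to inspect with the stdlib `dis` module on any modern CPython. A sketch (the exact opcode names vary by interpreter version, so treat the printed names as illustrative):

```python
import dis

def f(a, b):
    if a and b:
        return 1
    return 0

# CPython 2.5 emitted JUMP_IF_FALSE/POP_TOP pairs here; modern versions
# fuse them into POP_JUMP_IF_FALSE-style opcodes, much as Greg proposes,
# but the two-jump short-circuit shape is the same.
jumps = [i.opname for i in dis.get_instructions(f) if "JUMP" in i.opname]
print(jumps)
```

On recent CPython this prints two conditional-jump opcodes, one per operand of the `and`.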
On 5/24/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
So I contend that optimization is not an argument for rejecting this PEP, and may even be one for accepting it.
Do you have an implementation available to measure this? In most cases the cost is not in the number of bytecode instructions executed but in the total amount of work. Two cheap bytecodes might well be cheaper than one expensive one. However, I'm happy to keep your PEP open until you have code that we can measure. (However, adding additional optimizations elsewhere to make up for the loss wouldn't be fair -- we would have to compare with a 2.5 or trunk (2.6) interpreter with the same additional optimizations added.) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
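Guido's point is that instruction counts don't settle the question; only timing does. A rough sketch of the kind of measurement he is asking for, using stdlib `timeit` (the `and_` helper is a hypothetical stand-in for the dispatch overhead that LOGICAL_* bytecodes would need to avoid, not the PEP's actual implementation):

```python
import timeit

# Native short-circuit 'and' on the fast (false) path.
t_native = timeit.timeit("a and b", globals={"a": 0, "b": 1},
                         number=200_000)

# Emulating the same logic through a Python-level call, as a crude
# proxy for per-operator dispatch cost.
setup = """
def and_(a, make_b):
    return a if not a else make_b()
a, b = 0, 1
"""
t_dispatch = timeit.timeit("and_(a, lambda: b)", setup=setup,
                           number=200_000)

print(f"native: {t_native:.4f}s  dispatch: {t_dispatch:.4f}s")
```

This measures total work rather than opcode counts, which is exactly the comparison Guido wants: two cheap bytecodes can beat one expensive one, and only a benchmark can tell.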
From a user's POV, I'm +1 on having overloadable boolean operators. In many cases I have had to resort to overloading add or neg instead of and and not. I foresee a lot of cases where an and overload could be used to join objects that represent constraints. Overloadable boolean operators could also be used to implement other types of logic (e.g. fuzzy logic). Constraining them to just the primitive boolean operations will, in my view, be limiting for a myriad of use cases.
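The fuzzy-logic use case Neville mentions can today only be approximated with the bitwise operators, the same workaround the ORM case uses. A sketch with the standard min/max/complement semantics (the `Fuzzy` class is hypothetical):

```python
class Fuzzy:
    """A fuzzy truth value in [0.0, 1.0]. Without PEP 335 we must
    overload & | ~ rather than 'and'/'or'/'not'."""

    def __init__(self, v):
        self.v = v

    def __and__(self, other):   # fuzzy AND: minimum
        return Fuzzy(min(self.v, other.v))

    def __or__(self, other):    # fuzzy OR: maximum
        return Fuzzy(max(self.v, other.v))

    def __invert__(self):       # fuzzy NOT: complement
        return Fuzzy(1.0 - self.v)

warm, humid = Fuzzy(0.7), Fuzzy(0.4)
print((warm & ~humid).v)   # min(0.7, 1.0 - 0.4)
```

With PEP 335, `warm and not humid` could be given the same meaning directly, without the precedence surprises of the bitwise spelling.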
Sure, in some cases one could overload the neg operator instead of not, but semantically they have different meanings.
-- Regards, Neville Grech
participants (6)

- Greg Ewing
- Guido van Rossum
- Jason Orendorff
- Johan Dahlin
- Neville Grech
- Terry Reedy