Mailman 3 Re: [Python-ideas] PEP 532: A circuit breaking operator and protocol - Python-ideas

Nick Coghlan

November 2016

2:11 p.m.

New subject: PEP 532: A circuit breaking operator and protocol

On 14 November 2016 at 19:01, Ryan Fox <ryan@rcfox.ca> wrote:

(Hi, I'm a first-time poster. I was inspired by Raymond Hettinger's keynote at PyCon CA to look at new PEPs and comment on them. Hopefully, I'm not committing any faux-pas in my post!)

1) I don't think this proposal sufficiently handles falsy values.

What is the expected value of x in the following?

x = exists(0) else 1

Since the PEP specifically states that exists() is to check that the input is not None, as a user, I would expect x == 0. However, from my interpretation, it appears that the value 0 would still be directly evaluated for truthiness and we would get x == 1.

No, the conditional branching would be based on exists.__bool__ (or, in the current working draft, is_not_none.__bool__), and that would be "0 is not None", which would be True and hence short-circuit.

...

I don't fully get what the purpose of __then__ and __else__ are meant to be, but it seems like instead of this:

type(_lhs).__then__(_lhs) if _lhs else type(_lhs).__else__(_lhs, RHS)

you would want:

LHS if type(_lhs).__then__(_lhs) else RHS

`__then__` is responsible for *unwrapping* the original value from the circuit breaker when it short-circuits: it's what allows the overall expression to return "0", even though the truth check is done based on "0 is not None".
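To make the unwrapping step concrete, here is a minimal sketch that emulates the quoted expansion in today's Python. The `else` operator is only proposed syntax, so a helper function (`circuit_else`, a name invented for this illustration) stands in for it, and the `exists` class below is a guess at what such a circuit breaker could look like, not the PEP's reference implementation.

```python
class exists:
    """Hypothetical circuit breaker: true when the wrapped value is not None."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        # The branch decision is "value is not None", not the value's own truthiness.
        return self.value is not None

    def __then__(self):
        # Short-circuit case: unwrap and return the original value.
        return self.value

    def __else__(self, other):
        # Non-short-circuit case: the result is the right-hand side.
        return other


def circuit_else(lhs, rhs_thunk):
    """Emulate `LHS else RHS` using the expansion quoted in the thread.

    The right-hand side is passed as a zero-argument callable so that it is
    only evaluated when the left-hand side does not short-circuit.
    """
    if lhs:
        return type(lhs).__then__(lhs)
    return type(lhs).__else__(lhs, rhs_thunk())


x = circuit_else(exists(0), lambda: 1)     # stands in for: exists(0) else 1
y = circuit_else(exists(None), lambda: 1)  # stands in for: exists(None) else 1
print(x, y)  # 0 1
```

Because `__bool__` answers "is not None" while `__then__` hands back the wrapped value, the first expression comes out as 0 rather than 1 or True.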

...

Where __then__ returns a simple True/False result. (Maybe the name __then__ doesn't make sense in that case.)

We already have a method for that: __bool__. However, it has exactly the problem you describe, which is why "0 or expr" will never short-circuit (it always evaluates to expr), and why "(0 is not None) or expr" will return "True".
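Both behaviours can be checked in current Python. In this small sketch `expr` is a hypothetical stand-in for any right-hand side, and the left operand is held in a variable so the identity test is not applied to a literal.

```python
def expr():
    """Stand-in for an arbitrary right-hand side (hypothetical)."""
    return "fallback"


value = 0

# `or` branches on the truthiness of the operand itself, so a falsy 0
# is never the result: the right-hand side is evaluated and returned.
plain_or = value or expr()

# Testing "is not None" first keeps the branch decision correct, but the
# result is the boolean from the comparison, not the original value.
guarded_or = (value is not None) or expr()

print(plain_or)    # fallback
print(guarded_or)  # True
```

Neither spelling yields 0, which is the gap the `__then__` unwrapping step is meant to fill.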

...

2) My initial reaction was that `else` doesn't belong in an expression, but I guess there's already precedent for that. (I actually wasn't aware of the `x if y else z` expression until I read this PEP!)

I'm already not a fan of the overloading of else in the cases of for/else and try/else. (Are there other uses? It's a hard thing to search on Google...) Now, we're going to have `else`s that aren't anchored to other statements. You've always known that an `else` belonged to the preceding if/for/try at the same level of indentation. (Or in the same expression.) Is there a chance of a missing colon silently changing the meaning of code?

if foo(): a = 1 else bar()

(Probably not...)

No, due to Python's line continuation rules - you'd also need parentheses or a backslash to avoid getting a SyntaxError on the unfinished line. It does create ambiguities around conditional expressions though, which is why that comes up as one of the main pragmatic concerns with the idea.

...

Are two missing spaces too outrageous?

x = yiffoo() else bar()

I'm not 100% sure, but I think that a current parser would see that as a syntax error very early, as opposed to having to wait for it to try to find 'yiffoo' at run-time.

That style of error is already possible with the other keyword based operators:

x = notfoo()
y = xorfoo()
z = yandfoo()

As you note, the main defense is that this will usually be a name error, picked up either at runtime or by a static code analyser.
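The "name error at runtime" behaviour for an existing keyword operator can be observed directly. This is a small sketch using only built-ins; the source string is the first example from the reply above.

```python
source = "x = notfoo()"  # intended: x = not foo()

# Compilation succeeds, because `notfoo` is a perfectly valid identifier.
code = compile(source, "<example>", "exec")

# The mistake only surfaces when the statement runs and the name lookup fails.
try:
    exec(code, {})
except NameError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

A static analyser that flags undefined names would report the same problem without running the code.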

...

3) Towards the end, you propose some very Perl-like syntax:

print(some_expensive_query()) if verbosity > 2

This seems completely unrelated to the rest of the PEP, and will likely invite people to propose an `unless` operator if implemented. (Followed by an `until` statement.) :)

While it does read more naturally like English, it seems superfluous in a programming language.

De Morgan's laws [1] mean that 'and' and 'or' are technically redundant with each other, as given 'not', you can always express one in terms of the other:

X and Y --> not ((not X) or (not Y))
X or Y --> not ((not X) and (not Y))

However, writing out the laws like that also makes it clear why they're not redundant in practice: the inverted forms involve double-negatives that make them incredibly hard to read.

Those rules impact this PEP by way of the fact that in "LHS if COND else RHS", the "if" and "else" are actually in the same logical relation to each other as "and" and "or" are in "COND and LHS or RHS".

Accordingly, if "LHS if COND else RHS" were to be reformulated as a compound instruction built from two binary instructions (akin to the way comparison chaining works) as considered in the "Risks and Concerns" section about the language level inconsistencies that the current draft introduces, then we'd expect De Morgan's laws to hold there as well:

Y if X --> not ((not X) else (not Y))
X else Y --> not ((not Y) if (not X))

It hadn't occurred to me to include that observation in the PEP while updating it to switch to that base design, but it really should be there as an additional invariant that well-behaved symmetric circuit breakers should adhere to.

[1] https://en.wikipedia.org/wiki/De_Morgan%27s_laws
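The two classical laws are easy to confirm exhaustively for boolean operands. The sketch below does that, and also shows a known caveat of the "COND and LHS or RHS" spelling: it only matches the conditional expression while LHS is truthy. The variable names are illustrative only.

```python
from itertools import product

# Exhaustively check both De Morgan rewrites over every boolean pair.
checked = []
for x, y in product([False, True], repeat=2):
    assert (x and y) == (not ((not x) or (not y)))
    assert (x or y) == (not ((not x) and (not y)))
    checked.append((x, y))

# The and/or idiom falls through to RHS when LHS is falsy,
# whereas the conditional expression returns LHS as written.
cond, lhs, rhs = True, 0, "rhs"
conditional_result = lhs if cond else rhs
and_or_result = cond and lhs or rhs

print(len(checked))        # 4
print(conditional_result)  # 0
print(and_or_result)       # rhs
```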

...

4) The proposal shows how it fixes some common pain points:

value = missing(obj) else obj.field.of.interest
value = missing(obj) else obj["field"]["of"]["interest"]

But it doesn't address very similar ones:

missing(obj) else missing(obj.field) else missing(obj.field.of) else obj.field.of.interest
obj.get('field', {}).get('of', {}).get('interest')

(The first example shows how it would be handled with the PEP in its current state.)

Maybe these are too far out of scope, I'm not sure. They feel very similar to me though.

The current draft already indicates it doesn't aim to compete with the "?." or "?[]" proposals in PEP 505 (which handle these two cases), and the next draft drops the competition with "??" as well. That way, the proposed circuit breakers for the PEP 505 cases can just be "operator.is_none" and "operator.is_not_none".
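For comparison, the chained attribute case can be written in current Python with a small helper. This is only a sketch: `get_path` is a name made up for this illustration (it is not part of PEP 532 or PEP 505), and it stops at the first None in the same way the chained `missing(...) else ...` expression would.

```python
from types import SimpleNamespace


def get_path(obj, *names):
    """Follow attribute names in order, returning None as soon as a step is None.

    Hypothetical helper for illustration only.
    """
    current = obj
    for name in names:
        if current is None:
            return None
        current = getattr(current, name)
    return current


full = SimpleNamespace(field=SimpleNamespace(of=SimpleNamespace(interest=42)))
partial = SimpleNamespace(field=None)

print(get_path(full, "field", "of", "interest"))     # 42
print(get_path(partial, "field", "of", "interest"))  # None
print(get_path(None, "field", "of", "interest"))     # None
```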

...

I hope these are useful comments and not too nit-picky.

They were very helpful, and picked up a key technical point that I'd missed in the revised proposal I'm currently working on.

I think I also need to restore a diagram that Mark E. Haase drew for an earlier draft of the PEP that may make it easier for folks to visualise the related control flow.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia


Nick Coghlan

8:18 a.m.

New subject: PEP 532: A circuit breaking operator and protocol

On 15 November 2016 at 17:13, Ryan Fox <ryan@rcfox.ca> wrote:

...

I'm worried that the distinction between `or` and `else` will not be obvious. It seems like `else` will effectively just be `or`, but with more functionality.

The next draft makes that explicit: "and", "or" and PEP 505's "??" would all just be syntactic sugar for "else" combined with particular circuit breakers.
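One way to picture "sugar for `else` plus a circuit breaker" is the emulation below. Everything in it is an assumption made for illustration: the breaker classes (`truthy`, `falsy`, `not_none`) and the `circuit_else` helper are invented names, and the expansion follows the `__then__`/`__else__` form quoted earlier in the thread rather than any final specification.

```python
class _Breaker:
    """Hypothetical base class: wraps a value; subclasses define __bool__."""

    def __init__(self, value):
        self.value = value

    def __then__(self):
        return self.value  # short-circuit: unwrap the left operand

    def __else__(self, other):
        return other       # otherwise the result is the right operand


class truthy(_Breaker):
    def __bool__(self):
        return bool(self.value)


class falsy(_Breaker):
    def __bool__(self):
        return not self.value


class not_none(_Breaker):
    def __bool__(self):
        return self.value is not None


def circuit_else(lhs, rhs_thunk):
    """Stand-in for the proposed `LHS else RHS`; RHS is evaluated lazily."""
    if lhs:
        return type(lhs).__then__(lhs)
    return type(lhs).__else__(lhs, rhs_thunk())


def emulated_or(x, y_thunk):        # x or y
    return circuit_else(truthy(x), y_thunk)


def emulated_and(x, y_thunk):       # x and y
    return circuit_else(falsy(x), y_thunk)


def emulated_coalesce(x, y_thunk):  # x ?? y (PEP 505 spelling)
    return circuit_else(not_none(x), y_thunk)


# The emulations agree with the built-in operators on a few sample operands.
for x in (0, 3, "", "a", None):
    assert emulated_or(x, lambda: 5) == (x or 5)
    assert emulated_and(x, lambda: 5) == (x and 5)

print(emulated_coalesce(0, lambda: 5), emulated_coalesce(None, lambda: 5))  # 0 5
```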

...

I'm also still not convinced about the reasons to avoid implementing this on `or`. I'll address the points from the rationale:

...
defining a shared protocol for both `and` and `or` was confusing, as __then__ was the short-circuiting outcome for `or`, while __else__ was the short-circuiting outcome for `and`

I wonder: Could the protocol be defined in terms of `or`, with DeMorgan's law applied behind the scenes in the case of `and`?

ie: existing(x) and existing(y) => missing(y) or missing(x)

The next draft reverts to the symmetric API proposal from pre-publication drafts, so this part of the rationale is gone.

...

...
the `and` and `or` operators have a long established and stable meaning, so readers would inevitably be surprised if their meaning now became dependent on the type of the left operand. Even new users would be confused by this change due to 25+ years of teaching material that assumes the current well-known semantics for these operators

With basic pass-through implementations for __then__ and __else__ attached to all classes by default, and the existing __bool__, it seems like `or` would continue to function in the same way it currently does.

Except it would be a *lot* slower (as in, an-order-of-magnitude slower, not a-few-percent slower). The forced call to __bool__() in the second example below hints at the likely cost of bypassing the existing optimised fast paths for conditions that produce a boolean result:

$ python -m perf timeit -s "lhs = True; rhs = False" "lhs and rhs"
.....................
Median +- std dev: 16.6 ns +- 3.3 ns

$ python -m perf timeit -s "lhs = True; rhs = False" "lhs.__bool__() and rhs"
.....................
Median +- std dev: 113 ns +- 18 ns

Accordingly, we want interpreter implementations to be able to readily distinguish between "normal" conditions (which would continue to just be evaluated as boolean values in order to determine which branch to take) and circuit breakers (which want to be able to further influence the result *after* the interpreter has determined which branch to evaluate).
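The same comparison can be approximated with the standard-library `timeit` module instead of the `perf` command shown above. This is only a rough sketch: the loop and repeat counts are arbitrary choices, and the absolute numbers will differ by machine and interpreter version, so no expected output is given.

```python
import timeit

setup = "lhs = True; rhs = False"
number = 200_000  # loops per measurement (arbitrary)

# Take the best of several repeats to reduce noise from other processes.
fast = min(timeit.repeat("lhs and rhs", setup=setup, number=number, repeat=5))
slow = min(timeit.repeat("lhs.__bool__() and rhs", setup=setup, number=number, repeat=5))

print(f"lhs and rhs:            {fast / number * 1e9:.1f} ns per loop")
print(f"lhs.__bool__() and rhs: {slow / number * 1e9:.1f} ns per loop")
```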

...

There are plenty of current dunder methods that are already redefined in ways that might confuse people: % on strings, set operators, etc.

None of those cases introduced a protocol method into an operation that didn't previously use one - they instead borrowed existing protocol driven operators for their own purposes.

...

...
Python interpreter implementations, including CPython, have taken advantage of the existing semantics of `and` and `or` when defining runtime and compile time optimisations, which would all need to be reviewed and potentially discarded if the semantics of those operations changed

I can't really speak to any of this, not being familiar with the internals of any implementation. Though, it might work out that some of the code for handling `and` and `or` could be thrown out, since those operators would be transformed into conditional expressions.

That's exactly the kind of outcome we *don't* want.

...

I very much understand the desire to not break working, optimized implementations. However, this feels a little flimsy as a reason for introducing new syntax.

The language-design-driven reason is that "and" and "or" are terms drawn from boolean logic, and hence can reasonably be expected to implement that. We absolutely *could* say that they don't *necessarily* implement boolean logic anymore, just as mathematical operators don't necessarily represent the traditional arithmetic operations, but I'd personally prefer the status quo to that possible outcome.

The first draft of PEP 532 *did* propose doing things that way, though:
https://github.com/python/peps/commit/3378b942747604be737eb627df085979ff61b6...

I never posted that version here, as I didn't really like it myself, and had in fact already rewritten it to the current proposal by the time I merged it into the main PEPs repo:
https://github.com/python/peps/commit/8f095cf8c0ccd4bf770e933a21e04b37afc53c...

:)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia


