PEP 532: A circuit breaking operator and protocol

Hi folks, As promised, here's a follow-up to the withdrawn PEP 531 that focuses on coming up with a common protocol driven solution to conditional evaluation of subexpressions that also addresses the element-wise comparison chaining problem that Guido noted when rejecting PEP 335. I quite like how it came out, but see the "Risks & Concerns" section for a discussion of an internal inconsistency it would introduce into the language if accepted as currently written, and the potentially far-reaching consequences actually resolving that inconsistency might have on the way people write their Python code (if the PEP was subsequently approved). Regards, Nick. ================================= PEP: 532 Title: A circuit breaking operator and protocol Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan <ncoghlan@gmail.com> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 30-Oct-2016 Python-Version: 3.7 Post-History: 5-Nov-2016 Abstract ======== Inspired by PEP 335, PEP 505, PEP 531, and the related discussions, this PEP proposes the addition of a new protocol-driven circuit breaking operator to Python that allows the left operand to decide whether or not the expression should short circuit and return a result immediately, or else continue on with evaluation of the right operand:: exists(foo) else bar missing(foo) else foo.bar() These two expressions can be read as: * "the expression result is 'foo' if it exists, otherwise it is 'bar'" * "the expression result is 'foo' if it is missing, otherwise it is 'foo.bar()'" Execution of these expressions relies on a new circuit breaking protocol that implicitly avoids repeated evaluation of the left operand while letting that operand fully control the result of the expression, regardless of whether it skips evaluation of the right operand or not:: _lhs = LHS type(_lhs).__then__(_lhs) if _lhs else type(_lhs).__else__(_lhs, RHS) To properly support logical negation of circuit breakers, a new ``__not__`` 
protocol method would also be introduced, allowing objects to control the result of ``not obj`` expressions.

As shown in the basic example above, the PEP further proposes the addition of builtin ``exists`` and ``missing`` circuit breakers that provide conditional branching based on whether or not an object is ``None``, but return the original object rather than the existence checking wrapper when the expression evaluation short circuits.

In addition to being usable as simple boolean operators (e.g. as in ``assert all(map(exists, items))`` or ``if any(map(missing, items)):``), these circuit breakers will allow existence checking fallback operations (aka None-coalescing operations) to be written as::

    value = exists(expr1) else exists(expr2) else expr3

and existence checking precondition operations (aka None-propagating or None-severing operations) to be written as::

    value = missing(obj) else obj.field.of.interest
    value = missing(obj) else obj["field"]["of"]["interest"]

A change to the definition of chained comparisons is also proposed, where the comparison chaining will be updated to use the circuit breaking operator rather than the logical conjunction (``and``) operator if the left hand comparison returns a circuit breaker as its result.

While there are some practical complexities arising from the current handling of single-valued arrays in NumPy, this change should be sufficient to allow elementwise chained comparison operations for matrices, where the result is a matrix of boolean values, rather than tautologically returning ``True`` or raising ``ValueError``.

Relationship with other PEPs
============================

This PEP is a direct successor to PEP 531, replacing the existence checking protocol and the new ``?then`` and ``?else`` syntactic operators defined there with a single protocol driven ``else`` operator and adjustments to the ``not`` operator. The existence checking use cases are taken from that PEP.
It is also a direct successor to PEP 335, which proposed the ability to overload the ``and`` and ``or`` operators directly, with the ability to overload the semantics of comparison chaining being one of the consequences of that change. The proposal in this PEP to instead handle the element-wise comparison use case by changing the semantic definition of comparison chaining is drawn from Guido's rejection of PEP 335. This PEP competes with the dedicated null-coalescing operator in PEP 505, proposing that improved support for null-coalescing operations be offered through a more general protocol-driven short circuiting operator and related builtins, rather than through a dedicated single-purpose operator. It doesn't compete with PEP 505's proposed shorthands for existence checking attribute access and subscripting, but instead offers an alternative underlying semantic framework for defining them: * ``EXPR?.attr`` would be syntactic sugar for ``missing(EXPR) else EXPR.attr`` * ``EXPR?[key]`` would be syntactic sugar for ``missing(EXPR) else EXPR[key]`` In both cases, the dedicated syntactic form could be optimised to avoid actually creating the circuit breaker instance. Specification ============= The circuit breaking operator (``else``) ---------------------------------------- Circuit breaking expressions would be written using ``else`` as a new binary operator, akin to the existing ``and`` and ``or`` logical operators:: LHS else RHS Ignoring the hidden variable assignment, this is semantically equivalent to:: _lhs = LHS type(_lhs).__then__(_lhs) if _lhs else type(_lhs).__else__(_lhs, RHS) The key difference relative to the existing ``or`` operator is that the value determining which branch of the conditional expression gets executed *also* gets a chance to postprocess the results of the expressions on each of the branches. 
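Since ``LHS else RHS`` is not valid syntax in any released Python, the expansion above can only be tried out through emulation. The following is a minimal sketch under stated assumptions: ``circuit_else`` is a hypothetical helper (not part of the proposal) that takes the right operand as a zero-argument callable so it is only evaluated when needed, and the ``exists`` class here is a simplified stand-in for the builtin the PEP defines later.

```python
class exists:
    """Simplified stand-in for the PEP's proposed ``exists`` builtin."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        # Short circuit whenever the wrapped value is not None
        return self.value is not None

    def __then__(self):
        # Short-circuit branch: unwrap and return the original value
        return self.value

    def __else__(self, other):
        # Non-short-circuit branch: pass the right operand through,
        # unwrapping if the breaker itself was passed in
        return self.value if other is self else other


def circuit_else(lhs, rhs_thunk):
    """Hypothetical emulation of ``LHS else RHS`` (RHS given as a callable)."""
    if lhs:
        return type(lhs).__then__(lhs)
    return type(lhs).__else__(lhs, rhs_thunk())


calls = []


def fallback():
    # Record each evaluation so the short circuit is observable
    calls.append("evaluated")
    return "default"


print(circuit_else(exists("value"), fallback))  # right operand is skipped
print(circuit_else(exists(None), fallback))     # right operand is evaluated
```

Only the second call reaches ``fallback``, which is the "left operand decides" behaviour the expansion describes. The left operand is evaluated once because it is bound to the ``lhs`` parameter, mirroring the hidden ``_lhs`` assignment.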
As part of the short-circuiting behaviour, interpreter implementations are expected to access only the protocol method needed for the branch that is actually executed, but it is still recommended that circuit breaker authors that always return ``True`` or always return ``False`` from ``__bool__`` explicitly raise ``NotImplementedError`` with a suitable message from branch methods that are never expected to be executed (see the comparison chaining use case in the Rationale section below for an example of that).

It is proposed that the ``else`` operator use a new precedence level that binds less tightly than the ``or`` operator by adjusting the relevant line in Python's grammar from the current::

    test: or_test ['if' or_test 'else' test] | lambdef

to instead be::

    test: else_test ['if' or_test 'else' test] | lambdef
    else_test: or_test ['else' test]

The definition of ``test_nocond`` would remain unchanged, so circuit breaking expressions would require parentheses when used in the ``if`` clause of comprehensions and generator expressions just as conditional expressions themselves do.

This grammar definition means precedence/associativity in the otherwise ambiguous case of ``expr1 if cond else expr2 else expr3`` resolves as ``(expr1 if cond else expr2) else expr3``. A guideline will also be added to PEP 8 to say "don't do that", as such a construct will be inherently confusing for readers, regardless of how the interpreter executes it.

Overloading logical inversion (``not``)
---------------------------------------

Any circuit breaker definition will have a logical inverse that is still a circuit breaker, but inverts the answer as to whether or not to short circuit the expression evaluation. For example, the ``exists`` and ``missing`` circuit breakers proposed in this PEP are each other's logical inverse.
A new protocol method, ``__not__(self)``, will be introduced to permit circuit breakers and other types to override ``not`` expressions to return their logical inverse rather than a coerced boolean result.

To preserve the semantics of existing language optimisations, ``__not__`` implementations will be obliged to respect the following invariant::

    assert not bool(obj) == bool(not obj)

Chained comparisons
-------------------

A chained comparison like ``0 < x < 10`` written as::

    LEFT_BOUND left_op EXPR right_op RIGHT_BOUND

is currently roughly semantically equivalent to::

    _expr = EXPR
    _lhs_result = LEFT_BOUND left_op _expr
    _expr_result = _lhs_result and (_expr right_op RIGHT_BOUND)

This PEP proposes that this be changed to explicitly check if the left comparison returns a circuit breaker, and if so, use ``else`` rather than ``and`` to implement the comparison chaining::

    _expr = EXPR
    _lhs_result = LEFT_BOUND left_op _expr
    if hasattr(type(_lhs_result), "__then__"):
        _expr_result = _lhs_result else (_expr right_op RIGHT_BOUND)
    else:
        _expr_result = _lhs_result and (_expr right_op RIGHT_BOUND)

This allows types like NumPy arrays to control the behaviour of chained comparisons by returning circuit breakers from comparison operations.

Existence checking comparisons
------------------------------

Two new builtins implementing the new protocol are proposed to encapsulate the notion of "existence checking": seeing if a value is ``None`` and either falling back to an alternative value (an operation known as "None-coalescing") or passing it through as the result of the overall expression (an operation known as "None-severing" or "None-propagating").
These builtins would be defined as follows::

    class CircuitBreaker:
        """Base circuit breaker type (available as types.CircuitBreaker)"""
        def __init__(self, value, condition, inverse_type):
            self.value = value
            self._condition = condition
            self._inverse_type = inverse_type
        def __bool__(self):
            return self._condition
        def __not__(self):
            return self._inverse_type(self.value)
        def __then__(self):
            return self.value
        def __else__(self, other):
            if other is self:
                return self.value
            return other

    class exists(types.CircuitBreaker):
        """Circuit breaker for 'EXPR is not None' checks"""
        def __init__(self, value):
            super().__init__(value, value is not None, missing)

    class missing(types.CircuitBreaker):
        """Circuit breaker for 'EXPR is None' checks"""
        def __init__(self, value):
            super().__init__(value, value is None, exists)

Aside from changing the definition of ``__bool__`` to be based on ``is not None`` rather than normal truth checking, the key characteristic of ``exists`` is that when it is used as a circuit breaker, it is *ephemeral*: when it is told that short circuiting has taken place, it returns the original value, rather than the existence checking wrapper.

``missing`` is defined as the logically inverted counterpart of ``exists``: ``not exists(obj)`` is semantically equivalent to ``missing(obj)``.

The ``__else__`` implementations for both builtin circuit breakers are defined such that the wrapper will always be removed even if you explicitly pass the circuit breaker to both sides of the ``else`` expression::

    breaker = exists(foo)
    assert (breaker else breaker) is foo
    breaker = missing(foo)
    assert (breaker else breaker) is foo

Other conditional constructs
----------------------------

No changes are proposed to if statements, while statements, conditional expressions, comprehensions, or generator expressions, as the boolean clauses they contain are already used for control flow purposes.
However, it's worth noting that while such proposals are outside the scope of this PEP, the circuit breaking protocol defined here would be sufficient to support constructs like:: while exists(dynamic_query()) as result: ... # Code using result and: if exists(re.search(pattern, text)) as match: ... # Code using match Leaving the door open to such a future extension is the main reason for recommending that circuit breaker implementations handle the ``self is other`` case in ``__else__`` implementations the same way as they handle the short-circuiting behaviour in ``__then__``. Style guide recommendations --------------------------- The following additions to PEP 8 are proposed in relation to the new features introduced by this PEP: * In the absence of other considerations, prefer the use of the builtin circuit breakers ``exists`` and ``missing`` over the corresponding conditional expressions * Do not combine conditional expressions (``if-else``) and circuit breaking expressions (the ``else`` operator) in a single expression - use one or the other depending on the situation, but not both. Rationale ========= Adding a new operator --------------------- Similar to PEP 335, early drafts of this PEP focused on making the existing ``and`` and ``or`` operators less rigid in their interpretation, rather than proposing new operators. However, this proved to be problematic for a few reasons: * defining a shared protocol for both ``and`` and ``or`` was confusing, as ``__then__`` was the short-circuiting outcome for ``or``, while ``__else__`` was the short-circuiting outcome for ``and`` * the ``and`` and ``or`` operators have a long established and stable meaning, so readers would inevitably be surprised if their meaning now became dependent on the type of the left operand. 
Even new users would be confused by this change due to 25+ years of teaching material that assumes the current well-known semantics for these operators
* Python interpreter implementations, including CPython, have taken advantage of the existing semantics of ``and`` and ``or`` when defining runtime and compile time optimisations, which would all need to be reviewed and potentially discarded if the semantics of those operations changed

Proposing a single new operator instead resolves all of those issues - ``__then__`` always indicates short circuiting, ``__else__`` only indicates "short circuiting" if the circuit breaker itself is also passed in as the right operand, and the semantics of ``and`` and ``or`` remain entirely unchanged. While the semantics of the unary ``not`` operator do change, the invariant required of ``__not__`` implementations means that existing expression optimisations in boolean contexts will remain valid.

As a result of that design simplification, the new protocol and operator would even allow us to expose ``operator.true`` and ``operator.false`` as circuit breaker definitions if we chose to do so::

    class true(types.CircuitBreaker):
        """Circuit breaker for 'bool(EXPR)' checks"""
        def __init__(self, value):
            super().__init__(value, bool(value), false)

    class false(types.CircuitBreaker):
        """Circuit breaker for 'not bool(EXPR)' checks"""
        def __init__(self, value):
            super().__init__(value, not bool(value), true)

Given those circuit breakers:

* ``LHS or RHS`` would be roughly ``operator.true(LHS) else RHS``
* ``LHS and RHS`` would be roughly ``operator.false(LHS) else RHS``

Naming the operator and protocol
--------------------------------

The names "circuit breaking operator", "circuit breaking protocol" and "circuit breaker" are all inspired by the phrase "short circuiting operator": the general language design term for operators that only conditionally evaluate their right operand.
The electrical analogy is that circuit breakers in Python detect and handle short circuits in expressions before they trigger any exceptions similar to the way that circuit breakers detect and handle short circuits in electrical systems before they damage any equipment or harm any humans. The Python level analogy is that just as a ``break`` statement lets you terminate a loop before it reaches its natural conclusion, a circuit breaking expression lets you terminate evaluation of the expression and produce a result immediately. Using an existing keyword ------------------------- Using an existing keyword has the benefit of allowing the new expression to be introduced without a ``__future__`` statement. ``else`` is semantically appropriate for the proposed new protocol, and the only syntactic ambiguity introduced arises when the new operator is combined with the explicit ``if-else`` conditional expression syntax. Element-wise chained comparisons -------------------------------- In ultimately rejecting PEP 335, Guido van Rossum noted [1_]: The NumPy folks brought up a somewhat separate issue: for them, the most common use case is chained comparisons (e.g. A < B < C). To understand this observation, we first need to look at how comparisons work with NumPy arrays:: >>> import numpy as np >>> increasing = np.arange(5) >>> increasing array([0, 1, 2, 3, 4]) >>> decreasing = np.arange(4, -1, -1) >>> decreasing array([4, 3, 2, 1, 0]) >>> increasing < decreasing array([ True, True, False, False, False], dtype=bool) Here we see that NumPy array comparisons are element-wise by default, comparing each element in the lefthand array to the corresponding element in the righthand array, and producing a matrix of boolean results. 
If either side of the comparison is a scalar value, then it is broadcast across the array and compared to each individual element:: >>> 0 < increasing array([False, True, True, True, True], dtype=bool) >>> increasing < 4 array([ True, True, True, True, False], dtype=bool) However, this broadcasting idiom breaks down if we attempt to use chained comparisons:: >>> 0 < increasing < 4 Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() The problem is that internally, Python implicitly expands this chained comparison into the form:: >>> 0 < increasing and increasing < 4 Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() And NumPy only permits implicit coercion to a boolean value for single-element arrays where ``a.any()`` and ``a.all()`` can be assured of having the same result:: >>> np.array([False]) and np.array([False]) array([False], dtype=bool) >>> np.array([False]) and np.array([True]) array([False], dtype=bool) >>> np.array([True]) and np.array([False]) array([False], dtype=bool) >>> np.array([True]) and np.array([True]) array([ True], dtype=bool) The proposal in this PEP would allow this situation to be changed by updating the definition of element-wise comparison operations in NumPy to return a dedicated subclass that implements the new circuit breaking protocol and also changes the result array's interpretation in a boolean context to always return ``False`` and hence never trigger the short-circuiting behaviour:: class ComparisonResultArray(np.ndarray): def __bool__(self): return False def _raise_NotImplementedError(self): msg = ("Comparison array truth values are ambiguous outside " "chained comparisons. 
Use a.any() or a.all()") raise NotImplementedError(msg) def __not__(self): self._raise_NotImplementedError() def __then__(self): self._raise_NotImplementedError() def __else__(self, other): return np.logical_and(self, other.view(ComparisonResultArray)) With this change, the chained comparison example above would be able to return:: >>> 0 < increasing < 4 ComparisonResultArray([ False, True, True, True, False], dtype=bool) Existence checking expressions ------------------------------ An increasingly common requirement in modern software development is the need to work with "semi-structured data": data where the structure of the data is known in advance, but pieces of it may be missing at runtime, and the software manipulating that data is expected to degrade gracefully (e.g. by omitting results that depend on the missing data) rather than failing outright. Some particularly common cases where this issue arises are: * handling optional application configuration settings and function parameters * handling external service failures in distributed systems * handling data sets that include some partial records At the moment, writing such software in Python can be genuinely awkward, as your code ends up littered with expressions like: * ``value1 = expr1.field.of.interest if expr1 is not None else None`` * ``value2 = expr2["field"]["of"]["interest"] if expr2 is not None else None`` * ``value3 = expr3 if expr3 is not None else expr4 if expr4 is not None else expr5`` PEP 531 goes into more detail on some of the challenges of working with this kind of data, particularly in data transformation pipelines where dealing with potentially missing content is the norm rather than the exception. 
The combined impact of the proposals in this PEP is to allow the above sample expressions to instead be written as:

* ``value1 = missing(expr1) else expr1.field.of.interest``
* ``value2 = missing(expr2) else expr2["field"]["of"]["interest"]``
* ``value3 = exists(expr3) else exists(expr4) else expr5``

In these forms, significantly more of the text presented to the reader is immediately relevant to the question "What does this code do?", while the boilerplate code to handle missing data by passing it through to the output or falling back to an alternative input has shrunk to two uses of the new ``missing`` builtin, and two uses of the new ``exists`` builtin.

In the first two examples, the 31 character boilerplate suffix ``if exprN is not None else None`` (minimally 27 characters for a single letter variable name) has been replaced by a 19 character ``missing(expr1) else`` prefix (minimally 15 characters with a single letter variable name), markedly improving the signal-to-pattern-noise ratio of the lines (especially if it encourages the use of more meaningful variable and field names rather than making them shorter purely for the sake of expression brevity). The additional syntactic sugar proposals in PEP 505 would further reduce this boilerplate to a single ``?`` character that also eliminates the repetition of the expression being checked for existence.

In the last example, not only are two instances of the 21 character boilerplate, `` if exprN is not None`` (minimally 17 characters) replaced with the 8 character function call ``exists()``, but that function call is placed directly around the original expression, eliminating the need to duplicate it in the conditional existence check.

Risks and concerns
==================

This PEP has been designed specifically to address the risks and concerns raised when discussing PEPs 335, 505 and 531.
* it defines a new operator and adjusts the definition of chained comparison rather than impacting the existing ``and`` and ``or`` operators * the changes to the ``not`` unary operator are defined in such a way that control flow optimisations based on the existing semantics remain valid * rather than the cryptic ``??``, it uses ``else`` as the operator keyword in exactly the same sense as it is already used in conditional expressions * it defines a general purpose short-circuiting binary operator that can even be used to express the existing semantics of ``and`` and ``or`` rather than focusing solely and inflexibly on existence checking * it names the proposed builtins in such a way that they provide a strong mnemonic hint as to when the expression containing them will short-circuit and skip evaluating the right operand Possible confusion with conditional expressions ----------------------------------------------- The proposal in this PEP is essentially for an "implied ``if``" where if you omit the ``if`` clause from a conditional expression, you invoke the circuit breaking protocol instead. That is:: exists(foo) else calculate_default() invokes the new protocol, but:: foo.field.of.interest if exists(foo) else calculate_default() bypasses it entirely, *including* the non-short-circuiting ``__else__`` method. This mostly wouldn't be a problem for the proposed ``types.CircuitBreaker`` implementation (and hence the ``exists`` and ``missing`` builtins), as the only purpose the extended protocol serves in that case is to remove the wrapper in the short-circuiting case - the ``__else__`` method passes the right operand through unchanged. However, this discrepancy could potentially be eliminated entirely by also updating conditional expressions to use the circuit breaking protocol if the condition defines those methods. 
In that case, ``__then__`` would need to be updated to accept the left operand as a parameter, with short-circuiting indicated by passing in the circuit breaker itself:: class CircuitBreaker: """Base circuit breaker type (available as types.CircuitBreaker)""" def __init__(self, value, condition, inverse_type): self.value = value self._condition = condition self._inverse_type = inverse_type def __bool__(self): return self._condition def __not__(self): return self._inverse_type(self.value) def __then__(self, other): if other is not self: return other return self.value # Short-circuit, remove the wrapper def __else__(self, other): if other is not self: return other return self.value # Short-circuit, remove the wrapper With this symmetric protocol, the definition of conditional expressions could be updated to also make the ``else`` clause optional:: test: else_test ['if' or_test ['else' test]] | lambdef else_test: or_test ['else' test] (We would avoid the apparent simplification to ``else_test ('if' else_test)*`` in order to make it easier to correctly preserve the semantics of normal conditional expressions) Given that expanded definition, the following statements would be functionally equivalent:: foo = calculate_default() if missing(foo) foo = calculate_default() if foo is None else foo Just as the base proposal already makes the following equivalent:: foo = exists(foo) else calculate_default() foo = foo if foo is not None else calculate_default() The ``if`` based circuit breaker form has the virtue of reading significantly better when used for conditional imperative commands like debug messages:: print(some_expensive_query()) if verbosity > 2 If we went down this path, then ``operator.true`` would need to be declared as the nominal implicit circuit breaker when the condition didn't define the circuit breaker protocol itself (so the above example would produce ``None`` if the debugging message was printed, and ``False`` otherwise) The main objection to this 
expansion of the proposal is that it makes it a more intrusive change that may potentially affect the behaviour of existing code, while the main point in its favour is that allowing both ``if`` and ``else`` as circuit breaking operators and also supporting the circuit breaking protocol for normal conditional expressions would be significantly more self-consistent than special-casing a bare ``else`` as a stand-alone operator.

Design Discussion
=================

Arbitrary sentinel objects
--------------------------

Unlike PEPs 505 and 531, this proposal readily handles custom sentinel objects::

    class defined(types.CircuitBreaker):
        MISSING = object()
        def __init__(self, value):
            super().__init__(value, value is not self.MISSING, undefined)

    class undefined(types.CircuitBreaker):
        def __init__(self, value):
            super().__init__(value, value is defined.MISSING, defined)

    # Using the sentinel to check whether or not an argument was supplied
    def my_func(arg=defined.MISSING):
        arg = defined(arg) else calculate_default()

Implementation
==============

As with PEP 505, actual implementation has been deferred pending in-principle interest in the idea of making these changes. Aside from the possible syntactic ambiguity concerns covered by the grammar proposals above, the implementation isn't really the hard part of these proposals; the hard part is deciding whether or not this is a change where the long term benefits for new and existing Python users outweigh the short term costs involved in the wider ecosystem (including developers of other implementations, language curriculum developers, and authors of other Python related educational material) adjusting to the change.

...TBD...

Acknowledgements
================

Thanks go to Mark E. Haase for feedback on and contributions to earlier drafts of this proposal.
However, his clear and exhaustive explanation of the original protocol design that modified the semantics of ``if-else`` conditional expressions to use an underlying ``__then__``/``__else__`` protocol helped convince me it was too complicated to keep, so this iteration contains neither that version of the protocol, nor Mark's explanation of it. References ========== .. [1] PEP 335 rejection notification (http://mail.python.org/pipermail/python-dev/2012-March/117510.html) Copyright ========= This document has been placed in the public domain under the terms of the CC0 1.0 license: https://creativecommons.org/publicdomain/zero/1.0/ -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Thanks, Nick. This PEP is far easier to read than the previous one. On 05.11.2016 10:50, Nick Coghlan wrote:
The "else" keyword I proposed more than a year ago finally lands in a PEP. So, I am +1 on this. ;-)
This is why I like this PEP most.
If you are referring to "missing" and to "exists", I tend to dislike them, as production code uses those names very, very often. I think that the mnemonic hint comes from the "else" keyword, not from the proposed builtins. Furthermore, I still don't find "missing(expr1) else expr1.field" very intuitive to read. Regarding the ?. and ?[] syntactic sugar definitions, they seem very well explained. Unfortunately, having a look at our production code (configs etc.) specifically, it wouldn't help so much, as we usually check for attribute existence, not obj existence. So, ?. and ?[] defined in the proposed form might help 50% of the use-cases but would disappoint the other 50%. Cheers, Sven

On Sat, Nov 05, 2016 at 07:50:44PM +1000, Nick Coghlan wrote:
This feels much more promising to me, but I'm still not quite convinced, possibly because it's a very complicated proposal. I've had to read it multiple times to really understand it. I wonder whether part of the difficulty is the size of the proposal. Perhaps this should be split into two PEPs: one to describe the protocol alone, and a second to propose built-ins and new syntax that takes advantage of the protocol. That might help keep this proposal in digestible pieces. (By the way, I really like the "circuit breaker" name.)
These built-in names are quite problematic. I've seen people on Reddit and elsewhere who understood this as a form of "is this variable defined?" which is badly wrong. That is, they understood:

    exists(foo) else bar

as being equivalent to:

    try:
        return foo
    except NameError:
        # variable foo doesn't exist/isn't defined
        return bar

So I think that is going to be an area of confusion. I'm not really keen on these names for the proposed built-ins. The names are much too generic for what they actually do, which is test for None, not just some generic idea of "existence". I don't really have a better solution if your aim is to dispense with the ?? operator. Some (bad) ideas: isnone, is_none, nullity. None of these are as compact as the ?? operator. On the other hand, if we keep the ?? operator, and implicitly define it as:

    lhs_expression ?? rhs_expression => exists(lhs_expression) else rhs_expression

then there's no need for the user to explicitly write out exists() in their code, and the name can be as precise and as long as we need:

    # ?? operator
    types.NoneCoalescingType(lhs_expression) else rhs_expression
My first reaction on reading that was to wonder if you had written the first two terms backwards. My reading was that *any* truthy value would trigger the __then__ method call, and any falsey value the __else__ call, which would make this (almost) just another spelling of the existing `or` operator. My thought was that people could write: None else 2 which would return NoneType.__else__(None, 2), which I presumed would be 2. But I now think that's wrong. I thought that the intention was to give `object` default __then__ and __else__ methods, and over-ride them as needed, but I now think that's wrong. I think that should be clarified early in the PEP, as I wasted a lot of time thinking about this wrongly. So I now think that most objects will not define __then__ and __else__, and consequently code like `None else 2` will raise an error. Is that right? I now think that only the builtins `exists` and `missing`, and the CircuitBreaker actually need __then__ and __else__ methods. (Plus of course any user-defined types.) [...]
I take it that your intention is to avoid the ?? None-coalescing operator. Your exists() version risks being misunderstood as testing for the existence of the expression/name (e.g. catching NameError, LookupError), and it's still quite verbose compared to the ?? operator but without being any more explicit in what it is doing.

    # your exists() proposal
    value = exists(expr1) else exists(expr2) else expr3

    # ?? operator
    value = expr1 ?? expr2 ?? expr3

[...]
So how will this affect the existing semantics of `not`?

Currently, I think `not obj` is equivalent to `not bool(obj)`, where bool calls the __bool__ method, or __len__ if that's not defined, and falls back to True. I think your proposal means that this will change to something close to this:

    __not__ = getattr(type(obj), '__not__', None)
    if __not__ is not None:
        x = __not__(obj)
        if x is not NotImplemented:
            return x
    flag = bool(obj)
    return False if flag else True

That means that now any object, not just Circuit Breakers, can customise how they respond to `not`.

[...]
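The dispatch described above can be exercised as an ordinary function. This is a sketch of that reading only, not the PEP's specification: `proposed_not` and the two example classes are hypothetical names, and `__not__` is not a hook that current Python recognises.

```python
def proposed_not(obj):
    """Emulate the sketched ``not obj`` dispatch with an optional __not__ hook."""
    not_method = getattr(type(obj), "__not__", None)
    if not_method is not None:
        result = not_method(obj)
        if result is not NotImplemented:
            return result
    # Fallback: today's behaviour, based on the object's truth value.
    return False if bool(obj) else True


class Flag:
    """Hypothetical type whose negation is another Flag, not a bool."""

    def __init__(self, state):
        self.state = state

    def __bool__(self):
        return self.state

    def __not__(self):
        return Flag(not self.state)


class Declines:
    """Hypothetical type whose hook defers to the default behaviour."""

    def __not__(self):
        return NotImplemented


print(proposed_not([]))                 # plain list: falls back to bool()
print(type(proposed_not(Flag(True))))   # custom hook keeps the Flag type
print(proposed_not(Declines()))         # NotImplemented: falls back to bool()
```

Objects without the hook behave exactly as they do now, which is why the change only matters for types that opt in.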
That's a point in its favour.
> * the changes to the ``not`` unary operator are defined in such a way that control flow optimisations based on the existing semantics remain valid
I think the changes to `not` are neutral.
> * rather than the cryptic ``??``, it uses ``else`` as the operator keyword in exactly the same sense as it is already used in conditional expressions
I don't think that ?? is really cryptic. Like any symbol, it has to be learned, but ?? is used in a number of other major languages. It's also obviously related to the ?. and ?[] sugar, so once people learn that they are related to None testing they will have a strong clue that ?? is too.

I completely accept that it is not self-explanatory. That's the cost of symbols: they have to be learned, because they aren't self-explanatory in the way that well-named functions may be. But many things need to be learned, including such fundamental necessary (pseudo-)operators as . for attribute lookup and [] for item lookup, as well as maths operators like ** etc.

To fairly label a symbol as "cryptic", I would expect its effects to be mysterious, hard to understand or complicated. "await" and "async" are cryptic because to understand them, you have to grok asynchronous programming, and that's hard. Nevertheless, they're important enough that they deserve to be syntax, hard to understand or not.

But ?? is not hard to understand. It has a very simple explanation:

    expr1 ?? expr2

is just equivalent to

    _tmp = expr1
    _tmp if _tmp is not None else expr2

except it doesn't create a temporary variable.
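The equivalence above can be checked with a small helper in current Python. A minimal sketch, assuming a hypothetical `coalesce` function that takes zero-argument callables so that later operands stay unevaluated, as they would with a real ?? operator:

```python
def coalesce(*thunks):
    """Return the first non-None result, evaluating thunks left to right."""
    result = None
    for thunk in thunks:
        result = thunk()
        if result is not None:
            # Stop here; later thunks are never called.
            return result
    return result


# Emulates: None ?? 0 ?? (1 / 0). The division is never evaluated.
print(coalesce(lambda: None, lambda: 0, lambda: 1 / 0))

# Contrast with `or`, which tests truthiness and so skips the falsey 0.
print(None or 0 or 5)
```

The contrast with `or` is the point of the None test: a legitimate falsey value such as 0 or an empty string survives coalescing.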
Another point in its favour.
This is my biggest problem with your proposal. I just don't think this part is correct. I think that the exists() and missing() builtins are the weakest part of the proposal:

- exists() can be easily misunderstood as testing for NameError or LookupError, rather than whether the value is None;

- likewise for missing(), in reverse.

The problem here is that exists() *seems* to be so self-evident and obvious that there's no need to read the documentation in any detail. It's actually misleading -- while it is true that `is None` is a special case of existence checking, for applications that treat None as a missing value, that's not what most people expect "existence" to mean, and I'm not convinced that those coming from languages with a ?? operator think of it as existence checking either. That's why they call it Null Coalescing rather than Existence Checking.

Whereas ?? is just unfamiliar enough to keep the user on their toes and discourage them from making assumptions. If they know the ?? operator from another language, there won't be any big surprises for them. If they don't, they should find it documented as both the actual implementation in terms of the `else` circuit breaker (for experts!) and as a conceptually simpler form in terms of if...else.

* * *

Whew! Nick, this is a big, complex PEP, and thank you for taking the time to write it. But it's also hard to understand -- there's a lot of detail, and there are places where it is easy for the reader to get misled by their assumptions and only get corrected deep, deep into the PEP. At least that's my experience.

I think I'd find this PEP easier to understand if it were split into two, like Guido's type-hints PEPs: one to explain the protocol, and one to detail the new built-ins and syntactic sugar which rely on the protocol. Or maybe that's just me.

I really like this PEP as a way of introducing configurable short-circuiting behaviour, without messing with `and` and `or`. That's really, really nice.
I like your decision to keep the ?. and ?[] sugar, as short-cuts for code based on this Circuit Breaker protocol. But I cannot support the exists() and missing() builtins as they stand. I think the circuit breakers themselves work well as a concept, but:

- I think that the names will be more harmful than helpful;

- I don't think that having to explicitly call a circuit breaker is a good substitute for the ?? operator.

If I absolutely had to choose between this and nothing, I'd say +1 for this. But if I had to choose between ?? as an operator, and a generic circuit breaking protocol with no operator but an exists() builtin instead, well, that would be a really hard decision.

-- Steve

On 9 November 2016 at 02:15, Steven D'Aprano <steve@pearwood.info> wrote:
> Whew! Nick, this is a big, complex PEP, and thank you for taking the time to write it.
Thanks :)
This is another reason why I think the symmetric proposal I mentioned in the "Risks & Concerns" section may actually be a better idea, since it aligns better with the full "LHS if COND else RHS" spelling. It does make the short-circuiting more complex to explain, but if I went back to that I'd also reinstate Mark's explanation of how the short-circuiting would work and the associated diagram.
It's hard to explain the protocol without concrete examples to draw on, and `operator.true` and `operator.false` aren't really sufficient for that purpose (since they're more likely to elicit a reaction of "But that's the way conditional expressions work anyway...").
I did consider the possible misinterpretation in terms of NameError, but probably didn't give it sufficient credence (since it isn't going to be immediately obvious to everyone that there's no way to readily delegate that check to a callable of any kind in Python, especially when other languages do offer that kind of "undefined" check).
> - I don't think that having to explicitly call a circuit breaker is a good substitute for the ?? operator.
I think the "line signal to noise ratio" measures in the examples do a pretty decent job of highlighting that - the patterns involved are still quite verbose when using a named circuit breaker. In the next revision, I'll update the PEP to say it doesn't compete with PEP 505 at all, and merely offers a proposal for making that PEP a semantically protocol-based one.
One of the reasons I was keen to get this written relatively quickly (aside from wanting to see for myself whether or not it actually hung together as a coherent design concept) was so we'd have plenty of time to consider the risks, opportunities and alternatives between now and 3.7.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

I like PEP-532. Given the opposition to non-Pythonic syntax like ?? (i.e. PEP-505), Nick's proposal offers a Pythonic alternative that is protocol based, more generalized, and uses built-ins and keywords to avoid punctuation.

I agree with other posters that the terms "exists" and "missing" could lead developers to think that it tests for NameError. Maybe "value(foo) else bar"? I can't think of a better spelling for the inverse. Maybe the "if" syntax described in the PEP is better: "foo.bar if value(foo)". In that case, we wouldn't need an inverse to exists()/value()/whatever.

I also wanted to mention a couple of benefits of this PEP that haven't come up yet:

1. It is easier to Google a name. E.g., Google "c# ??" and you'll get nothing related to null coalescing in C#. ("C# question marks" does find the right content, however.)

2. The meaning of foo?.bar.baz, which has caused some dismay, is much clearer when expressed as missing(foo) else foo.bar.baz -- it's very close to the ternary logic you'd write if you didn't have a circuit breaking operator: None if foo is None else foo.bar.baz.

On Sat, Nov 5, 2016 at 5:50 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

On 11/12/16, Mark E. Haase <mehaase@gmail.com> wrote:
Python has a nice (*) help system and would have help('??')... (where we could get better keywords to search the web with, for example "None coalescing operator")

(*) although it could be better:

1. for example, help('**') shows nothing about unpacking, and help('+') shows info about operator precedence and not about +.

2. there is no info about dunder methods

...

Thanks, Nick. This PEP is far easier to read than the previous one.

On 05.11.2016 10:50, Nick Coghlan wrote:
The "else" keyword I proposed more than a year ago finally lands in a PEP. So, I am +1 on this. ;-)
This is why I like this PEP most.
If you are referring to "missing" and to "exists", I tend to dislike them, as production code uses those names very, very often. I think that the mnemonic hint comes from the "else" keyword, not from the proposed builtins. Furthermore, I still don't find "missing(expr1) else expr1.field" very intuitive to read.

Regarding the ?. and ?[] syntactic sugar definitions, they seem very well explained. Unfortunately, having a look at our production code (configs etc.) specifically, it wouldn't help so much, as we usually check for attribute existence, not obj existence. So, ?. and ?[] defined in the proposed form might help 50% of the use-cases but would disappoint the other 50%.

Cheers,
Sven
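The difference between object existence and attribute existence can be shown in current Python. A minimal sketch, assuming a hypothetical `config` object and helper names: the first helper mirrors the None check that `?.` is described as performing in this thread, the second the attribute check described above.

```python
from types import SimpleNamespace

# Hypothetical config object that has a host but no "port" attribute.
config = SimpleNamespace(host="localhost")


def port_if_config(cfg):
    """Object check: guards against cfg being None, nothing more."""
    return None if cfg is None else cfg.port


def port_if_present(cfg):
    """Attribute check: guards against the attribute being absent."""
    return getattr(cfg, "port", None)


print(port_if_config(None))      # the None guard applies
print(port_if_present(config))   # missing attribute handled via getattr

try:
    port_if_config(config)       # object exists, attribute does not
except AttributeError as exc:
    print("AttributeError:", exc)
```

A None check alone does nothing for the second case, which is the half of the use-cases the proposed sugar would leave uncovered.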

On Sat, Nov 05, 2016 at 07:50:44PM +1000, Nick Coghlan wrote:
This feels much more promising to me, but I'm still not quite convinced, possibly because it's a very complicated proposal. I've had to read it multiple times to really understand it. I wonder whether part of the difficulty is the size of the proposal. Perhaps this should be split into two PEPs: one to describe the protocol alone, and a second to propose built-ins and new syntax that takes advantage of the protocol. That might help keep this proposal in digestible pieces. (By the way, I really like the "circuit breaker" name.)
participants (5)

- Mark E. Haase
- Nick Coghlan
- Pavol Lisy
- Steven D'Aprano
- Sven R. Kunze