On 15 October 2016 at 13:36, Guido van Rossum firstname.lastname@example.org wrote:
> I'm not usually swayed by surveys -- Python is not a democracy. Maybe a bunch of longer examples would help all of us see the advantages of the proposals.
Having been previously somewhere between -1 and -0, I've been doing a lot more data mining and analysis work lately, which has been enough to shift me to at least +0 and potentially even higher when it comes to the utility of adding these operators (more on that below).
= Pragmatic aspects =
Regarding the spelling details, my current preferences are as follows:
* None-coalescing operator: x ?or y
* None-severing operator: x ?and y
* None-coalescing augmented assignment: x ?= y
* None-severing attribute access: x?.attr
* None-severing subscript lookup: x?[expr]
(The PEP currently only covers the "or?" and "and?" operator spelling suggestions, but the latter three suggestions are the same as those in the current PEP draft)
My rationale for this preference is that it means that "?" is consistently a pseudo-operator that accepts an expression on the left and another binary operator (from a carefully restricted subset) on the right, and the combination is a new short-circuiting binary operation based on "LHS is not None".
The last three operations can be defined in terms of the first two (with the usual benefit of avoiding repeated evaluation of the subexpression):
* None-coalescing augmented assignment: x = x ?or y
* None-severing attribute access: x ?and x.attr
* None-severing subscript lookup: x ?and x[expr]
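As a rough illustration of the "avoiding repeated evaluation" point, here is a sketch in current Python. The proposed "?." syntax doesn't exist, so this emulates it with a plain function; "load_record" and "sever_attr" are hypothetical names invented purely for this example.

```python
from types import SimpleNamespace

calls = []  # records each evaluation of the subexpression


def load_record():
    """Hypothetical lookup standing in for an expensive subexpression."""
    calls.append("load_record")
    return SimpleNamespace(name="example")


def sever_attr(value, attr):
    """Emulate the proposed `value?.attr`; `value` has already been evaluated once."""
    return getattr(value, attr) if value is not None else None


# Hand-written expansion: the subexpression is evaluated twice.
naive = load_record().name if load_record() is not None else None
naive_calls = len(calls)

calls.clear()

# Helper-based emulation: the subexpression is evaluated only once.
single = sever_attr(load_record(), "name")
single_calls = len(calls)
```

Both spellings produce the same result here, but the hand-written conditional calls "load_record" twice, which is the cost a dedicated operator would avoid.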
The first two can then be defined in terms of equivalent if/else statements containing an "x is not None" clause:
* None-coalescing operator: x if x is not None else y
* None-severing operator: y if x is not None else x
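Those two expansions can be tried out today with a pair of helper functions. This is only a sketch: "none_or" and "none_and" are hypothetical names, and the right-hand side is passed as a zero-argument callable (an assumption of this emulation) so that it is only evaluated when needed, mimicking the short-circuiting a real operator would have.

```python
def none_or(x, make_y):
    """Emulate the proposed `x ?or y`: x if x is not None else y."""
    return x if x is not None else make_y()


def none_and(x, make_y):
    """Emulate the proposed `x ?and y`: y if x is not None else x."""
    return make_y() if x is not None else x


port = none_or(None, lambda: 8080)            # falls back to the default
kept_zero = none_or(0, lambda: 8080)          # 0 is not None, so it is kept
skipped = none_and(None, lambda: 1 / 0)       # right-hand side never runs
length = none_and("abc", lambda: len("abc"))  # right-hand side is evaluated
```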
Importantly, the normal logical and/or can be expanded in terms of if/else in exactly the same way, only using "bool(x)" instead of "x is not None":
* Logical or: x if x else y
* Logical and: y if x else x
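The practical difference between the two kinds of check shows up with values that are falsy but not None. A minimal sketch, where "retries" and "label" are made-up configuration values for illustration:

```python
retries = 0   # an explicitly configured value that happens to be falsy
label = ""    # likewise: empty, but deliberately set

# Logical "or" tests truthiness, so the explicit values are discarded.
retries_via_or = retries or 3
label_via_or = label or "untitled"

# The "is not None" expansion keeps them, replacing only a genuine None.
retries_via_none_check = retries if retries is not None else 3
label_via_none_check = label if label is not None else "untitled"
```

With logical "or" the results are 3 and "untitled"; with the "is not None" check the original 0 and "" survive.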
= Language design philosophy aspects =
Something I think is missing from the current PEP is a high-level explanation of the *developer problem* that these operators solve. While the current PEP points to other languages as precedent, that just prompts the follow-on question "Well, why did *they* add them, and does their rationale also apply to Python?". Even the current motivating examples don't really cover this, as they're quite tactical in nature ("Here is how this particular code is improved by the proposed change"), rather than explaining the high-level user benefit ("What has changed in the surrounding technology environment that makes us think this is a user experience design problem worth changing the language definition to help address *now* even though Python has lived happily without these operators for 25+ years?").
With conditional expressions, we had the clear driver that folks were insisting on using (and teaching!) the "and/or" hack as a workaround, and introducing bugs into their code as a result, whereas we don't have anything that clear-cut for this proposal (using "or" for None-coalescing doesn't seem to be anywhere near as popular as "and/or" used to be as an if/else equivalent).
My point of view on that is that one of the biggest computing trends in recent years is the rise of "semi-structured data", where you're passing data around in either JSON-compatible data structures, or comparable structures mapped to instances and attributes, and all signs point to that being a permanent state change in the world of programming rather than merely being a passing fad. The world itself is fuzzy and ambiguous, and learning to work effectively with semi-structured data reflects that ambiguity better than forcing a false precision for the sake of code simplification. When you're working in that kind of context, encountering "None" is typically a shorthand for "This entire data subtree is missing, so don't try to do anything with it", but if it *isn't* None, you can safely assume that all the mandatory parts of that data segment will be present (no matter how deeply nested they are).
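To make the "missing subtree" idea concrete, here is a small sketch using the standard json module. The order records, their field names, and the "shipping_city" helper are all invented for illustration; the assumption is that "shipping" is an optional subtree whose "address" and "city" parts are mandatory whenever it is present.

```python
import json


def shipping_city(order):
    """Return the shipping city, or None if the optional subtree is absent."""
    shipping = order.get("shipping")  # optional subtree: may be missing or null
    if shipping is None:
        return None  # the whole subtree is missing, so don't touch it
    # Once the subtree exists, its mandatory parts are assumed to be present.
    return shipping["address"]["city"]


orders = json.loads("""
[
  {"id": 1, "shipping": {"address": {"city": "Brisbane"}}},
  {"id": 2, "shipping": null},
  {"id": 3}
]
""")

cities = [shipping_city(order) for order in orders]
```

Here "cities" ends up as ["Brisbane", None, None]. The explicit "is None" check in the helper is the kind of boilerplate the proposed operators are meant to shorten.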
To help explain that, it would be useful to mention not only the corresponding operators in other languages, but also the changes in data storage practices, like PostgreSQL's native support for JSON document storage and querying (https://www.postgresql.org/docs/9.4/static/functions-json.html) as well as the emergence/resurgence of hierarchical document storage techniques and new algorithms for working with them.
However, it's also the case that where we *do* have a well understood and nicely constrained problem, it's still better to complain loudly when data is unexpectedly missing, rather than subjecting ourselves to the pain of having to deal with detecting problems with our data far away from where we introduced those problems. A *lot* of software still falls into that category, especially custom software written to meet the needs of one particular organisation.
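A quick sketch of that trade-off, using a made-up record that is missing a field assumed to be mandatory ("owner" is a hypothetical name):

```python
record = {"id": 7}  # hypothetical record missing its mandatory "owner" field

# Strict access: the problem is reported right where the data is read.
try:
    owner = record["owner"]
except KeyError as exc:
    strict_error = f"missing mandatory field: {exc}"

# Tolerant access: None slips through and only fails later, away from the cause.
owner = record.get("owner")
try:
    owner.upper()
except AttributeError as exc:
    late_error = str(exc)
```

The strict version names the missing key directly, while the tolerant version only produces a complaint about NoneType at whatever point the value is eventually used.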
My current assumption is that those of us that now regularly need to deal with semi-structured data are thinking "Yes, these additions are obviously beneficial and improve Python's expressiveness, if we can find an acceptable spelling". Meanwhile, folks dealing primarily with entirely structured or entirely unstructured data are scratching their heads and asking "What's the big deal? How could it ever be worth introducing more line noise into the language just to make this kind of code easier to write?"
Even the PEP's title is arguably a problem on that front - "None-aware operators" is a proposed *solution* to the problem of making semi-structured data easier to work with in Python, and hence reads like a solution searching for a problem to folks that don't regularly encounter these issues themselves.
Framing the problem that way also provides a hint on how we could *document* these operations in the language reference in a readily comprehensible way: "Operators for working with semi-structured data"