[Python-ideas] Null coalescing operator

Nick Coghlan ncoghlan at gmail.com
Sat Oct 15 02:10:57 EDT 2016


On 15 October 2016 at 13:36, Guido van Rossum <guido at python.org> wrote:
> I'm not usually swayed by surveys -- Python is not a democracy. Maybe
> a bunch of longer examples would help all of us see the advantages of
> the proposals.

Having been previously somewhere between -1 and -0, I've been doing a
lot more data mining and analysis work lately, which has been enough
to shift me to at least +0 and potentially even higher when it comes
to the utility of adding these operators (more on that below).

= Pragmatic aspects =

Regarding the spelling details, my current preferences are as follows:

* None-coalescing operator: x ?or y
* None-severing operator: x ?and y
* None-coalescing augmented assignment: x ?= y
* None-severing attribute access: x?.attr
* None-severing subscript lookup: x?[expr]

(The PEP currently only covers the "or?" and "and?" operator spelling
suggestions, but the latter three suggestions are the same as those in
the current PEP draft)

My rationale for this preference is that it means that "?" is
consistently a pseudo-operator that accepts an expression on the left
and another binary operator (from a carefully restricted subset) on
the right, and the combination is a new short-circuiting binary
operation based on "LHS is not None".

The last three operations can be defined in terms of the first two
(with the usual benefit of avoiding repeated evaluation of the
subexpression):

* None-coalescing augmented assignment: x = x ?or y
* None-severing attribute access: x ?and x.attr
* None-severing subscript lookup: x ?and x[expr]

The first two can then be defined in terms of equivalent if/else
conditional expressions containing an "x is not None" clause:

* None-coalescing operator: x if x is not None else y
* None-severing operator: y if x is not None else x
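
As a rough sketch, those two expansions can be emulated in today's
Python. The helper names "coalesce" and "sever" are made up for this
illustration and are not part of the PEP. The right-hand side is passed
as a zero-argument callable so that it is only evaluated when needed,
which approximates the short-circuiting a real operator would have:

```python
def coalesce(x, make_y):
    """Emulate ``x ?or y``: keep x unless it is None."""
    return x if x is not None else make_y()


def sever(x, make_y):
    """Emulate ``x ?and y``: give back None if x is None, otherwise y."""
    return make_y() if x is not None else x


calls = []


def expensive_default():
    # Records each call so we can see when the right-hand side runs.
    calls.append("called")
    return "fallback"


# 0 is falsy but not None, so it is kept and the default is never computed.
result_present = coalesce(0, expensive_default)
# None triggers evaluation of the right-hand side.
result_missing = coalesce(None, expensive_default)

# A non-None left-hand side lets the right-hand side run.
severed_present = sever("abc", lambda: "abc".upper())
# A None left-hand side cuts the expression short.
severed_missing = sever(None, expensive_default)
```

Here "expensive_default" runs exactly once (for "result_missing"), since
the other three calls either keep the left-hand side or never reach it.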

Importantly, the normal logical and/or can be expanded in terms of
if/else in exactly the same way, only using "bool(x)" instead of "x is
not None":

* Logical or: x if x else y
* Logical and: y if x else x
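
The practical difference between the two families of expansions shows
up for values that are falsy but not None. A minimal sketch (the
function names are illustrative only, and both arguments are evaluated
eagerly here, unlike real operators):

```python
def logical_or(x, y):
    # Same result as ``x or y`` for already-evaluated operands.
    return x if x else y


def none_coalesce(x, y):
    # Expansion given above for the proposed ``x ?or y``.
    return x if x is not None else y


# A timeout of 0 is a legitimate value that logical "or" throws away.
timeout_or = logical_or(0, 30)
timeout_coalesce = none_coalesce(0, 30)

# The same holds for other falsy-but-present values.
for value in (0, "", [], False):
    assert logical_or(value, "default") == "default"
    assert none_coalesce(value, "default") is value

# Both agree when the left-hand side really is None.
assert logical_or(None, "default") == none_coalesce(None, "default")
```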

= Language design philosophy aspects =

Something I think is missing from the current PEP is a high level
explanation of the *developer problem* that these operators solve -
while the current PEP points to other languages as precedent, that
just prompts the follow-on question "Well, why did *they* add them,
and does their rationale also apply to Python?". Even the current
motivating examples don't really cover this, as they're quite tactical
in nature ("Here is how this particular code is improved by the
proposed change"), rather than explaining the high level user benefit
("What has changed in the surrounding technology environment that
makes us think this is a user experience design problem worth changing
the language definition to help address *now* even though Python has
lived happily without these operators for 25+ years?")

With conditional expressions, we had the clear driver that folks were
insisting on using (and teaching!) the "and/or" hack as a workaround,
and introducing bugs into their code as a result, whereas we don't
have anything that clear-cut for this proposal (using "or" for
None-coalescing doesn't seem to be anywhere near as popular as
"and/or" used to be as an if/else equivalent).

My point of view on that is that one of the biggest computing trends
in recent years is the rise of "semi-structured data", where you're
passing data around in either JSON-compatible data structures, or
comparable structures mapped to instances and attributes, and all
signs point to that being a permanent state change in the world of
programming rather than merely being a passing fad. The world itself
is fuzzy and ambiguous, and learning to work effectively with
semi-structured data reflects that ambiguity better than forcing a
false precision for the sake of code simplification. When
you're working in that kind of context, encountering "None" is
typically a shorthand for "This entire data subtree is missing, so
don't try to do anything with it", but if it *isn't* None, you can
safely assume that all the mandatory parts of that data segment will
be present (no matter how deeply nested they are).
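
To make that convention concrete, here is a small sketch of how it
looks in current Python. The records and field names are invented for
illustration: "address" may be None (the whole subtree is missing), but
when it is present, its nested "geo" and "lat" entries are assumed to
be mandatory:

```python
import json

# Hypothetical semi-structured input: the second record has no address.
records = json.loads("""
[
  {"name": "alice", "address": {"city": "Brisbane", "geo": {"lat": -27.47}}},
  {"name": "bob", "address": null}
]
""")


def latitude(record):
    """Return the latitude, or None if the address subtree is missing."""
    address = record["address"]
    if address is None:
        # The entire subtree is absent, so don't try to do anything with it.
        return None
    # The subtree exists, so its mandatory parts are looked up directly;
    # a KeyError here would indicate genuinely malformed data.
    return address["geo"]["lat"]


latitudes = [latitude(record) for record in records]
```

Only the one optional boundary is checked for None. The lookups inside
the subtree are left strict, so unexpectedly malformed data still fails
loudly.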

To help explain that, it would be useful to mention not only the
corresponding operators in other languages, but also the changes in
data storage practices, like PostgreSQL's native support for JSON
document storage and querying (
https://www.postgresql.org/docs/9.4/static/functions-json.html ) as
well as the emergence/resurgence of hierarchical document storage
techniques and new algorithms for working with them.

However, it's also the case that where we *do* have a well understood
and nicely constrained problem, it's still better to complain loudly
when data is unexpectedly missing, rather than subjecting ourselves to
the pain of having to deal with detecting problems with our data far
away from where we introduced those problems. A *lot* of software
still falls into that category, especially custom software written to
meet the needs of one particular organisation.

My current assumption is that those of us that now regularly need to
deal with semi-structured data are thinking "Yes, these additions are
obviously beneficial and improve Python's expressiveness, if we can
find an acceptable spelling". Meanwhile, folks dealing primarily with
entirely structured or entirely unstructured data are scratching their
heads and asking "What's the big deal? How could it ever be worth
introducing more line noise into the language just to make this kind
of code easier to write?"

Even the PEP's title is arguably a problem on that front - "None-aware
operators" is a proposed *solution* to the problem of making
semi-structured data easier to work with in Python, and hence reads
like a solution searching for a problem to folks that don't regularly
encounter these issues themselves.

Framing the problem that way also provides a hint on how we could
*document* these operations in the language reference in a readily
comprehensible way: "Operators for working with semi-structured data"

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

