On 15.10.2016 08:10, Nick Coghlan wrote:
However, it's also the case that where we *do* have a well understood and nicely constrained problem, it's still better to complain loudly when data is unexpectedly missing, rather than subjecting ourselves to the pain of having to deal with detecting problems with our data far away from where we introduced those problems. A *lot* of software still falls into that category, especially custom software written to meet the needs of one particular organisation.
Definitely true. Stricter rules are similar to "fail early", "errors should never pass silently" and the like. Python has conveyed this stance for as long as I have known it.
My current assumption is that those of us that now regularly need to deal with semi-structured data are thinking "Yes, these additions are obviously beneficial and improve Python's expressiveness, if we can find an acceptable spelling". Meanwhile, folks dealing primarily with entirely structured or entirely unstructured data are scratching their heads and asking "What's the big deal? How could it ever be worth introducing more line noise into the language just to make this kind of code easier to write?"
That's where I would like to see a common middle ground between those two sides of the table.
I have had to work with both sides for years now. In my experience, it's best to avoid semi-structured data altogether to keep the code simple. As we all know, and as you described, the world isn't perfect, and I can only agree. However, what has served us best in recent years is to keep the "semi-" out of the inner workings of our codebase. Handling the "semi-" at the system boundary has proved to be a reliable way of not breaking everything and of keeping our devs sane.
I am unsure how to implement such a solution, whether via PEP 8 or via the proposal's PEP. It somehow reminds me of the sans-IO idea, where the core logic should be simple, linear code and the difficult or problematic issues are solved at the system boundary.
That said, let me put it differently with an example. I can see None-aware operators being very useful in the outermost functions/methods of a process/library/class/module:
class FanzyTool:
    def __init__(self, option1=None, option2=None, ...):
        # what happens when option6 and option7 are None
        # and it only matters when option 3 is not None
        # but when ...
Internal functions/methods/modules/classes, and even processes/threads, should have clear, non-wishy-washy inputs and outputs (not least so that unit testing can be done at a relatively sane level).
    def _append_x(self, s):
        return s + 'x'  # strawman operation
Imagine that s is something important that gets passed around many times inside "FanzyTool". The whole process usually makes no sense at all when s is None, and having each internal method check for None gets messy fast.
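To make the split concrete, here is a minimal sketch of how I picture it. It is only an illustration under my own assumptions: passing s to the constructor, the meaning given to option1 (a repeat count) and the run() method are all made up for this example. The idea is that None is resolved exactly once in the outermost method, so the internal methods can rely on real values.

```python
class FanzyTool:
    def __init__(self, s=None, option1=None):
        # System boundary: decide here, once, what a missing value means.
        if s is None:
            # The whole process makes no sense without s, so fail early.
            raise ValueError("s must not be None")
        self._s = s
        # Hypothetical optional setting: fall back to a default when absent.
        self._repeat = 1 if option1 is None else option1

    def run(self):
        # Hypothetical public entry point; everything below never sees None.
        result = self._s
        for _ in range(self._repeat):
            result = self._append_x(result)
        return result

    def _append_x(self, s):
        # Internal method: no None check needed, s is guaranteed to be set.
        return s + 'x'  # strawman operation
```

With this layout, the None-aware operators would only ever be needed in __init__, while _append_x and its siblings stay linear and easy to unit-test.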
I hope we can also convey this issue properly when we find an appropriate syntax.
Even the PEP's title is arguably a problem on that front - "None-aware operators" is a proposed *solution* to the problem of making semi-structured data easier to work with in Python, and hence reads like a solution searching for a problem to folks that don't regularly encounter these issues themselves.
Framing the problem that way also provides a hint on how we could *document* these operations in the language reference in a readily comprehensible way: "Operators for working with semi-structured data"
That's indeed an extraordinarily good title, as it best describes what we intend these operators to be used for (common usage scenarios). +1