On Mon, Feb 03, 2020 at 05:26:38PM -0800, Sebastian Berg wrote:
- `==` has of course the logic `NaN == NaN -> False`
- `PyObject_RichCompareBool(a, b, Py_EQ)` was argued to have a useful logic of `a is b or a == b`. And I argued that you could define:
    def operator.identical(a, b):
        res = a is b or a == b
        assert type(res) is bool  # arrays have unclear logic
        return res
to "bless" it as its own desired logic when dealing with containers (mainly).
Note that Python arrays define equality similarly to other containers:
    py> from array import array
    py> array('i', [1, 2, 3]) == array('i', [2, 3, 1])
    False
It is numpy arrays which do something unusual with equality. (And I would argue that they are wrong to do so. But that ship has long sailed over the horizon.)
Only `identical` is actually always allowed to use the `is` shortcut.
You can't enforce that (and why would you want to?).
If I want to use an `is` shortcut in my `__eq__` methods, or write out the condition in full, who are you to say that's forbidden unless I call `identical`?
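To make that concrete, here is a minimal sketch of an `__eq__` that takes the `is` shortcut without going through any `identical` function. The `Point` class and its fields are hypothetical, made up purely for illustration:

```python
class Point:
    """Hypothetical class whose __eq__ takes an identity shortcut."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Identity shortcut: an object is always considered equal to itself,
        # even if one of its fields is a NaN.
        if self is other:
            return True
        if not isinstance(other, Point):
            return NotImplemented
        # Otherwise fall back to a field-by-field comparison.
        return self.x == other.x and self.y == other.y


p = Point(float("nan"), 0.0)
equal_to_itself = (p == p)  # shortcut applies
equal_to_lookalike = (p == Point(float("nan"), 0.0))  # NaN fields compare unequal
```

Here `equal_to_itself` is True only because of the shortcut, while `equal_to_lookalike` is False because the two NaN fields are compared with `==`.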
Now, for all practical purposes "identical" is maybe already correctly defined by `a is b or bool(a == b)` (NaN being the largest inconsistency, since NaN is not a singleton). Along that line, I could argue that `PyObject_RichCompareBool` is actually incorrectly implemented and it should be replaced with a new `PyObject_Identical` in most places where it is used.
In what way is PyObject_RichCompareBool incorrect? Can you point to a bug caused by this incorrect implementation?
Once you get to the point where you accept the existence of `identical` as a distinct operation, allowing `identical(NaN, NaN)` to be always true *can* make sense
We already have `identical` in the language, it is the `is` operator. Your "identical" function is misnamed, it should be "identical_or_equal".
If you want to argue that "identical or equal" is such a fundamental and important operation in Python code that we ought to offer it ready-made in the operator module, I'm listening. But my gut feeling here is to say "not every one-line expression needs to be in the stdlib".
PyObject_RichCompareBool is a different story. "Identical or equal" is not so simple to implement correctly in C code, and it is a common operation used in lists, tuples, dicts and possibly others, so it makes sense for there to be a C API for it.
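For comparison, a rough Python-level sketch of the same logic really is a one-liner. The name `identical_or_equal` is hypothetical (it is not in the operator module), and this is only an approximation of what `PyObject_RichCompareBool(a, b, Py_EQ)` does, not its actual implementation:

```python
def identical_or_equal(a, b):
    """Hypothetical helper: identity shortcut first, then == coerced to bool."""
    return a is b or bool(a == b)


nan = float("nan")
same_object = identical_or_equal(nan, nan)  # identity shortcut applies
distinct_objects = identical_or_equal(nan, float("nan"))  # falls back to ==
mixed_types = identical_or_equal(1, 1.0)  # ordinary equality still works
```

On CPython, `same_object` is True, `distinct_objects` is False (each `float("nan")` call creates a new object, and NaN never compares equal), and `mixed_types` is True.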
and resolves current inconsistencies w.r.t. containers and NaNs.
How does it resolve these (alleged) inconsistencies?
The current status quo is that containers perform operations such as equality by testing for identity or equality, which they are permitted to do and is documented. Changing them to use your "identical or equal" API will (as far as I can see) change nothing about the semantics, behaviour or even implementation (since the C-level containers like list will surely still call PyObject_RichCompareBool rather than a Python-level wrapper).
So whatever inconsistencies exist, they will still exist.
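The status quo I am describing can be checked from pure Python. This is a small illustration using only builtins, and the variable names are mine:

```python
nan = float("nan")

nan_equals_itself = (nan == nan)  # plain == follows the NaN rule

# Containers test identity first, then equality.
same_nan_in_list = nan in [nan]  # same object: identity shortcut
other_nan_in_list = float("nan") in [nan]  # different object: falls back to ==

lists_sharing_nan = ([nan] == [nan])  # same NaN object in both lists
lists_with_distinct_nans = ([float("nan")] == [float("nan")])
```

On CPython, `nan_equals_itself` is False, yet `same_nan_in_list` and `lists_sharing_nan` are True because of the identity shortcut, while `other_nan_in_list` and `lists_with_distinct_nans` are False. Routing these operations through an "identical or equal" function would give exactly the same results.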
If I have missed something, please tell me.