Good question!
I think this started with a valuable optimization for `x in <list>`. I don't know if that was ever carefully documented, but I remember that it was discussed a few times (and IIRC Raymond was adamant that this should be so optimized -- which is reasonable).
I'm tempted to declare this implementation-defined behavior -- *implicit* calls to __eq__ and __ne__ *may* be skipped if both sides are the same object depending on the whim of the implementation.
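To make the skipped call concrete, here is a minimal sketch as observed on CPython. The `Counted` class is a hypothetical helper, invented only to count how often `__eq__` runs; other implementations might behave differently, which is the point of calling it implementation-defined.

```python
class Counted:
    """Hypothetical helper: counts __eq__ calls and always claims inequality."""
    calls = 0

    def __eq__(self, other):
        Counted.calls += 1
        return False


x = Counted()

# Implicit comparison: the list's containment check sees that the
# element is the very same object and (on CPython) never calls __eq__.
found = x in [x]
calls_after_in = Counted.calls

# Explicit comparison: __eq__ is always called.
explicit = x == x
calls_after_eq = Counted.calls

print(found, calls_after_in)     # True 0
print(explicit, calls_after_eq)  # False 1
```

So `x in [x]` and `x == x` disagree here, and only the explicit `==` actually consulted `__eq__`.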
We should probably also strongly recommend that __eq__ and __ne__ not do what math.nan does, i.e. report an object as unequal to itself.
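For reference, this is roughly what the math.nan behavior looks like in practice (a small sketch run against CPython; the variable names are only for illustration):

```python
import math

nan = math.nan

explicit_eq = nan == nan               # explicit call: NaN is never equal to itself
in_list = nan in [nan]                 # same object, so the identity check wins
list_eq = [nan] == [nan]               # element comparison also checks identity first
fresh_in_list = float("nan") in [nan]  # a different NaN object: falls back to ==

print(explicit_eq, in_list, list_eq, fresh_in_list)
# False True True False
```

The last two results differ only because of object identity, which is the kind of surprise the recommendation is meant to avoid.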
However, we cannot stop rich compare __eq__ implementations that return arrays of pairwise comparisons, since numpy does this. (And yes, it seems that this means that `x in y` is computed incorrectly if x is an array with such an __eq__ implementation and y is a tuple of such objects. I'm sure there's a big warning somewhere in the numpy docs about this, and I presume if y is a numpy array they make sure to do something better.)
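A pure-Python sketch of that problem follows. `PairwiseVec` is a hypothetical stand-in, not numpy: it returns a plain list from `__eq__`, so the containment check just takes the truth value of that list. (Real numpy arrays with more than one element raise `ValueError` when truth-tested instead, so the failure there is louder.)

```python
class PairwiseVec:
    """Hypothetical stand-in for an array type whose __eq__ is elementwise."""

    def __init__(self, *items):
        self.items = list(items)

    def __eq__(self, other):
        # Returns a list of pairwise comparisons rather than a single bool.
        return [a == b for a, b in zip(self.items, other.items)]


x = PairwiseVec(1, 2)
y = PairwiseVec(3, 4)

elementwise = x == y   # [False, False]: no element matches
found = x in (y,)      # True: a non-empty list is truthy

print(elementwise, found)  # [False, False] True
```

Here `x in (y,)` reports a match even though no element of `x` equals the corresponding element of `y`.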