
On Wed, 24 Mar 2010 08:51:36 pm Mark Dickinson wrote:
On Wed, Mar 24, 2010 at 5:36 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Steven D'Aprano writes:
> As usual though, NANs are unintuitive:
>
> >>> d = {float('nan'): 1}
> >>> d[float('nan')] = 2
> >>> d
> {nan: 1, nan: 2}
>
> I suspect that's a feature, not a bug.
Right: distinct nans (i.e., those with different id()) are treated as distinct set elements or dict keys.
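A minimal sketch of that behaviour, assuming CPython, where dict lookup tries an identity check before falling back to ==. The names nan_a and nan_b are only illustrative:

```python
nan_a = float('nan')
nan_b = float('nan')   # a second, distinct NaN object

# A NaN never compares equal, not even to itself.
assert nan_a != nan_a

d = {nan_a: 1}
d[nan_b] = 2   # different object, unequal to nan_a: a second key is added
d[nan_a] = 3   # same object: the identity check finds the existing key

print(len(d))             # 2
print(d[nan_a])           # 3
print(float('nan') in d)  # False: a fresh NaN matches neither key
```

So a NaN key can only be looked up again through the very same object that was stored.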
I don't see how it can be so. Aren't all of those entries garbage? To compute a histogram of results for computations on a series of cases would you not have to test each result for NaN-hood, then hash on a proxy such as the string "Nan"?
Not necessarily -- you could merely ignore any key which is a NaN, or you could pass each key through this first:

    import math

    def intern_nan(x, nan=float('nan')):
        # Map every NaN onto one shared NaN object; pass other values through.
        if math.isnan(x):
            return nan
        return x

thus ensuring that all NaN keys were the same NaN.
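As a rough illustration of the histogram case, here is one way intern_nan might be used. The sample data in results and the use of collections.Counter are my own assumptions, not part of the original suggestion:

```python
import math
from collections import Counter

def intern_nan(x, nan=float('nan')):
    # Map every NaN onto one shared NaN object; pass other values through.
    if math.isnan(x):
        return nan
    return x

# Hypothetical results; each float('nan') call creates a distinct NaN object.
results = [1.0, float('nan'), 2.0, float('nan'), 1.0]

histogram = Counter(intern_nan(r) for r in results)

# Passing any NaN through intern_nan yields the shared key.
nan_key = intern_nan(float('nan'))
print(len(histogram))      # 3
print(histogram[nan_key])  # 2
```

Without the intern_nan step, each NaN would get its own bucket with a count of 1.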
So what alternative behaviour would you suggest, and how would you implement it?

[...]

One alternative would be to prohibit putting nans into sets and dicts by making them unhashable; I'm not sure what that would gain, though. And there would still be some unintuitive behaviour for containment testing of nans in lists.
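The list containment behaviour mentioned above can be sketched like this. It assumes the documented rule that membership tests on built-in containers treat an element as a match if it is the same object or compares equal:

```python
nan = float('nan')
items = [nan]

# The stored object is found through the identity check...
print(nan in items)           # True
# ...but a different NaN object is not, because NaN != NaN.
print(float('nan') in items)  # False

# count() follows the same rule.
print(items.count(nan))           # 1
print(items.count(float('nan')))  # 0
```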
I think that would be worse than the current situation. That would mean that dict[some_float] would *nearly always* succeed, but occasionally would fail. I can't see that being a good thing.

--
Steven D'Aprano