On Tue, Aug 02, 2016 at 10:02:11AM -0700, Chris Barker wrote:
> I don't think IEEE 754 says anything about a "clip" function, but a NaN is neither greater than, less than, nor equal to any value -- so when you ask, for example, whether the input value is less than or equal to NaN,
Then the answer MUST be False. That's specified by IEEE-754.
> but if NaN is greater than the input, there is no answer -- the spirit of IEEE NaN handling leads to NaN being the only correct result.
Incorrect. The IEEE standard actually does specify the behaviour of comparisons with NANs, and Python does it correctly. See also the Decimal module.
> Note that I'm pretty sure that min() and max() are wrong here, too.
In a later update to the standard, IEEE-854 if I remember correctly, there's a whole series of extra comparisons which will return NAN given a NAN argument, including alternate versions of max() and min(). I can't remember which is in 754 and which in 854, but there are two versions of each:
    min #1 (x, NAN) must return x
    min #2 (x, NAN) must return NAN
and same for max.
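As a rough illustration, here is what Python's built-in min() does with a NAN, next to a sketch of the two behaviours. The names min_ignore_nan and min_propagate_nan are made up for this example; they are not functions from the standard or from the math module.

```python
import math

NAN = float('nan')

# Built-in min() keeps its first argument unless a later one compares less,
# and every comparison with a NAN is False, so the result depends on order.
print(min(1.0, NAN))  # 1.0
print(min(NAN, 1.0))  # nan

def min_ignore_nan(x, y):
    """Version #1 (hypothetical name): a NAN argument is ignored."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)

def min_propagate_nan(x, y):
    """Version #2 (hypothetical name): a NAN argument poisons the result."""
    if math.isnan(x) or math.isnan(y):
        return NAN
    return min(x, y)
```

The max() versions would follow the same pattern.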
In any case, clamping is based on < and > comparisons, which are well-specified by IEEE 754 even when NANs are included:
    # pseudo-code
    for op in ( < <= == >= > ):
        assert all(x op NAN is False for all x)

    assert all(x != NAN is True for all x)
If you want the comparisons to return NANs, you're looking at different comparisons from a different standard.
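A runnable spot-check of those comparison rules might look like the following. The list of sample values is arbitrary, so this only checks a handful of cases and proves nothing in general.

```python
import operator

NAN = float('nan')
samples = [-1.5, 0, 1, 10**30, float('inf'), float('-inf'), NAN]

# Ordered and equality comparisons against a NAN are always False...
for op in (operator.lt, operator.le, operator.eq, operator.ge, operator.gt):
    assert not any(op(x, NAN) for x in samples)

# ...and "not equal" is always True, even for NAN != NAN.
assert all(x != NAN for x in samples)
```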
> > That's why I said that it was an accident of implementation that passing a NAN as one of the lower or upper bounds will be equivalent to setting the bounds to minus/plus infinity:
>
> exactly -- and we should not have the results be an accident of implementation -- but rather be thought out, and follow IEEE 754 intent.
There are lots of places in Python where the behaviour is an accident of implementation. I don't think that this clamp() function should convert the arguments to floats (which may fail, or lose precision) just to prevent the caller passing a NAN as one of the bounds. Just document the fact that you shouldn't use NANs as lower/upper bounds.
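To make the "accident of implementation" concrete, here is a minimal sketch, assuming clamp() is written with nothing but < and > comparisons. The signature and the None defaults are assumptions for illustration, not a settled API.

```python
from fractions import Fraction

def clamp(value, lower=None, upper=None):
    """Hypothetical clamp(): uses only < and >, never converts its arguments."""
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value

NAN = float('nan')

print(clamp(5, 0, 3))                    # 3
print(clamp(5, NAN, NAN))                # 5: both comparisons are False
print(clamp(Fraction(1, 2), 0.75, 100))  # 0.75
```

Because value < NAN and value > NAN are both False, a NAN bound never triggers a clamp, which is the same outcome as an infinite bound.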
> why not say that passing NaNs as bounds will result in a NaN result?
Because that means that EVERY call to clamp() has to convert both bounds to float and see if they are NANs. If you're calling this in a loop:
    for x in range(1000):
        print(clamp(x, lower, upper))
each bound gets converted to float and checked for NAN-ness 1000 times. This is a total waste of effort for 99.999% of uses, where the bounds will be numbers.
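Besides the wasted work, the conversion itself can go wrong. The sketch below assumes a strict variant, here called clamp_checked (a hypothetical name), that tests both bounds with math.isnan() on every call.

```python
import math
from fractions import Fraction

def clamp_checked(value, lower, upper):
    """Hypothetical strict clamp(): checks both bounds for NAN on every call."""
    if math.isnan(lower) or math.isnan(upper):  # converts each bound to float
        return float('nan')
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

# A perfectly good integer bound that is too big to convert to a float:
try:
    clamp_checked(5, 0, 10**400)
except OverflowError as err:
    print("check failed:", err)

# Converting an exact bound to float can also lose precision:
third = Fraction(1, 3)
print(Fraction(float(third)) == third)  # False
```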
> At least if the value is a float -- if it's anything else, then maybe an exception, as NaN does not make sense for anything else anyway.
Of course it does: clamp() can change the result type, so it could return a NAN. But why would you bother?
    clamp(Fraction(1, 2), 0.75, 100) returns 0.75;
    clamp(100, 0.0, 50.0) returns 50.0;
> > If you want to specify an unbounded limit, pass None or an infinity with the right sign.
>
> exactly -- that's there, so why not let NaN be NaN?
Because it is unnecessary.
If you want a NAN-enforcing version of clamp(), it is *easy* to write a wrapper:
    def clamp_nan(value, lower, upper):
        if math.isnan(lower) or math.isnan(upper):
            return float('nan')
        return clamp(value, lower, upper)
A nice, easy four-line function. But if clamp() does that check, it's hard to avoid the checks when you don't want them. I know my bounds aren't NANs, and I'm calling clamp() in a big loop. Don't check them a million times, they're never going to be NANs, just do the comparisons.
It's easy to write a stricter function if you need it. It's hard to write a less strict function when you don't want the strictness.
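One way to get the strictness without paying for it on every call is to validate the bounds once, outside the loop. This is only a sketch: clamp() here is a stand-in lenient version, and clamp_all is a hypothetical helper name, not part of any proposal.

```python
import math

def clamp(value, lower, upper):
    """Stand-in lenient clamp(): comparisons only, no NAN checks."""
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

def clamp_all(values, lower, upper):
    """Hypothetical strict helper: check the bounds once, then loop."""
    if math.isnan(lower) or math.isnan(upper):
        return [float('nan') for _ in values]
    return [clamp(v, lower, upper) for v in values]

print(clamp_all(range(8), 2, 5))  # [2, 2, 2, 3, 4, 5, 5, 5]
```

The NAN test runs once per batch instead of once per value, and callers who never pass NAN bounds can keep calling the lenient clamp() directly.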