[Python-ideas] Consider adding clip or clamp function to math

Tue Aug 2 14:22:04 EDT 2016

On Tue, Aug 02, 2016 at 10:02:11AM -0700, Chris Barker wrote:

> I don't think IEEE 754 says anything about a "clip" function, but a NaN is
> neither greater than, less than, nor equal to any value -- so when you ask
> whether, for example, the input value is less than or equal to NaN,

Then the answer MUST be False. That's specified by IEEE-754.

> but if NaN is greater than the input, there is no answer -- the spirit of
> IEEE NaN handling leads to NaN being the only correct result.

Incorrect. The IEEE standard actually does specify the behaviour of 
comparisons with NANs, and Python does it correctly. See also the 
Decimal module.

> Note that I'm pretty sure that min() and max() are wrong here, too.

In a later update to the standard, IEEE-854 if I remember correctly, 
there's a whole series of extra comparisons which will return NAN given 
a NAN argument, including alternate versions of max() and min(). I can't 
remember which is in 754 and which in 854, but there are two versions of 
each:

min #1 (x, NAN) must return x
min #2 (x, NAN) must return NAN

and same for max.
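
To make the two variants concrete, here is a minimal sketch. The names min_number and min_propagating are made up for illustration and are not taken from either standard or from any library. For comparison, Python's builtin min() is neither variant: its result depends on the argument order.

```python
import math

NAN = float("nan")

def min_number(x, y):
    """Variant #1 (hypothetical name): a NaN is treated as missing data."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x < y else y

def min_propagating(x, y):
    """Variant #2 (hypothetical name): any NaN operand makes the result NaN."""
    if math.isnan(x) or math.isnan(y):
        return NAN
    return x if x < y else y

# The builtin keeps its first argument unless a later one compares less,
# and every comparison against NaN is False.
builtin_nan_last = min(1.0, NAN)    # the number survives
builtin_nan_first = min(NAN, 1.0)   # the NaN survives
```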

In any case, clamping is based on < and > comparisons, which are 
well-specified by IEEE 754 even when NANs are included:

# pseudo-code
for op in ( < <= == >= > ):
    assert all(x op NAN is False for all x)

assert all(x != NAN is True for all x)

If you want the comparisons to return NANs, you're looking at different 
comparisons from a different standard.
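
A runnable spot check of the same claims in Python. It cannot cover "all x", so it uses a handful of sample values, which are an arbitrary choice made for this sketch:

```python
import math
import operator

NAN = float("nan")

# Arbitrary sample values, including infinities, an int and NaN itself.
samples = [-math.inf, -1.5, 0, 1, 2.5, math.inf, NAN]

ordered_ops = (operator.lt, operator.le, operator.eq, operator.ge, operator.gt)

# Every ordered or equality comparison against NaN is False ...
results = [op(x, NAN) for op in ordered_ops for x in samples]
assert not any(results)

# ... and != is the one comparison that is True.
assert all(x != NAN for x in samples)
```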

> > That's why I said that it was an accident of implementation that passing
> > a NAN as one of the lower or upper bounds will be equivalent to setting
> > the bounds to minus/plus infinity:
> 
> exactly -- and we should not have the results be an accident of
> implementation -- but rather be thought out, and follow IEEE 754 intent.

There are lots of places in Python where the behaviour is an accident of 
implementation. I don't think that this clamp() function should convert 
the arguments to floats (which may fail, or lose precision) just to 
prevent the caller passing a NAN as one of the bounds. Just document the 
fact that you shouldn't use NANs as lower/upper bounds.
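
As an illustration of that "accident", here is a minimal sketch of a comparison-based clamp(). It is an assumed implementation for this example only, not the function actually proposed for the math module. Because value < NAN and value > NAN are both False, a NAN bound never triggers, so it behaves like an infinite bound on that side, and nothing is ever converted to float.

```python
def clamp(value, lower=None, upper=None):
    """Hypothetical clamp built only on < and >; None means unbounded."""
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value

NAN = float("nan")

# A NAN lower bound acts like minus infinity ...
low_ignored = clamp(-10**6, NAN, 10)
# ... while the other, real bound still applies.
high_applied = clamp(50, NAN, 10)
# A NAN upper bound acts like plus infinity.
high_ignored = clamp(10**6, 0, NAN)
```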

> why not say that passing NaNs as bounds will result in NaN result?

Because that means that EVERY call to clamp() has to convert both bounds 
to float and see if they are NANs. If you're calling this in a loop:

for x in range(1000):
    print(clamp(x, lower, upper))

each bound gets converted to float and checked for NAN-ness 1000 times. 
This is a total waste of effort for 99.999% of uses, where the bounds 
will be numbers.

> At least
> if the value is a float -- if it's anything else then maybe an exception,
> as NaN does not make sense for anything else anyway.

Of course it does: clamp() can change the result type, so it could 
return a NAN. But why would you bother?

clamp(Fraction(1, 2), 0.75, 100) returns 0.75;
clamp(100, 0.0, 50.0) returns 50.0;
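
Those two results can be reproduced with a comparison-based sketch. The clamp() below is an assumed implementation used only for this example: it returns whichever object wins the comparison, so the result takes the type of the bound when the value is out of range.

```python
from fractions import Fraction

def clamp(value, lower, upper):
    """Hypothetical clamp built only on < and >; returns the winning object."""
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

below = clamp(Fraction(1, 2), 0.75, 100)   # Fraction in, float bound out
above = clamp(100, 0.0, 50.0)              # int in, float bound out
inside = clamp(Fraction(1, 2), 0, 1)       # in range: value comes back unchanged
```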

> > If you
> > want to specify an unbounded limit, pass None or an infinity with the
> > right sign.
> 
> exactly -- that's there, so why not let NaN be NaN?

Because it is unnecessary.

If you want a NAN-enforcing version of clamp(), it is *easy* to write a 
wrapper:

import math

def clamp_nan(value, lower, upper):
    if math.isnan(lower) or math.isnan(upper):
        return float('nan')
    return clamp(value, lower, upper)

A nice, easy four-line function. But if clamp() does that check, it's 
hard to avoid the checks when you don't want them. I know my bounds 
aren't NANs, and I'm calling clamp() in a big loop. Don't check them a 
million times, they're never going to be NANs, just do the comparisons.

It's easy to write a stricter function if you need it. It's hard to 
write a less strict function when you don't want the strictness.
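
One way a caller who does want the strictness could still avoid the per-call cost is to check the bounds once, outside the loop. The sketch below assumes a plain comparison-based clamp(), and clamp_many_nan is a made-up helper name, not part of any proposal. Like clamp_nan, it assumes both bounds are real numbers rather than None.

```python
import math

def clamp(value, lower, upper):
    """Hypothetical non-strict clamp: comparisons only, no NAN checks."""
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value

def clamp_many_nan(values, lower, upper):
    """Hypothetical strict helper: one NAN check for the whole batch."""
    if math.isnan(lower) or math.isnan(upper):
        return [float("nan") for _ in values]
    # The bounds are known to be non-NAN from here on, so just compare.
    return [clamp(v, lower, upper) for v in values]

clamped = clamp_many_nan(range(5), 1, 3)
poisoned = clamp_many_nan(range(5), float("nan"), 3)
```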

-- 
Steve