[Python-ideas] Consider adding clip or clamp function to math
Steven D'Aprano
steve at pearwood.info
Thu Aug 4 08:35:58 EDT 2016
On Tue, Aug 02, 2016 at 04:35:55PM -0700, Chris Barker wrote:
> If someone is passing a NaN in for a bound, then they are passing in
> garbage, essentially -- "I have no idea what my bounds are" so garbage is
> what they should get back -- "I have no idea what your clamped values are".
The IEEE 754 standard tells us what min(x, NAN) and max(x, NAN) should
be: in both cases it is x.
https://en.wikipedia.org/wiki/IEEE_754_revision#min_and_max
Quote:
    In order to support operations such as windowing in which a NaN
    input should be quietly replaced with one of the end points, min
    and max are defined to select a number, x, in preference to a
    quiet NaN:
        min(x,NaN) = min(NaN,x) = x
        max(x,NaN) = max(NaN,x) = x
According to Wikipedia, this behaviour was chosen specifically for
the use-case we are discussing: windowing or clamping.
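To make the quoted rule concrete, here is a minimal pure-Python sketch of a min and max that prefer a number over a NAN. The names ieee_min and ieee_max are hypothetical (they are not in the standard library), and the sketch ignores finer points such as signed zeros and signalling NANs.

```python
import math

def ieee_min(x, y):
    """Return the smaller of x and y, preferring a number over a NAN."""
    if math.isnan(x):
        return y          # y may itself be NAN; then the result is NAN
    if math.isnan(y):
        return x
    return min(x, y)

def ieee_max(x, y):
    """Return the larger of x and y, preferring a number over a NAN."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return max(x, y)
```

With these definitions the result does not depend on argument order, which is the property Kahan's notes ask for; the builtin min and max give no such guarantee when one argument is a NAN.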
See also page 9 of Professor William Kahan's notes here:
https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
Quote:
    For instance max{x, y} should deliver the same result as max{y, x} but
    almost no implementations do that when x is NaN. There are good
    reasons to define max{NaN, 5} := max{5, NaN} := 5 though many would
    disagree.
It's okay to disagree and want "NAN poisoning" behaviour. If we define
clamp(x, NAN, NAN) as x, as I have been arguing, then you can *easily*
get the behaviour you want with a simple wrapper:
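    import math

    def clamp(x, lower, upper):
        if math.isnan(lower) or math.isnan(upper):
            # NAN poisoning: return NAN (or raise an exception instead)
            return float("nan")
        else:
            # math.clamp is the function proposed in this thread
            return math.clamp(x, lower, upper)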
Apart from the cost of one extra function call, which isn't too bad,
this is no more expensive than what you are suggesting *everyone* should
pay (two calls to math.isnan). So you are no worse off under my
proposal: just define your own helper function, and you get the
behaviour you want. We all win.
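Since math.clamp does not exist yet, the following is a self-contained sketch of both behaviours. The names ieee_clamp and poisoning_clamp are hypothetical, and ieee_clamp is only one possible way to get "a NAN bound means no bound"; it relies on the fact that every ordering comparison with a NAN is false.

```python
import math

def ieee_clamp(x, lower, upper):
    """Clamp x to [lower, upper]; a NAN bound is treated as 'no bound'."""
    # x < NAN and x > NAN are both False, so a NAN bound never applies.
    if x < lower:
        return lower
    if x > upper:
        return upper
    return x

def poisoning_clamp(x, lower, upper):
    """Wrapper giving NAN-poisoning behaviour on top of ieee_clamp."""
    if math.isnan(lower) or math.isnan(upper):
        return float("nan")   # or raise an exception instead
    return ieee_clamp(x, lower, upper)
```

Under these assumptions ieee_clamp(x, NAN, NAN) returns x unchanged, and only callers who ask for poisoning_clamp pay for the two math.isnan calls.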
But if the standard clamp() function has the behaviour you want,
violating IEEE-754, then you are forcing it on *everyone*, whether they
want it or not. I don't want it, and I cannot use it. There's nothing I
can do except re-implement clamp() from scratch and ignore the one in
the math library.
As you propose it, clamp() is no use to me: it unnecessarily converts the
bounds to float, which may raise an exception. If I use it in a loop, it
unnecessarily checks to see if the bounds are NANs, over and over and
over again, even when I know that they aren't. It does the wrong thing
(according to my needs, according to Professor Kahan, and according to
the current revision of IEEE-754) if I do happen to pass a NAN as bounds.
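The point about loops can be illustrated with a rough sketch. A caller who does want the bounds validated can do it once, outside the loop, instead of having the clamp function repeat the check for every value. The helper name clamp_all is hypothetical, and raising ValueError is just one possible policy for bad bounds.

```python
import math

def clamp_all(values, lower, upper):
    """Clamp each value to [lower, upper], validating the bounds only once."""
    # One NAN check for the whole batch, not one per value.
    if math.isnan(lower) or math.isnan(upper):
        raise ValueError("bounds must not be NAN")
    # The bounds are used as given, with no conversion to float.
    return [lower if v < lower else upper if v > upper else v
            for v in values]
```

Because the bounds are compared as given, the same helper should also work with types such as Fraction or Decimal, as long as they support ordering comparisons with the values.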
Numpy has a "nanmin" which ignores NANs (as specified by IEEE-754), and
"amin" which propagates NANs:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.nanmin.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.amin.html
Similarly for the element-wise minimums: "fmin" ignores NANs and
"minimum" propagates them.
By the way, there are also POSIX functions fmin and fmax which behave
according to the standard:
http://man7.org/linux/man-pages/man3/fmin.3.html
http://man7.org/linux/man-pages/man3/fmax.3.html
Julia has a clamp() function, although unfortunately the documentation
doesn't say what the behaviour with NANs is:
http://julia.readthedocs.io/en/latest/stdlib/math/#Base.clamp
--
Steve