
On 27 January 2015 at 14:28, Nick Coghlan <ncoghlan@gmail.com> wrote:
Translate that into explicit English and I'm not sure a symmetric definition reads more clearly:
"a and b are close to each other" "a is close to b" "b is close to a"
However, in programming terms,

    are_close(a, b)
    is_close_to(a, b)
    is_close_to(b, a)

the latter two have the "which is the target" issue. And yes, real code will have more obvious argument names. It's not a huge deal, I agree. I'm just saying that the first form takes less mental effort to parse while reading through a block of code. Enough said. It's not a big deal; someone (not me) ultimately needs to make the decision. I've explained my view, so I'll stop.
Given that the "is close to" formulation also simplifies the calculation of a relative tolerance (it's always relative to the right hand operand), it has quite a bit to recommend it.
Agreed. It's a trade-off, and my expectation is that most code will simply use the defaults, so making that read better is a good choice. If you believe that most people will explicitly set a tolerance of some form, the asymmetric choice may well be better.
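To make that asymmetry concrete, here is a minimal sketch of a check whose relative tolerance is taken against the right hand operand only (the name and the bare-bones logic are illustrative, not code from the PEP):

    def is_close_to(value, reference, rel_tol=1e-8):
        # Sketch only: the allowed error scales with the reference
        # (right hand) operand, so swapping the arguments can change
        # the answer.
        return abs(value - reference) <= rel_tol * abs(reference)

    # With a deliberately large tolerance the asymmetry is easy to see:
    is_close_to(9.05, 10.0, rel_tol=0.1)   # True:  0.95 <= 0.1 * 10.0
    is_close_to(10.0, 9.05, rel_tol=0.1)   # False: 0.95 >  0.1 * 9.05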
With an asymmetric comparison, another alternative would be to have an explicit threshold value for the reference where it switched from relative to absolute tolerance checking. That is:
    def is_close_to(value, reference, *, error_ratio=1e-8,
                    near_zero_threshold=1e-6,
                    near_zero_tolerance=1e-14):
        """Check if the given value is close to a reference value

        In most cases, the two values are close if
        'abs(value-reference) < reference*error_ratio'.
        If abs(reference) < near_zero_threshold, or near_zero_threshold
        is None, the values are close if
        'abs(value-reference) < near_zero_tolerance'.
        """
Eep. All I can say is that I never expect to write code where I'd even consider changing the parameters as documented there. I don't think I could understand the implications well enough to trust my judgement. Remember, my intuition hits its limit at "within 1 millionth of a percent of each other" (that's the 1e-8) or "numbers under 1e-6 have to differ by no more than 1e-14" (the other two). And I'd punt on what might happen if both conditions apply.
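For what it's worth, one possible body for that signature, written directly from the docstring (this is just one reading of it, not code from the thread; abs(reference) is used so that a negative reference still gives a positive allowed error):

    def is_close_to(value, reference, *, error_ratio=1e-8,
                    near_zero_threshold=1e-6,
                    near_zero_tolerance=1e-14):
        # Check the None case first, since 'abs(reference) < None' is a
        # TypeError in Python 3.
        if near_zero_threshold is None or abs(reference) < near_zero_threshold:
            return abs(value - reference) < near_zero_tolerance
        return abs(value - reference) < abs(reference) * error_ratio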
Setting near_zero_threshold to 0 would force a relative comparison (even near zero), while setting it to None would force an absolute one (even far away from zero).
The latter choice would make the name "near_zero_tolerance" a pretty odd thing to see...
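Continuing the sketch above, those two settings would behave like this (illustrative only):

    # near_zero_threshold=0 forces the relative test even against zero,
    # so nothing but an exact match is close to 0.0:
    is_close_to(1e-15, 0.0, near_zero_threshold=0)    # False
    is_close_to(1e-15, 0.0)                           # True (default near-zero handling)

    # near_zero_threshold=None forces the absolute test everywhere, so a
    # small relative error on a large reference now fails:
    is_close_to(1e6 + 0.001, 1e6, near_zero_threshold=None)  # False: 0.001 >= 1e-14
    is_close_to(1e6 + 0.001, 1e6)                             # True:  0.001 < 1e6 * 1e-8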
If you look at the default values, this is actually a very similar definition to the one Chris has in PEP 485, as the default near zero tolerance is the default error ratio multiplied by the default near zero threshold (1e-8 * 1e-6 = 1e-14), although I'm not sure as to the suitability of those numbers.
The difference is that this takes the cutoff point between using a relative error definition (to handle the dynamic range issues of a floating point representation) and an absolute error definition (to handle the instability of relative difference near zero) and *gives it a name*, rather than deriving it from a confusing combination of the reference value, the error ratio and the near zero tolerance.
That's it. Anyone wanting to specify both parameters together, or wanting the defaults to still apply "as well as" an explicitly specified tolerance, is deemed an "expert" and should be looking for a more specialised function (or writing their own).
I believe breaking out the cutoff point as a separately named parameter makes the algorithm easy enough to explain that restricting it isn't necessary.
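To spell out where that cutoff sits in the derived formulation (assuming the relative allowance error_ratio * abs(reference) is simply weighed against a fixed near zero tolerance), the switchover happens where the two allowances are equal:

    # Rough arithmetic only, using the defaults quoted above:
    error_ratio = 1e-8
    near_zero_tolerance = 1e-14
    crossover = near_zero_tolerance / error_ratio   # roughly 1e-06
    # i.e. the same value as the default near_zero_threshold; the
    # proposal above just gives that point an explicit name.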
I'd love to see the proposed documentation, as I think it would probably read as "complicated stuff, leave well alone" to most people. But I *am* assuming a target audience that currently uses "abs(x-y)<1e-8" [1], and unittest's assertAlmostEqual, and doesn't think they need anything different. The rules are different if the target audience is assumed to know more than that.

Paul

[1] Someone, it may have been Chris or it may have been someone else, used that snippet, and I've seen 1e-8 turn up elsewhere in books on numerical algorithms. I'm not sure why people choose 1e-8 (precision of a C float?), and how it relates to the 1e-6, 1e-8 and 1e-14 you chose for your definition. It feels like the new function may be a lot stricter than the code people naively (or otherwise) write today. Is that fair, or am I reading too much into some arbitrary numbers? (Note - I won't understand the explanation, I'm happy with just "that's a good point" or "no, the numbers ultimately chosen will be fine" :-))
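To put the "is it stricter?" question in numbers, here is a rough comparison of the naive absolute check against a 1e-8 relative check (using the simple relative test from the sketch earlier, not whatever the final function does):

    x, y = 1e10, 1e10 + 1.0
    abs(x - y) < 1e-8            # False: the absolute test rejects a 1e-10 relative error
    abs(x - y) < 1e-8 * abs(y)   # True:  the relative test accepts it

    x, y = 1e-12, 2e-12
    abs(x - y) < 1e-8            # True:  the absolute test accepts values differing by a factor of two
    abs(x - y) < 1e-8 * abs(y)   # False: the relative test rejects them

    # So a relative check is stricter than abs(x-y) < 1e-8 for small
    # numbers and looser for large ones; which effect bites depends on
    # the magnitudes involved.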