<div dir="ltr">Sorry, <div><br></div><div>This slipped off list -- bringing it back.</div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 26, 2015 at 12:40 PM, Paul Moore <span dir="ltr"><<a href="mailto:p.f.moore@gmail.com" target="_blank">p.f.moore@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> Any of the approaches on the table will do something reasonable in this<br>

> case:<br>

><br>

> In [4]: is_close_to.is_close_to(sum([0.1]*10), 1)<br>

> testing: 0.9999999999999999 1<br>

> Out[4]: True<br>

<br>

</span>Yes, but that's not my point. I was responding to Steven's comment<br>

that having 2 different types of tolerance isn't "arcana", by pointing<br>

out that I find even stuff as simple as multiplication vs cumulative<br>

addition confusing. And I should note that I was (many years ago!) a<br>

maths graduate and did some numerical maths courses, so this stuff<br>

isn't completely unknown to me.</blockquote><div><br></div><div>Right, it can be arcane -- which is why I want this function, and why we want it to do something "sane" most of the time, by default.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> Note that the 1e-8 default I chose (which I am not committed to) is not<br>

> ENTIRELY arbitrary -- it's about half the digits carried by a python float<br>

> (double) -- essentially saying the values are close to about half of the<br>

> precision available. And we are constrained here, the options are between<br>

> 0.1 (which would be crazy, if you ask me!) and 1e-14 -- any larger an it<br>

> would meaningless, and any smaller, and it would surpass the precision of a<br>

> python float. PIcking a default near the middle of that range seems quite<br>

> sane to me.<br>

<br>

</span>Sorry, that means nothing to me. Head exploding time again :-)</blockquote><div><br></div><div>Darn -- I'll try again -- with a relative tolerance, two values are only going to be close if their exponents are within one of each other. So what you are setting is how many digits of the mantissa you care about. A tolerance of 0.1 would be about one digit, and a tolerance of 1e-15 would be 15 digits. Python floats carry about 15 digits -- so the relative tolerance has to be between 1e-1 and 1e-15 -- nothing else is useful or makes sense. So I put it in the middle: 1e-8</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">

> This is quite different than setting a value for an absolute tolerance --<br>

> saying something is close to another number if the difference is less than<br>

> 1e-8 would be wildly inappropriate when the smallest numbers a float can<br>

> hold are on order of 1e-300!<br>

<br>

</span>On the other hand, I find this completely obvious. (Well, mostly -<br>

don't the gaps between the representable floats increase as the<br>

magnitude gets bigger, so an absolute tolerance of 1e-8 might be<br>

entirely reasonable when the numbers are sufficiently high? </blockquote><div><br></div><div>sure it would -- that's the point -- what makes sense as an absolute tolerance depends entirely on the magnitude of the numbers -- since we don't know the magnitude of the numbers someone may use, we can't set a reasonable default.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> arcana, maybe, not it's not a floating point issue -- X% of zero is zero<br>

> absolutely precisely.<br>

<br>

</span>But the "arcana" I was talking about is that a relative error of X%<br>

could be X% of the value under test, of the expected value, of their<br>

average, or something else.</blockquote><div><br></div><div>Ahh! -- which is exactly the point I think some of us are making -- defining X% error relative to the "expected" value is the simplest and most straightforward to explain. That's the primary reason I prefer it.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">And only *one* of those values is zero, so<br>

whether X% is a useful value is entirely dependent on the definition.<br></blockquote><div><br></div><div>not sure what you meant here, but actually relative error goes to heck if either value is zero, and with any of the definitions we are working with. So X% is useful for any value except if one of the values is zero.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

And how relative errors are defined *is* floating point arcana (I can<br>

picture the text book page now, and it wasn't simple...)<br></blockquote><div><br></div><div>semantics here -- defining a relative error can be done with pure real numbers -- computing it can get complex with floating point.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">

> But back to a point made earlier -- the idea here is to provide something<br>

> better than naive use of<br>

><br>

> x == y<br></span></blockquote><div><snip> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I still wonder whether "naive use of equality" is much of a target,<br>

though. There are only two use cases that have been mentioned so far.<br>

Testing is not about equality, because we're replacing<br>

assertAlmostEqual. And when someone is doing an iterative algorithm,<br>

they are looking for convergence, i.e. within a given range of the<br>

answer. So neither use case would be using an equality test anyway.<br></blockquote><div><br></div><div>Well, the secondary target is a better (or more flexible) assertAlmostEqual. The existing one is not suitable for arbitrarily large or small numbers, and particularly not for numbers with a range of magnitudes -- a relative difference test is much needed.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I'm not sure I follow your point, but I will say that if Nathaniel has<br>

seen a lot of use cases for assertAlmostEqual that can't be easily<br>

handled with the new function, then something is badly wrong. </blockquote><div><br></div><div>Well, I'm not suggesting that we replace assertAlmostEqual -- but rather augment it. In fact, assertAlmostEqual is actually an absolute tolerance test (expressed in terms of decimal places). That is the right thing, and the only right thing, to use when you want to compare to zero.</div><div><br></div><div>What I'm proposing is a relative tolerance test, which is not the right thing to use for comparing to zero, but is the right thing to use when comparing numbers of varying magnitude.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">There<br>

aren't enough good use cases that we can reasonably decide to reject<br>

any of them as out of scope,</blockquote><div><br></div><div>I've lost track of what we might be rejecting.</div><div><br></div><div>The whole symmetric vs. non-symmetric argument really is bike-shedding -- in the context of "better than ==" or "different but as good as assertAlmostEqual" -- any of them are just fine.</div><div><br></div><div>So really all we are left with is defaults -- also bike-shedding, except for the default for the zero test, and there are really two options there:</div><div><br></div><div>Use 0.0 for abs_tolerance, and have it fail for any test against zero unless the user specifies something appropriate for their use case.</div><div><br></div><div>or</div><div><br></div><div>Use some nonzero default, either for abs_tolerance or zero_tolerance, and make an assumption about the order of magnitude of the likely results, so that it will "just work" for tests against zero. Maybe something small relative to one (like 1e-8) would be OK, but that concerns me -- I think you'd get false positives for small numbers, which is worse than false negatives for all comparisons to zero.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> 1e-8 -- but you already know that ;-) -- anything between 1e-8 and 1e-12<br>

> would be fine with me.<br>

<br>

</span>TBH all I care about in this context is that there must be 2 values x<br>

and y for which is_close(x,y) == True and x != y.</blockquote><div><br></div><div>everything on the table will do that.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I'm tempted to<br>

strengthen that to "for all y there must be at least 1 x such that..."<br>

but it's possible that's *too* strong and it can't be achieved.<br>

</blockquote><div><br></div><div>I think we could say that for all y except 0.0 -- and even for zero if an abs_tolerance greater than zero is set.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Basically, the default behaviour needs to encompass what I believe is<br>

most people's intuition - that "close" is a proper superset of<br>

"equal".<br></blockquote><div><br></div><div>A good reason not to have all defaults be zero -- I don't think we need a function that doesn't work at all with default values.</div><div> </div><div>-Chris</div><div><br></div></div><div><br></div>-- <br><div class="gmail_signature"><br>Christopher Barker, Ph.D.<br>Oceanographer<br><br>Emergency Response Division<br>NOAA/NOS/OR&R            (206) 526-6959   voice<br>7600 Sand Point Way NE   (206) 526-6329   fax<br>Seattle, WA  98115       (206) 526-6317   main reception<br><br><a href="mailto:Chris.Barker@noaa.gov" target="_blank">Chris.Barker@noaa.gov</a></div>

</div></div>
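
The relative-versus-absolute tolerance trade-off discussed in this message can be illustrated with a minimal sketch. It is not the proposed implementation: the function name `is_close_sketch`, the parameter names `rel_tol` and `abs_tol`, and the choice of scaling the relative tolerance by the expected value are all assumptions made for illustration. The defaults (1e-8 relative, 0.0 absolute) mirror the first of the two options described above.

```python
def is_close_sketch(actual, expected, rel_tol=1e-8, abs_tol=0.0):
    """Hypothetical helper: True if actual is within rel_tol of expected
    (scaled by the expected value) or within abs_tol in absolute terms."""
    diff = abs(actual - expected)
    return diff <= rel_tol * abs(expected) or diff <= abs_tol


# Cumulative addition is not exactly equal to 1.0, but it is close.
total = sum([0.1] * 10)
print(total == 1.0, is_close_sketch(total, 1.0))

# With abs_tol=0.0, no nonzero value is close to zero.
print(is_close_sketch(1e-12, 0.0))

# An explicit absolute tolerance makes the zero test work.
print(is_close_sketch(1e-12, 0.0, abs_tol=1e-8))

# A nonzero default abs_tol would also call these two small numbers
# close, even though they differ by ten orders of magnitude.
print(is_close_sketch(1e-30, 1e-20, abs_tol=1e-8))
```

The last call shows the false-positive concern: an absolute tolerance chosen "relative to one" swallows any pair of numbers much smaller than the tolerance itself.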