Way to check for floating point "closeness"?

Now that we're talking about floating point conveniences (math.nan, linspace): what about putting an almost_equal(x, y, tol=1e-14) (or close(), or ...) in the math module?

AFAICT, in the standard library there is only unittest.TestCase.assertAlmostEqual, but it:

A) is buried in the unittest.TestCase class
B) is an assertion, so you can't (easily) use it as a general test
C) uses a number of decimal digits or an absolute delta, but does not provide a significant-figures comparison, which is most likely what's wanted (and a bit harder to write yourself)

numpy provides allclose() (and isclose()), which is pretty much what I'm suggesting.

Anyone else think this would be a good idea to add to the stdlib?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer, Emergency Response Division, NOAA/NOS/OR&R
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chris.Barker@noaa.gov
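For concreteness, a minimal sketch of the kind of function being proposed -- a purely relative comparison, with a made-up name and default tolerance, and no special handling of NaNs, infinities, or zero yet:

    def almost_equal(x, y, tol=1e-14):
        # The allowed difference scales with the magnitude of the larger input,
        # so this behaves like an "agree to roughly N significant figures" test.
        if x == y:
            return True                 # covers the exact case, including 0 == 0
        return abs(x - y) <= tol * max(abs(x), abs(y))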

On Mon, 12 Jan 2015 12:04:47 -0800 Guido van Rossum <guido@python.org> wrote:
So is 1e-100 close to 1e-50 with the default tolerance? What about 1e+100 and (1e+100 plus one ulp)?
Indeed, there are different ways to express such a requirement. Numpy uses a combination of relative and absolute difference (see http://docs.scipy.org/doc/numpy/reference/generated/numpy.isclose.html#numpy...), but you may also want a maximum number of ulps differing. You also need special code for infinities and the like. (and in some cases you may want to distinguish positive and negative zeros, even though they are infinitely close :-)) Regards Antoine.

On Mon, Jan 12, 2015 at 12:04 PM, Guido van Rossum <guido@python.org> wrote:
So is 1e-100 close to 1e-50 with the default tolerance? What about 1e+100 and (1e+100 plus one ulp)?
Sorry, I should not have specified a specific default, and/or should have explained more about what I think it should mean. But I thought the first question was "might this be a good idea for the standard library?" -- and only if so, then we can work out how to do it. But anyway, the whole point is that it would be some version of "relative error", NOT an absolute tolerance -- that is pretty easy to write. See Nathaniel's note for the too many options already in numpy.

Nathaniel Smith wrote:

Unfortunately this opens a tremendous can of worms.

Well, yes, but something generally useful for many cases would still be nice.

Boost has thought about this a lot and advocates a slightly different definition (actually, two slightly different definitions) from any of the above.
It looks like they start with two, but end up with "the implementation is using modified version of the equations (1) and (2) where all underflow, overflow conditions could be guarded safely". That looks good to me, and the origins in Knuth are a good sign. -Chris

On Mon, Jan 12, 2015 at 5:02 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Unfortunately this opens a tremendous can of worms. Numpy actually provides:

allclose/assert_allclose -- absolute + relative tolerance, not symmetric
assert_approx_equal -- agreement to n digits (don't use this)
assert_array_max_ulp, assert_array_almost_equal_nulp -- tolerance based on counting ulps. Something like a relative tolerance of n*2^-53 for doubles, but it depends on datatype bits and acts differently near zero. I have no idea what the difference between these two functions is.

I've recently argued that allclose should actually act slightly differently: http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html

Boost has thought about this a lot and advocates a slightly different definition (actually, two slightly different definitions) from any of the above: http://www.boost.org/doc/libs/1_34_0/libs/test/doc/components/test_tools/flo...

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

On Mon, Jan 12, 2015 at 09:02:17AM -0800, Chris Barker wrote:
I do, and I have already done so! It's an implementation detail of the statistics module (to be specific, its test suite), but it covers both relative and absolute error tolerances and handles infinities and NANs. https://hg.python.org/cpython/file/1b145e8ae4be/Lib/test/test_statistics.py#... The default tolerances are more or less plucked out of thin air and probably should be discussed.

Ideally it should also handle ULP comparisons: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-number...

Unfortunately a naive ULP comparison has trouble with NANs, INFs, and numbers close to zero, especially if they have opposite signs. The smallest representable denormalised floats larger, and smaller, than zero are:

5e-324
-5e-324

These are the smallest magnitude floats apart from zero, so we might hope that they are considered "close together", but they actually differ by 9223372036854775808 ULP. Ouch. I have some ideas for dealing with that, and if anyone is interested I'm happy to talk about it, but they're not ready for production yet.

I think that Bruce Dawson is right. Floating point comparisons are hard, really hard. I know that I've still got a lot to learn about it. I can think of at least five different ways to compare floats for equality, and they all have their uses:

- exact equality using ==
- absolute error tolerances
- relative error tolerances
- ULP comparisons
- the method unittest uses, using round()

I'm explicitly including == because it is a floating point superstition that one should never under any circumstances compare floats for exact equality. As general advice, "don't use == unless you know what you are doing" is quite reasonable, but it's the "never use" that turns it into superstition. As Bruce Dawson says, "Floating-point numbers aren't cursed", and throwing epsilons into a problem where no epsilon is needed is a bad idea. https://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-...

Aside: I'm reminded of APL, which mandates fuzzy equality (i.e. with a tolerance) of floating point numbers:

    In an early talk Ken [Iverson] was explaining the advantages of tolerant
    comparison. A member of the audience asked incredulously, "Surely you
    don't mean that when A=B and B=C, A may not equal C?" Without skipping
    a beat, Ken replied, "Any carpenter knows that!" and went on to the next
    question. - Paul Berry

-- Steve
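A rough sketch of the shape such a function takes -- pass if either the absolute or the relative tolerance is satisfied, and never call NANs close. This is only an illustration of the approach described, not the actual test-suite code, and the default tolerances here are placeholders:

    import math

    def approx_equal(x, y, tol=1e-12, rel=1e-7):
        if math.isnan(x) or math.isnan(y):
            return False               # a NAN is never close to anything
        if math.isinf(x) or math.isinf(y):
            return x == y              # an infinity is close only to the same infinity
        if x == y:
            return True
        delta = abs(x - y)
        return delta <= tol or delta <= rel * max(abs(x), abs(y))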

On Tue, Jan 13, 2015 at 1:34 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Only with a naive (i.e., wrong :-) implementation. Those two floats differ by precisely 2 units in the last place, and any correct implementation should report that. It's not hard to write code that deals correctly with opposite signs. Here's a simple difference_in_ulps function that correctly reports the number of ulps difference between any two finite floats.
>>> import struct
>>> def to_ulps(x):
...     n = struct.unpack('<q', struct.pack('<d', x))[0]
...     return -(n + 2**63) if n < 0 else n
...
>>> def difference_in_ulps(x, y):
...     return abs(to_ulps(x) - to_ulps(y))
...
>>> difference_in_ulps(-5e-324, 5e-324)
2

This is almost exactly what's in Lib/test/test_math.py already, except that the function there is better documented and uses "~(n + 2**63)" instead of "-(n + 2**63)" in the negative n correction branch, which has the effect of regarding 0.0 and -0.0 as 1 ulp apart. Comparing by ulps was what I needed for testing library-quality functions for the math and cmath modules; I doubt that it's what's needed for most comparison tasks. I'd expect the suggested combination of relative error and absolute error to be more appropriate most of the time.

-- Mark

On Tue, Jan 13, 2015 at 09:28:39AM +0000, Mark Dickinson wrote:
Well, I said I was still learning :-) In my defence, I was influenced by Bruce Dawson who said you can't do ulps comparisons if the signs are not the same.
How do you deal with cases where one number is normalised and the other is denormalised? E.g. I think most people would consider these two numbers to be close together:

py> to_ulps(2.2250738585072014e-308) - to_ulps(1.1125369292536007e-308)
2251799813685248

-- Steve

On Jan 13, 2015, at 1:29 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Comparing by ulps was what I needed for testing library-quality functions for the math and cmath modules; I doubt that it's what's needed for most comparison tasks.
That's the conclusion I was coming to. Ulps are likely to be the right way to do it if you're trying to understand/test the accuracy of an algorithm, but not for a general "did I get a close enough result?". And it would be a lot harder to understand for most of us. As for comparing to zero -- in reading about this, it seems there simply is no general solution -- only the user knows what they want. So the only thing to do is put a big warning in the docs about it, and provide an absolute tolerance option. Should that be a separate function or a flag? This is actually a harder problem for numpy, as it's an array function, so you need to have the same function/parameters for every value in the array, some of which may be near zero. I haven't thought it out yet, but maybe we could specify an absolute tolerance near zero, and a relative tolerance elsewhere, both at once. Tricky to document, even if possible.
I'd expect the suggested combination of relative error and absolute error to be more appropriate most of the time.
And most of the time is what we are going for. -Chris

On 01/13/2015 09:53 AM, Chris Barker - NOAA Federal wrote:
Doesn't this problem come up at any boundary comparison, and not just zero? So isn't the issue about any n distance from any floating point number that is less than 1 ulp? And in that regard, comparison to zero is no different than any comparison to any other floating point value? Just trying to follow along, -Ron

Zero is special because you lose the ability to use a relative tolerance. Everything is huge compared to zero.
So isn't the issue about any n distance from any floating point number that is less than 1 ulp?
I'm still a bit fuzzy on ulps, but it seems the goal here is to define a tolerance larger than an ulp. This is for the use case where we expect multiple rounding errors -- many more than one ulp. That's why I think the use case for ulp comparisons is more about assessment of the accuracy of algorithms than "did I introduce a big old bug?" or "is this computed value close enough to what I measured?" -Chris

On 01/13/2015 09:25 PM, Chris Barker - NOAA Federal wrote:
On Jan 13, 2015, at 6:58 PM, Ron Adam<ron3200@gmail.com> wrote:
Doesn't this problem come up at any boundary comparison, and not just zero?
Zero is special because you lose the ability to use a relative tolerance. Everything is huge compared to zero.
After I posted I realised that when you compare anything you subtract what you are comparing to, and if the result is equal to zero, then it's equal to what you are comparing to. So testing against zero is fundamental to all comparisons -- is this correct? Wouldn't a relative tolerance be set relative to some value that is *not* zero? And then used when comparing any close values, including zero.
I haven't looked into the finer aspects of ulps myself. It seems to me ulps only matter if the exponent part of two floating point numbers is equal and the value part is within 1 (or a small few) ulps of each other. Then there may be problems determining if they are equal, or one is greater or less than the other. And only if there is no greater tolerance value set. A couple of thoughts come to mind; I'm not sure if they are relevant though. I think for math algorithms that are hard coded in a program, it isn't much of an issue, as the author would have a feel for the size of a delta if needed, would output an appropriate number of significant digits, and can calculate the error range if needed as well. That probably fits most situations and is what is typically done. It seems to me that automated tracking and/or use of these things may be wanted with equation solvers. The solver would determine the delta and significant digits from its initial data. That sounds like it could get very complex, but maybe making this easier to do is the point? Cheers, Ron

On Jan 13, 2015, at 21:12, Ron Adam <ron3200@gmail.com> wrote:
I think you're missing the meanings of these terms. Absolute tolerance means a fixed tolerance, like 1e-5; relative tolerance means you pick a tolerance that's relative to the values being compared--think of it as a percentage, say, 0.01%. Plenty of values are within +/- 1e-5 of 0. But the only value within +/- 0.01% of 0 is 0 itself. Another way to look at it: x is close to y with absolute tolerance 1e-5 if abs(x-y) < 1e-5. x is close to y with relative tolerance 1e-5 if abs(x-y)/y < 1e-5. So, no value is ever close to 0 within any relative tolerance.
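Spelled out as code (with hypothetical helper names), the two notions look like this:

    def close_abs(x, y, atol=1e-5):
        # Absolute tolerance: a fixed-width window around y.
        return abs(x - y) <= atol

    def close_rel(x, y, rtol=1e-5):
        # Relative tolerance: a window whose width scales with the magnitude of y.
        return abs(x - y) <= rtol * abs(y)

    print(close_abs(1e-7, 0.0))   # True  -- plenty of values are within 1e-5 of zero
    print(close_rel(1e-7, 0.0))   # False -- nothing but zero is relatively close to zero
    print(close_rel(0.0, 0.0))    # True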
No, this is wrong. First, 1.1e1 and 1.0e10 (binary) are only off by 1 ulp even though they have different exponents. Second, there's never any problem determining if two finite numbers are equal or which one is greater.

The issue is determining whether two numbers +/- their error bars are too close to call vs. unambiguously greater or lesser. For example, if I get 1.0e1 and 1.1e1 (assuming 1 bit mantissa for ease of discussion), the latter is clearly greater--but if I have 2 ulp of error, or an absolute error of 1e1, or a relative error of 50%, the fact that the latter is greater is irrelevant--each value is within the other value's error range. But if I get 1.0e1 and 1.0e10 with the same error, then I can say that the latter is unambiguously greater.

I think Chris is right that ulp comparisons usually only come up in testing an algorithm. You have to actually do an error analysis, and you have to have input data with a precision specified in ulp, or you're not going to get a tolerance in ulp. When you want to verify that you've correctly implemented an algorithm that guarantees to multiply input ulp by no more than 4, you can feed in numbers with +/- 1 ulp error (just typing in decimal numbers does that) and verify that the results are within +/- 4 ulp. But when you have real data, or inherent rounding issues, or an error analysis that's partly made up of rules of thumb and winging it, you're almost always going to end up with absolute or relative error instead. (Or, occasionally, something more complicated that you have to code up manually, like logarithmic relative error.)

On 01/14/2015 12:10 AM, Andrew Barnert wrote:
I see your point about the exponent. To be clear, are we referring to significant digits here, or ...

1.0000...1e1 and 1.0000...0e10

Where the ... is a lot of zeros, out to the maximum digits Python's floating point can handle on whatever platform it's running on. What I've read indicates ULP usually refers to the limits of the implementation/device. Significant digits has more to do with error of measurements (and estimates), while ULPs is about accuracy limits of the hardware/software's ability to calculate.
You would use the larger of the three. And possibly give a warning if the 2 ulp error is the largest (if the application is set to do so). I'm presuming the 2 ulp is twice the limit of the floating point precision here.

50%   accuracy of data
1e1   limit of significant digits/measurement
2 ulp twice the floating point unit of least precision
Yes, this was the point I was alluding to earlier.
If the algorithm doesn't track error accumulation, then yes. This is interesting but I'm going to search for some examples of how to use some of this. I'm not sure I can add to the conversation much, but thanks for taking the time to explain some of it. Cheers, Ron

On Jan 14, 2015, at 1:08, Ron Adam <ron3200@gmail.com> wrote:
No, those are pretty far apart. We're referring to 1.1111…1e1 and 1.0000…e10 I think your confusion here is entirely my fault. For simplicity, it's often helpful to look at tiny float representations--e.g., a 4-bit float with 1 sign bit, 1 mantissa bit, and 2 exponent bits (that, if both 1, mean inf/nan), because writing 51 0's tends to obscure what you're looking at. But it's pretty stupid to do that without mentioning that you're doing so, or how it extends to larger representations if non-obvious, and I think I was just that stupid.
What I've read indicates ULP usually refers to the limits of the implementation/device.
Yes, it means "Unit of Least Precision" or "Unit in Least Place". There are a few ways to define this, but one definition is: Ignoring zeroes and denormals, two numbers are 1 ulp apart if they're finite and have the same sign and either (a) they have the same exponent and a mantissa that differs by one, or (b) they have an exponent that differs by one, the smaller one has the max mantissa and the larger the min mantissa. For zeroes, you can either define pos and neg zero as 1 ulp from each other and from the smallest denormal of the same sign, or as 0 ulp from each other and both 1 ulp from the smallest denormal of either sign.
Significant digits has more to do with error of measurements, (and estimates), while ULPs is about accuracy limits of the hardware/software ability to calculate.
And, importantly, to represent your values in the first place. If you have a value that's, say, exactly 0.3 (or, for that matter, 0.3 to 28 significant digits), +/- 1 ulp is larger than your measurement error, but it's the minimum error range you can store.
You use whichever is/are relevant to your error analysis and ignore the others. (Technically I guess you could say you're just using 0 as the error for the two you don't care about, and then it's guaranteed that the one you do care about is largest, but I don't think that's the way you'd normally think about it.) Also, you have to be careful about how that extends to inequality.
I presuming the 2 ulp is twice the limit of the floating point precision here.
Yes, 2 ulp means off by 2 units of least precision. Of course for binary, that actually means off by 1 in the unit of penultimate precision.

On 01/14/2015 04:18 PM, Steven D'Aprano wrote:
Speaking of which: https://pypi.python.org/pypi/bigfloat/0.3.0 I know it's /called/ bigfloat, but it /looks/ like you can specify however many/few bits you want. ;) -- ~Ethan~

On Wednesday, January 14, 2015 4:18 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I think a 4-bit float is a bit *too* mini, 8-bit sounds about right.
Anything up to 5 or 6 bits, you can fit a table of all the values on your monitor at decent font size. 8 bits is a bit too much for that. Which reminds me, I found an old blog post where I used such a 6-bit (1+3+2) format to demonstrate all the details of IEEE binary floats, including exactly such a table. It's now back online (http://stupidpythonideas.blogspot.com/2015/01/ieee-floats-and-python.html?vi...) if anyone's interested. I had a Binary6 class to go with it—which didn't have the complete float interface, but did have a lot of handy optionals like as_tuple and next_plus, as Decimal does—but I can no longer find that, so I stripped out that section of the blog post. But it still shows how to fiddle with the bits manually using struct plus integer operations or a library like bitstring.
(I know, I know, it should be a third-party library, not a built-in :-)
I don't think it would be very useful as a third-party lib either. But an arbitrary-precision IEEE 754-2008 Binary type, with all the optional features that aren't part of float, akin to decimal.Decimal—that might be useful. And while you're at it (you were volunteering, right?), want to go over 754-2008 vs. 854-1987 and see if there's anything worth adding to Decimal? :)

On Wed, Jan 14, 2015 at 1:08 AM, Ron Adam <ron3200@gmail.com> wrote:
Significant digits has more to do with error of measurements, (and estimates),
Right -- and relative tolerance is kinda-sorta like significant digits, i.e. a relative tolerance of 1e-4 is like saying the same to 4 (decimal) significant digits.

Which brings up the issue with 0.0 -- how many significant digits does 0.0 have? Does 0.000001234 have the same significant digits as 0.0? -- it's not really defined. Whereas it's fairly straightforward to say that 0.000001234 and 0.000001233 are the same to 3 significant digits, but differ in the fourth. And note that:

In [46]: u, v = 0.000001234, 0.000001233
In [47]: err = abs(u-v)
In [48]: err
Out[48]: 9.999999999998634e-10

so the absolute error is less than 1e-9 == pretty "small" -- but is that what we generally want? No.

In [49]: err <= 1e-3*abs(u)
Out[49]: True

but the error is less than a relative tolerance of 1e-3 (three sig figs)

In [50]: err <= 1e-4*abs(v)
Out[50]: False

and greater than a relative tolerance of 1e-4 (not four sig figs).

Again, this all works great when you are away from zero (and maybe you need to handle NaN and inf carefully too), but if we simply put this in the stdlib, then folks may use it with a zero value, and not get what they expect. My thinking now: set a "zero_tolerance", which will default to the relative tolerance, but be user-settable. If one of the input values is zero, then use the zero_tolerance as an absolute tolerance; if not, then use a relative tolerance. I think this would make for fewer surprises, and make it easier to use the same function call for a wide range of values, some of which may be zero.

What I haven't figured out yet is how (or if) to make sure that the transition is continuous -- do we only use zero_tol if one of the values is exactly zero? Or if one or both of the values is less than zero_tol? It seems that if you say, for instance, that 1e-12 is "close" to zero, then 1e-12 should also be "close" to any value less than 1e-12. But if 1e-12 is "close" to 1e-14 (for example), then 1e-12 should probably be "close" to 1.00000000001 also, but it wouldn't be, if we did an abrupt change to relative tolerance for any value >= the zero_tolerance.

So more to think out here -- feel free to chime in. I've been playing with this gist, though it's only a few real lines of code anyway: https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a
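A minimal sketch of that idea -- the parameter names are made up, the switch happens only when an input is exactly zero, and this is not the code in the gist:

    def is_close(actual, expected, rel_tol=1e-8, zero_tol=None):
        # Relative comparison, falling back to an absolute check when either
        # input is exactly zero (where a relative tolerance is meaningless).
        if zero_tol is None:
            zero_tol = rel_tol
        if actual == 0.0 or expected == 0.0:
            return abs(actual - expected) <= zero_tol
        return abs(actual - expected) <= rel_tol * max(abs(actual), abs(expected))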
-Chris

On 01/14/2015 10:44 AM, Chris Barker wrote:
It has as many significant places as your data is measured or estimated to. If your data is entered with four places, then 0.0 is actually 0.0000. And a value displayed as 1.1 that was entered as 1.1000 would still have 4 significant places.
does 0.000001234 have the same significant digits as 0.0 ? -- its not really defined.
Well, if the significant digits/places was 4 places, those would be considered equal -- where 0.000001234 is a value after a calculation is done, and the 0.0 is actually 0.0000. If the significant places used was actually 9, then the 0.0 would be 0.000000000. Python (and most languages) truncates the zeros to the right, so you can't go by what is displayed; it needs to be explicitly set. Then you would use that to determine the error range of two numbers, rounding to significant digits and calculating your +- error. Percentage error ranges are something completely different. You need to calculate that separately, set it based on your data, and also take into account significant digits. And taking into account ULPs is another thing. If ULPs become larger than significant digits or a relative error, then maybe a warning (or error) needs to be issued. Then you can use an alternative algorithm, or modify the algorithm to avoid the warning or error, and get a valid result. And possibly Python should do none of these by default, so they are always explicitly asked-for behaviours.
I think it depends if you are trying to measure the error in your calculation, the error of your answer based on your data accuracy, or the error of the floating point implementation. Each is a different problem. A calculation may use constants that are less precise than what is possible by floating point, but more precise than the data. So in this case, it has a limit above what is introduced by ULPs.
If Python worked more like a calculator and you actually set the significant figures, this would be relevant. But since it doesn't, you need to track significant figures separately and use that value, not just use the represented value, which may have truncated zeros. Then zero is no different from other numbers.
What do you think of having a calculator mode or module, which would display a specified number of significant digits in the output, but keep the entire number internally?

import calc
calc.fix(4)
calc.print(expr)

And possibly have other general features that can be useful in other more specialised modules.
On the parts I think I know.. OK. ;-)
I'll take a look later today. Cheers, Ron

On 01/14/2015 12:04 PM, Ron Adam wrote:
If 0.000001234 has four significant digits, those digits are the ending 1234, not the first four zeros. The leading zeros would only be significant if there was a leadinger non-zero ;) -- and then the zeros would no longer be leading. -- ~Ethan~

On Jan 14, 2015, at 12:04, Ron Adam <ron3200@gmail.com> wrote:
First, why use a "calc" mode instead of just format('.4')? Also, this only works when the number of significant digits is small enough to fit in 52 bits; beyond that, it's misleading. Most calculators do something like show 8 decimal digits and store 10 decimal digits, which doesn't have this problem. And in fact, I think in any case where format isn't sufficient, the decimal module is what you need. (In fact, even with 4 digits it's misleading; two different values can both print identically but there's no function to check that except to stringify them and compare the strings. Calculators that display 8 digits but use 10 for intermediate results usually do comparisons on the 8 digits.) At any rate, switching to decimal makes it easier to think about and see the rounding errors, and eliminates any additional errors caused by going back and forth to decimal strings (and notice inputs are often decimal strings, with precision represented in decimal significant figures). But ultimately you still have the same problems of tracking the accumulation of error and determining the appropriate relative or absolute (or ulp or log2-relative or whatever) tolerance for equality and inequality comparisons.
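For illustration, the decimal module makes the working precision an explicit, settable property of the arithmetic rather than just of the display:

    from decimal import Decimal, getcontext

    getcontext().prec = 4              # four significant digits for arithmetic results
    x = Decimal("1") / Decimal("3")
    print(x)                           # 0.3333 -- the rounding is real, not a display artifact
    print(x * 3)                       # 0.9999 -- and the accumulated error stays visible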

On Tue, Jan 13, 2015 at 08:57:38PM -0600, Ron Adam wrote:
On 01/13/2015 09:53 AM, Chris Barker - NOAA Federal wrote:
[...]
No. A quick refresher on error tolerances... Suppose you have a value which should be exactly 0.5, but you calculate it as 0.51. Then the absolute error is 0.51-0.5 = 0.01, and the relative error is 0.01/0.5 = 0.02 (or 2%). But consider two values, 0.0 and 0.1. Then the absolute error is 0.1-0.0 = 0.1, and the relative error is 0.1/0.0 which is infinite. So 0.0 and -0.0 are problematic when dealing with relative errors.
No. 1 ULP (Unit In Last Place) is the smallest possible difference between two floats. A difference of 0 ULP means the two floats are exactly equal.

How it works: in mathematics, real numbers are continuous, but floats are not. There are only 2**64 floats in Python (a C double), fewer if you ignore the NANs and INFs, which means we can conveniently enumerate them from roughly -(2**63) to (2**63 - 1), based on the internal structure of a float. So if you convert two floats into this enumerated integer value (which is equivalent to doing a type-cast from a C double to a C 64-bit integer) and subtract the two ints, this gives you a measure of how far apart they are. (As Mark mentioned earlier, you have to make allowance for negative floats; INFs and NANs are problematic too.)

If two values are exactly equal, their "distance apart" in ULP will be zero. A distance of 1 ULP means they are consecutive floats: they cannot possibly be any closer without being equal. A distance of 2 ULP means there is only a single float separating them, and so on.

Note that ULP do not directly correspond to a numeric tolerance. For example, these pairs of values are each 1 ULP apart:

0.0 and 5e-324
1.0 and 1.0000000000000002
1e300 and 1.0000000000000002e+300

So in these three cases, 1 ULP represents numeric differences of:

0.00000000000000000000...00005
0.0000000000000002
2000000000000000000000...000.0

respectively.
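Those distances can be checked directly with the bit-twiddling trick Mark posted earlier in the thread:

    import struct

    def to_ulps(x):
        # Reinterpret the float's bits as an integer so that consecutive
        # floats map to consecutive integers (Mark's function from above).
        n = struct.unpack('<q', struct.pack('<d', x))[0]
        return -(n + 2**63) if n < 0 else n

    print(to_ulps(5e-324) - to_ulps(0.0))              # 1
    print(to_ulps(1.0000000000000002) - to_ulps(1.0))  # 1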
Just trying to follow along,
A good resource is Bruce Dawson's blog RandomASCII, if you don't mind the focus on C++. Start here: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-number... -- Steve

On 01/14/2015 04:39 AM, Steven D'Aprano wrote:
I'm not sure why, but it seems like something here is out of place to me. Either it's not something that would come up, or it's something you would expect and handle differently. Or it would be an error, signalling that you need to handle it differently.
A difference of 0 ULP means they *may* be exactly equal. The calculation/representation just can't resolve to a finer amount, so two figures less than 1 ULP apart can get the same floating point representation. The two numbers still have a +- error. Possibly +-.5 ULP. ie... potentially 1 ULP apart.
Thanks, :) Ron

On Wed, Jan 14, 2015 at 03:18:17PM -0600, Ron Adam wrote:
On 01/14/2015 04:39 AM, Steven D'Aprano wrote:
[...]
You're going to have to explain what you think is "out of place". Of course doing division by zero is an error. The whole point of this discussion is that you cannot talk about errors relative to zero. At zero, you can still have an absolute error, but you cannot have a relative error. The definition of the error between two numeric values:

def abserr(a, b):
    return abs(a - b)

def relerr(a, b):
    return abs(a - b)/min(abs(a), abs(b))

If either a or b is zero, relerr() fails. This is a mathematical limitation, not a bug to be fixed.
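A quick demonstration of that limitation, reusing the definitions above:

    def relerr(a, b):
        return abs(a - b) / min(abs(a), abs(b))

    print(abs(0.0 - 0.1))    # 0.1 -- the absolute error at zero is perfectly well defined
    print(relerr(0.0, 0.1))  # raises ZeroDivisionError -- the relative error is not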
I'm referring specifically to errors between two floats, not between a float and a theoretical exact real number. Sorry if that was not clear. E.g. there are two consecutive floats, 5e-324 and 1e-323, which are 1 ULP apart. But there is no such thing as a float for 8e-324. If you have some pencil and paper calculation which comes to 8e-324, then the error between that real value and either of those two floats is about 0.5 ULP. -- Steve

On 01/14/2015 06:04 PM, Steven D'Aprano wrote:
I would probably write it like this:

def relerr(actual, expected):
    return abs(actual - expected)/expected

Where expected is a non-zero reference value. It needs a third value. In the above, the expected value is the scaler. And then use it this way to measure a resistor's tolerance:

if relerr(218.345, 220) <= 0.05:
    print("Good")
else:
    print("Bad")

Note that you would never compare to an expected value of zero.

relerr(a - b, expected_feet) < tolerance   # relative feet from b
relerr(a - 0, expected_feet) < tolerance   # relative feet from zero
relerr(a - b, ulp)                         # percentage of ulp's

What Chris is looking for is a way to get a closeness function that works most of the time. (He posted while I'm writing this.) But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.

def is_close(a, b, unit, good):
    return (abs(a - b) / unit) <= good

is_close(218.345, 220, 1, .05)   # OHMs
is_close(a, b, ULP, 2)           # ULPs
is_close(a, b, AU, .001)         # astronomical units

I don't see any way to generalise those with just a function. By using objects we can do a bit more. I seem to recall coming across measurement objects some place; they keep a bit more context with them.

Cheers, Ron

On Jan 14, 2015, at 18:13, Ron Adam <ron3200@gmail.com> wrote:
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
You can use an interval class to represent your measurements with error bars, in which case many problems either go away, or become too obvious to ignore. In particular, for simple cases at least, you get error analysis for free, and you can just check whether the final interval includes the expected value (or overlaps the expected interval, if there are error bars on the expected result too). However, finding an interval math library that handles all of the mathematical operations you want for a particular application may not be easy. You might have to ctypes or wrap up a C or C++ library instead. (I'm just guessing, but I suspect more people have written wrappers for such libs to work with numpy/scipy than with native Python scalars.)

On Jan 14, 2015, at 18:27, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
However, finding an interval math library that handles all of the mathematical operations you want for a particular application may not be easy. You might have to ctypes or wrap up a C or C++ library instead. (I'm just guessing, but I suspect more people have written wrappers for such libs to work with numpy/scipy than with native Python scalars.)
I found a few things on PyPI that look promising, but not a single one of them installs and runs for me on either 2.7 or 3.4. But I was able to fix up the first one pretty quickly; see https://github.com/abarnert/pyinterval for a fork. You'll need to install crlibm first, but that's trivial on at least Mac/Homebrew and Fedora, so I'm guessing most other *nix systems. One big thing it's missing is pow (and, in fact, all bivariate functions). Of course you can always define it as exp(log(base) * exponent), but IIRC that gives much wider error bars than a proper interval power function.

On Jan 14, 2015, at 6:14 PM, Ron Adam <ron3200@gmail.com> wrote:
Hmm, there is something to be said about making it clear which is "expected" -- then that value clearly provides the scale. And you can document that it can't be zero, and raise an Exception if it is.
it needs a third value.
Well, a tolerance, yes.
I don't see the advantage of this at all over a simple function API:

is_close(actual, expected, tolerance)

(probably with a default tolerance)
Note that you would never compare to an expected value of zero.
Yup -- I like how you can make that clear. Though sometimes you really do expect zero. Maybe no need for the same API for that, or an API at all.
relerr(a - b, expected_feet) < tolerance # relative feet from b
I'm still confused -- why bother with the relerr function at all?
relerr(a - b, ulp) # percentage of ulp's
I don't think you can define an ulp like that.
What Chris is looking for is a way to get a closeness function that works most of the time.
Exactly.
But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.
Yes, you need a tolerance, sure. And the context comes from the values you are comparing. The only problem is what to do with zero. (And maybe really near zero).
I don't get why it helps to introduce units here. I don't think that's the point (this would all be equally valid with unitless quantities). And if you want a scale for relative tolerance that is independent of the values you are checking, that's essentially an absolute tolerance. I'm not sure it's worth burying that simple calculation in a function.
I don't see anyway to generalise those with just a function.
Didn't I just do that? Except for the ulps -- not sure what to do there; an ulp is a different way to define a delta. Unless you mean units-aware, but that's really not what this is all about.
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
Sure, there are various unit/physical quantity objects out there, many numpy-based. But I think that is orthogonal to comparing floating point numbers. And a simple floating point comparison function _may_ belong in the standard library, but not a more complex physical quantity system. -Chris

On 01/14/2015 11:19 PM, Chris Barker - NOAA Federal wrote:
This is where a good but realistic example of the issue would help. Something you may actually see in a specific use case.
def is_close(a, b, unit, good):
    return (abs(a - b) / unit) <= good

But I also think of these sorts of things as building blocks for those more complex systems, so it may help to consider how they may use these blocks.
I don't get why it helps to introduce units here. I don't think that's point (this would all be equally valid with unit less quantities).
I agree. The returned value I was thinking of should have been:

(abs(a - b) / b) / units <= good

Or something like that. The units here just scale the result. Without that you need to either scale the inputs, or scale the tolerance. In most cases when you are entering a tolerance, you will also be working with some multiple of units. But it seems that can confuse as much as it helps. In any case, I was just trying to come up with some practical example, and at the same time was thinking about how different units of measure can affect it.
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
Yes, for the most part I agree, but I think such functions would be useful in those more complex systems as well. Ron

On Jan 14, 2015, at 11:24 PM, Ron Adam <ron3200@gmail.com> wrote:
(abs(a - b) / b) / units <= good
If units and good are positive, and b is negative, this fails -- you've got to use abs everywhere. But I'm sure both Steven's and my code do that.
Without that you need to either scale the inputs, or scale the tolerance.
That's why we want a relative tolerance -- that scales the tolerance for you -- that's the whole point.
In most cases when you are entering a tolerance, you will also be working with some multiple of units. But it seems that can confuse as much as it helps.
Again, relative tolerance takes care of that for you -- two values will have the same relative error whether they are stored in angstroms or meters: the exponent will be wildly different, but the mantissas will be the same, so the relative error doesn't change. If you want to compare values in meters according to a tolerance in angstroms, then you a) want an absolute tolerance, and b) want a units system.
In any case, I was just trying to come up with some practical example, and at the same time was thinking about how different units of measure can effect it.
is_close(computed_value, expected_value, tolerance) really does what you want, provided that the inputs are in the same units, and neither is zero. Tolerance is unitless. I'll take a look at Steven's code; it sounds like it's very much like where I was trying to go. -Chris

On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:
On 01/14/2015 06:04 PM, Steven D'Aprano wrote:
[...]
No, the function doesn't require a third value. It just returns the relative error between two values. What you do with that error is up to you:

- compare it to an acceptable amount of error, as you do below
- print it
- add it to a list of error amounts
- etc.

Of course, you could have other functions which do more, like decide whether an error is "small enough", but that is out of scope for a function which only calculates the error.
# using your implementation
if relerr(actual=200.0, expected=-5.0) < 0.05:
    print("oops")

Errors, by definition, are non-negative. You have to take care to ensure that the error value returned is non-negative.

The question of which to use as the denominator is more subtle. Like you, I used to think that you should choose ahead of time which value was expected and which was actual, and divide by the actual. Or should that be the expected? I could never decide which I wanted: error relative to the expected, or error relative to the actual? And then I could never remember which order the two arguments went.

Finally I read Bruce Dawson (I've already linked to his blog three or four times) and realised that he is correct and I was wrong. Error calculations should be symmetrical, so that

error(a, b) == error(b, a)

regardless of whether you have absolute or relative error. Furthermore, for safety you normally want the larger estimate of error, not the smaller: given the choice between

(abs(a - b))/abs(a)

versus

(abs(a - b))/abs(b)

you want the *larger* error estimate, which means the *smaller* denominator. That's the conservative way of doing it.

A concrete example: given a=5 and b=7, we have:

absolute error = 2
relative error (calculated relative to a) = 0.4
relative error (calculated relative to b) = 0.286

That is, b is off by 40% relative to a; or a is off by 28.6% relative to b. Or another way to put it, given that a is the "true" value, b is 40% too big; or if you prefer, 28.6% of b is in error. Whew! Percentages are hard! *wink*

The conservative, "safe" way to handle this is to just treat the error function as symmetrical and always report the larger of the two relative errors (excluding the case where the denominator is 0, in which case the relative error is either 100% or it doesn't exist). Worst case, you may reject some values which you should accept, but you will never accept any values that you should reject.
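A tiny sketch of the symmetric, conservative version described above: dividing by the smaller magnitude always gives the larger of the two possible relative errors.

    def symmetric_relerr(a, b):
        # symmetric_relerr(a, b) == symmetric_relerr(b, a); the smaller
        # denominator yields the more conservative (larger) error estimate.
        return abs(a - b) / min(abs(a), abs(b))

    print(symmetric_relerr(5, 7))   # 0.4
    print(symmetric_relerr(7, 5))   # 0.4 -- same result either way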
Note that you would never compare to an expected value of zero.
You *cannot* compare to an expected value of zero, but you certainly can be in a situation where you would like to: math.sin(math.pi) should return 0.0, but doesn't, it returns 1.2246063538223773e-16 instead. What is the relative error of the sin function at x = math.pi?
I don't understand what you think these three examples are showing.
What Chris is looking for is a way to get a closeness function that works most of the time. (He posted while I'm writing this.)
I think the function I have in the statistics test suite is that function. I would like to see ULP calculations offered as well, but Mark thinks that's unnecessary and I'm not going to go to the battlements to fight for ULPs.
But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.
By the way, it is spelled "scalar", and all that means is that it is a number, like 23, not a vector or array or matrix.
def is_close(a, b, unit, good): return (abs(a - b) / unit) <= good
Take a look at the statistics test suite. I'll be the first to admit that the error tolerances are plucked from thin air, based on what I think are "close enough", but they show how such a function might work:

* you provide two values, and at least one of an absolute error tolerance and a relative error;
* if the error is less than the error(s) you provided, the test passes, otherwise it fails;
* NANs and INFs are handled appropriately.
Generalise in what way?
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
A full system of <value + unit> arithmetic is a *much* bigger problem than just calculating error estimates correctly, and should be a third-party library before even considering it for the std lib. -- Steve

On Wed, Jan 14, 2015 at 11:29 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I'm on the fence about this -- it seems clear to me that if the user has specified an "expected" value, that tolerance would clearly be based on that magnitude. If nothing else, that is because you would more likely have many computed values for each expected value than the other way around. And I was thinking that calling the arguments something like "actual" and "expected" would make that clear in the API, and would certainly be documentable. But the fact that it was never clear to you, even as you were writing the code, is good evidence that it wouldn't be clear to everyone ;-)

Error calculations should be symmetrical, so that error(a, b) == error(b, a)

That does make it simpler to think about and reason about, and makes the use-case more universal (or at least appear more universal). "Are these two values close?" is a simple question to ask, if not to answer.

regardless of whether you have absolute or relative error. Furthermore, for safety you normally want the larger estimate of error, not the smaller ...

Which is what the Boost "strong" method does -- rather than compute the max and use that, it computes both and does an "and" check -- but same result.

A concrete example: given a=5 and b=7, we have:

The thing is, in general, we use this to test for small errors, with low tolerance. Which value you use to scale only makes a big difference if the values are far apart, in which case the error will be larger than the tolerance anyway. In your above example, if the tolerance is, say, 1%, then it makes no difference which you use -- you are way off anyway. And in the common use cases, comparing a double precision floating point calculation, tolerances are more likely to be around 1e-12, not 1e-2 anyway! So I think that which relative tolerance you use makes little difference in practice, but it might as well be robust and symmetrical. (Another option is to use the average of the two values to scale the tolerance, but why bother?)
There isn't one -- that's the whole point -- but there is an absolute error, so that's what you should check. We all agree a relative error involving zero is not defined / possible. So the question is what to do?

1) Raise a ValueError
2) Let it return "not close" regardless of the other input -- that's mathematically correct, nothing is relatively close to zero.
3) Automagically switch to an absolute tolerance near zero -- with the user specifying what it should be.

It seems the implementations (Boost's, for instance) I've seen simply do (2). But if the point of putting this in the standard library is that people will have something that can be used for common use cases without thinking about it, I think maybe (1) or (3) would be better. Probably (3), as raising an Exception would make a mess of this if it were inside a comprehension or something.
I'll take a look -- it does sound like you've already done pretty much what I have in mind.
I suppose it could be added later -- I agree that it could be pretty useful, but that it's also much harder to wrap your brain around, and really for a different use-case.
Is this different from the numpy implementation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html ? In that (according to the docs, I haven't looked at the code):

"""
The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
"""

I think it should either be:

- you specify atol or rtol, but only one is used, or
- some way to transition from a relative tolerance to an absolute one near zero -- I haven't figured out if that can be done smoothly yet.

(Also, it looks like numpy computes the tolerance from the second input, rather than looking at both, resulting in an asymmetric result -- discussed above.)

I've always thought the numpy approach is weird, but now that I think about it, it would be really horrible (with defaults) for small numbers: rtol defaults to 1e-5, atol to 1e-8 -- too big, I think, but not the point here.

In [23]: a, b = 1.1, 1.2
In [24]: np.allclose(a,b)
Out[24]: False

## that's good, they are pretty far apart

In [27]: a, b = 1.1e15, 1.2e15
In [28]: np.allclose(a,b)
Out[28]: False

# same thing for large values -- still good.

In [25]: a, b = 1.1e-15, 1.2e-15
In [26]: np.allclose(a,b)
Out[26]: True

OOPS! this is NOT what most people would expect!!

In [30]: np.allclose(a,b, atol=0.0)
Out[30]: False

There we go. But with a default atol as large as 1e-8, this is a really bad idea! I can only imagine whoever wrote this was thinking about really large values, but not really small values... (I think this has been brought up in the numpy community, but I'll make sure)

-Chris

On 01/15/2015 01:29 AM, Steven D'Aprano wrote:
On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:
Ewww the P word. :)
Consider two points that are a constant distance apart, but moving relative to zero. Their closeness doesn't change, but the relative error with respect to each other (and zero) does change. There is an implicit assumption that the number system used and the origin the numbers are measured from are chosen and relate to each other in some expected way. Whenever you supply all the numbers, like in a test, it's not a problem; you just give good numbers.
A percentage of an expected distance. Error of two points compared to a specific distance.

>>> relerr(5 - -5, 10)
0.0

I think unless you use decimal, the ulp example will either be zero or some large multiple of ulp.
Take a look at the statistics test suite.
I definitely will. :-)
I meant a function that would work in many places without giving some sort of size and tolerance hints. Given two floating point numbers and nothing else, I don't think you can tell if they represent something that is close without assuming some sort of context. At best, you need to assume that the distance from zero and the numbers used are chosen to give a meaningful return value. While that can sometimes work, I don't think you can depend on it.
Yes, I agree. There are a few of them out there already. Cheers, Ron

On Thu, Jan 15, 2015 at 9:42 AM, Ron Adam <ron3200@gmail.com> wrote:
Not sure what "location of two points relative to zero" means. Or if the numbers straddle zero? Good catch -- at least if the numbers are symmetric about zero, i.e. a == -b, then they will never be "close" according to our working definition of relative tolerance:

a == -b
err == 2*a
abs_tol = rel_tol * a    # rel_tol is small and positive

abs_tol is always smaller than a, so always smaller than the error, so never "close". In fact, this is true for any numbers that straddle zero:

err == abs(a - b)
abs_tol = rel_tol * max(abs(a), abs(b))
abs_tol <= err    # provided 0 < rel_tol < 1

So I was thinking the issues were near zero, but they actually arise any time:

a <= 0 <= b   (or b <= 0 <= a)

that is, any time the values straddle or include zero. In this case, you need an absolute tolerance defined. Still figuring out how to use that and provide a smooth transition... Steven, does your code address this somehow? -Chris
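A concrete check of that argument, with arbitrarily chosen numbers:

    rel_tol = 1e-8
    a, b = -1e-9, 1e-9                        # values that straddle zero
    err = abs(a - b)                          # 2e-09
    tol = rel_tol * max(abs(a), abs(b))       # 1e-17
    print(err <= tol)                         # False -- never "close" by a purely relative test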

On 01/15/2015 01:45 PM, Chris Barker wrote:
A number relative to zero is its value. A number relative to another number is its distance from the other number. If we divide that by one of the numbers, we are getting the ratio of their distance from each other to that number's distance from zero.

Steven suggests using the closer one as the divisor so you get a bigger result, but I prefer the larger one so I get a result between 0 and 1.

A strategy for numbers of different signs may be to swap the sign on the smaller and add the smaller to the larger... err.. that isn't clear at all. How about this?

def is_close(x, y, delta=.001):
    """ Check is x and y are close to each other. """
    if x == y:
        return True           # Can't get smaller than this.
    if x == 0 or y == 0:
        return False          # Nothing is close to zero.
    if abs(x) < abs(y):       # Make x the larger one.
        x, y = y, x
    if x * y < 0:             # Signs differ.
        x = x - 2.0 * y       # Move x and y same distance.
        y = -y
    return (x - y)/float(x) <= delta

Cheers, Ron

On 01/15/2015 03:47 PM, Ron Adam wrote:
Remove the check of one being zero, it isn't needed.

def is_close(x, y, delta=.001):
    """ Check is x and y are close to each other. """
    if x == y:
        return True           # Can't get smaller than this.
    if abs(x) < abs(y):       # Make x the larger one.
        x, y = y, x
    if x * y < 0:             # Signs differ.
        x = x - 2.0 * y       # Move x and y same distance.
        y = -y
    return (x - y)/float(x) <= delta

If one of them is zero, then you get a ratio of one. So a delta of 1 would mean everything is close. -Ron

On 01/15/2015 04:10 PM, Neil Girdhar wrote:
The problem with this is that if your "actual" is say, 5, then a large enough "estimate" will always be close.
def is_close(x, y, delta=.001):
    """ Check is x and y are close to each other. """
    if x == y:
        return True           # Can't get smaller than this.
    if abs(x) < abs(y):       # Make x the larger one.
        x, y = y, x
    if x * y < 0:             # Signs differ.
        x = x - 2.0 * y       # Move x and y same distance.
        y = -y
    return (x - y)/float(x) <= delta

Do you mean the y value? Some tests:

is_close(0, 0, delta=0.001) --> True
is_close(0, 1, delta=0.001) --> False
is_close(1, 0, delta=0.001) --> False
is_close(1000, 999, delta=0.001) --> True
is_close(999, 1000, delta=0.001) --> True
is_close(100, 95, delta=0.001) --> False
is_close(95, 100, delta=0.001) --> False
is_close(inf, 1, delta=0.001) --> False
is_close(5, 3, delta=0.001) --> False
is_close(5, 4.999, delta=0.001) --> True

is_close(0, 0, delta=0.2) --> True
is_close(0, 1, delta=0.2) --> False
is_close(1, 0, delta=0.2) --> False
is_close(1000, 999, delta=0.2) --> True
is_close(999, 1000, delta=0.2) --> True
is_close(100, 95, delta=0.2) --> True
is_close(95, 100, delta=0.2) --> True
is_close(inf, 1, delta=0.2) --> False
is_close(5, 3, delta=0.2) --> False
is_close(5, 4.999, delta=0.2) --> True

Or do you mean this?

is_close(0, 0, delta=5) --> True
is_close(0, 1, delta=5) --> True
is_close(1, 0, delta=5) --> True
is_close(1000, 999, delta=5) --> True
is_close(999, 1000, delta=5) --> True
is_close(100, 95, delta=5) --> True
is_close(95, 100, delta=5) --> True
is_close(inf, 1, delta=5) --> False
is_close(5, 3, delta=5) --> True
is_close(5, 4.999, delta=5) --> True

But it's probably the best you can do without adding a reference length. A 1 is 100 percent the distance of the larger to 0, so everything except infinity is smaller. A 5 is 500 percent, which is harmless as it's the same as 100 percent. The smaller of the two will be located between the larger amount and zero. It doesn't check for a negative delta; that should probably be an error.

Cheers, Ron

On 01/15/2015 04:10 PM, Neil Girdhar wrote:
The problem with this is that if your "actual" is say, 5, then a large enough "estimate" will always be close.
Actually the problem is with numbers of different signs. The idea of normalising them in some way only works partially, and it doesn't fit mathematically. It seems to me if the signs are different, then relatively speaking, they are always far apart. Ron

You might be interested in my question: http://stackoverflow.com/questions/4028889/floating-point-equality-in-python

On Wed, Jan 14, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
You might be interested in my question:
http://stackoverflow.com/questions/4028889/floating-point-equality-in-python
Nothing new there, I'm afraid -- and no one seemed to have brought up the issue with zero. -Chris

The point is that this function is already in Python, and if you want to do something different, you should have a really good reason to do it differently. If you were to add a function to math, say math.close, it should work like numpy.allclose in my opinion. For reference, numpy does this:

absolute(a - b) <= (atol + rtol * absolute(b))

where atol is an absolute tolerance and rtol is a relative tolerance (relative to the actual value b). This subsumes most of the proposals here.

Best, Neil
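As a plain-Python sketch of that check (the function name is made up; the rtol and atol defaults follow the numpy defaults mentioned earlier in the thread):

    def numpy_style_isclose(a, b, rtol=1e-5, atol=1e-8):
        # Asymmetric: b is treated as the reference value, and the atol term
        # dominates once rtol * abs(b) becomes much smaller than atol.
        return abs(a - b) <= atol + rtol * abs(b)

    print(numpy_style_isclose(1.1, 1.2))          # False
    print(numpy_style_isclose(1.1e-15, 1.2e-15))  # True -- atol swamps the relative term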

On Thu, Jan 15, 2015 at 1:52 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
The point is that this function is already in Python
I don't think something being in an external package means that we have to do it the same way in the stdlib -- even a widely used and well regarded package like numpy. And I say this as someone who has "import numpy" in maybe 90% of my python files. Maybe we should be careful to give it a very distinct name, however, to avoid confusion.
and if you want to do something different, you should have a really good reason to do it differently.
I'm not sure I agree, but we do in this case anyway. The truth is, while really smart people wrote numpy, many of the algorithms in there did not go through nearly the level of review currently required for the Python standard library.
Adding atol in there "takes care of" the near-zero and straddling-zero issue (I suspect that's why it's done that way), but it is fatally wrong for values much less than 1.0 -- the atol totally overwhelms the rtol. See my post earlier today. -Chris

On Thu, Jan 15, 2015 at 3:31 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
You can always disable atol by setting atol to zero. I really don't see what's wrong with their implementation.
1) the default should be zero in that case; having a default close to the rtol default is asking for trouble. 2) if the user has both large and small numbers, there IS no appropriate value for a_tol for all of them. They simply should not be mixed in that way. -Chris
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Thu, Jan 15, 2015 at 6:36 PM, Chris Barker <chris.barker@noaa.gov> wrote:
It's not that close: rtol defaults to 1000 times bigger than atol.
2) if the user has both large and small numbers, there IS no appropriate value for a_tol for all of them.
The relative error is not symmetric, so it's not about having "large and small numbers". Your estimate should not affect the error you tolerate.

On Jan 15, 2015, at 3:47 PM, Neil Girdhar <mistersheik@gmail.com> wrote: On Thu, Jan 15, 2015 at 6:36 PM, Chris Barker <chris.barker@noaa.gov> wrote:
It's not that close: rtol defaults to 1000 times bigger than atol. That's pretty huge if your values are on the order of 1e-100 ;-) Which is my point -- this approach only makes sense for values of order 1 or larger. Then the scaled relative tolerance will be a lot larger, so atol just sets a sort of lower bound on the error. But if the magnitude of your values is small, then the scaled value becomes small, and the atol overwhelms it. This kind of "switch to an absolute tolerance when close to zero" behavior is what I've been looking for, but I don't like how numpy does it. -Chris
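To make that concrete, a quick check with numpy's documented defaults (rtol=1e-5, atol=1e-8); the values are just illustrative:

    import numpy as np

    # For values much smaller than 1, atol swamps the scaled rtol term and
    # everything looks "close", even though the values differ by a factor of 3:
    print(np.isclose(1e-10, 3e-10))   # True
    # The same factor-of-3 difference at magnitude ~1 is rejected as expected:
    print(np.isclose(1.0, 3.0))       # False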
2) if the user has both large and small numbers, there IS no appropriate value for a_tol for all of them.
The relative error is not symmetric, so it's not about having "large and small numbers". Your estimate should not affect the error you tolerate.

On Jan 15, 2015, at 3:31 PM, Neil Girdhar <mistersheik@gmail.com> wrote: absolute(a - b) <= (atol + rtol * absolute(b)) Oh, and if the numbers are small, then adding the absolute tolerance changes the tolerance significantly -- so you don't get what you expect there, either. Chris

You almost always want to use either an absolute tolerance or a relative tolerance. Here's why: Consider your estimated value a and your actual value b. If the estimate is taken to be the mean of a standard Gaussian distribution (the minimum assumptive distribution for a given mean and variance over the reals), then using an absolute tolerance is equivalent to verifying that the probability of observing b is within an interval with sufficient probability. Similarly, if the estimate is taken to be the mean of a standard exponential distribution (the minimum assumptive distribution for a given mean over the positive reals), then using a relative tolerance is equivalent to verifying the same thing. You almost always want one or the other. The symmetric error that people are proposing in this thread has no intuitive meaning to me. Best, Neil On Thu, Jan 15, 2015 at 6:47 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:

On Jan 15, 2015, at 3:58 PM, Neil Girdhar <mistersheik@gmail.com> wrote: You almost always want to use either an absolute tolerance or a relative tolerance. Exactly -- only the user knows what is wanted. The trick is that relative tolerance is the most-of-the-time choice for float compares, but it's not appropriate (or even possible) for zero. So here we are grappling for a way to have sensible behavior at zero with otherwise relative tolerance. It's probably not possible to do so in the general case. So we should probably simply require the user to specify either an absolute or a relative tolerance and be done with it. Nothing will be relatively close to zero except zero, which is fine, but should be mentioned in the docs. It sounds like this is what Steven's code does. I just haven't figured out how to look at it on a phone. -Chris

I'm not sure what Steven's code is, but his hypothesis: "Furthermore, for safety you normally want the larger estimate of error, not the smaller…" does not apply to the usual case where you have an actual value and an estimate. He seems to be describing a symmetric error function, which is not intuitive to me. Best, Neil On Thu, Jan 15, 2015 at 7:23 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:

Neil Girdhar writes:
The symmetric error that people are proposing in this thread has no intuitive meaning to me.
There are many applications where the goal is to match two values, neither of which is the obvious standard (e.g., statistical tests comparing populations, or even electrical circuits, where it may be important that two components be matched to within 1%, although the absolute value might be allowed to vary by up to 10%). Symmetric error is appropriate for those applications. Symmetric error may be less appropriate for applications where you want to hit an absolute value, but it's (provably) not too bad. By "provably not too bad" I mean that if you take the word "close" as a qualitative predicate, then although you can make the "distance" explode by taking the "actual" to be an order of magnitude distant in absolute units, you'll still judge it "not close" (just more so, but "more so" is meaningless in this qualitative context). On the other hand, for values that *are* close (with reasonable tolerances) it doesn't much matter which value you choose as the standard: "most" of the time you will get the "right" answer (and as the tolerance gets tighter, "most" tends to a limit of 100%). The generic "are_close()" function should be symmetric. I suppose it might also be useful to have an "is_close_to()" function that is asymmetric.
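As a sketch of the symmetric, "matched components" case Stephen describes (the function name and the 1% figure are made up for this example):

    def matched(a, b, rel_tol=0.01):
        # Symmetric: divide by the smaller magnitude, i.e. use the larger of
        # the two possible relative errors.
        return abs(a - b) <= rel_tol * min(abs(a), abs(b))

    print(matched(218.3, 219.9))   # True:  within 1% of each other
    print(matched(219.9, 218.3))   # True:  order doesn't matter
    print(matched(218.3, 225.0))   # False: more than 1% apart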

On Thu, Jan 15, 2015 at 10:28 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
No, if you're trying to answer the question of whether two things belong to the same population as opposed to another, you should infer the population statistics based on a and b and your estimated overall population statistics, and then calculate cross entropies. Using some symmetric cross relative error has no meaning.
In statistics and machine learning, at least, many people have argued that the cross entropy error is the most reasonable loss function. When you have an observed value and an estimated value, the right way of comparing them is a cross entropy error, and that's what absolute error and relative error are doing. They correspond to cross entropies of the minimum assumptive distributions over the reals and positive reals. I think the numpy.allclose function almost always gives you what you want when you have an actual and an estimated value, which is the more usual case.
I disagree. Since the usual case is to have an observed and estimated value, then the close function should not be symmetric. Either you should have two functions: relative error and absolute error, or you should combine them like numpy did. Best, Neil

Actually, I was wrong about the exponential distribution's KL divergence. It's the relative error (b-a)/b plus another term: log(b/a) — so I guess I don't see what relative error means except as a heuristic. Anyway, even if your symmetric error makes sense to you, does anyone already use it? If it were up to me, relative error would be (b-a)/b + log(b/a), but since no one uses that, I think it's a bad idea. On Thu, Jan 15, 2015 at 10:46 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 01/16/2015 09:13 AM, Neil Girdhar wrote:
http://en.wikipedia.org/wiki/Approximation_error

In this case, we are only confirming a single value is within an already determined range. In most cases that is enough to confirm the coding of the algorithm is correct, whereas the more stringent methods you are thinking of are used to confirm the behaviour of the algorithm is valid. Cheers, Ron

On 1/12/2015 12:02 PM, Chris Barker wrote:
As near as I can tell, assertAlmostEqual(first, second, places=7, msg=None, delta=None) does one of two possible absolute difference checks:

    round(first - second, places) == 0.0
    abs(first - second) <= delta

There has been discussion of making it more complicated, but I remember that being rejected because other tests require more thought and can be implemented with the above anyway.
assertAlmostEqual((a-b)/d, 0, delta=tol), where d is a, b, or (a+b)/2, as one thinks is appropriate.
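For instance, a minimal (hypothetical) test case using the trick Terry describes:

    import unittest

    class TestRelativeViaDelta(unittest.TestCase):
        def test_relative(self):
            a, b = 1000.001, 1000.0
            d = b   # or a, or (a + b)/2, as appropriate
            # passes if the relative difference is within 1e-5
            self.assertAlmostEqual((a - b)/d, 0, delta=1e-5)

    if __name__ == "__main__":
        unittest.main()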
numpy provides allclose()
According to Neil Girdhar, absolute(a - b) <= (atol + rtol * absolute(b)), which I presume means, in Python, abs(a-b) <= atol + rtol * abs(b), where atol and rtol are assumed >= 0.0.
(and isclose() ), which is pretty much what I'm suggesting.
Anyone else think this would be a good idea to add to the stdlib?
I am somewhat skeptical as there is no universal right answer (see below). But I think allclose is about as good as we could get, maybe with some special casing added. The discussion on the thread seems mostly divorced from the multiple use cases. What do each of a and b represent? Different numbers, or successive approximations of the same number? Are they theoretical 'exact' numbers, approximations of theoretical numbers, or calculations from measurements with error? If the latter, how big is the error? And why are we asking anyway? We are usually asking 'close enough' for some purpose, but purposes differ.

Consider the problem of finding the (unique) 0 crossing (root) of a monotonic function f. One is looking for a value x* such that f(x*) is 'near' 0. (Someone claimed that 'nothing is close to zero'. This is nonsensical both in applied math and everyday life.) A standard approach is to compute successive approximations until one finds such a point. But unthinking application of such a method may not get one the desired result. The following are two examples with opposite problems.

I once had the job of writing code to find the root of a monotonic function whose parameters were calculated from experimental data (multiple counts). Fortunately, it was known that the unique root (a probability) had to be between 0.0 and 1.0 (endpoints excluded). The function was often kinked (L-shaped) so that the slope at the root could be arbitrarily large. That meant that for a small enough absolute tolerance, there might be no float x such that f(x) would be even 'near' 0, let alone equal to 0. And at other values, the slope could be so close to 0 as to make standard Newton iteration useless. I ended up avoiding float comparison by simply doing 20 steps of binary search, starting with 0 and 1 as the initial 'approximations', and stopping. The resulting 7 or 8 decimal digit approximation was more than good enough for realistic data.

On the other hand, consider f(x) = x**9. Because f is so flat at its root (0), f(x) evaluates to +-0.0 for floats in a range of at least [-1e-36, 1e-36]. Is any number in this range 'good enough' as the root? Or is more work needed to find a value near the middle of the '= 0' range? -- Terry Jan Reedy
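A quick check of the x**9 underflow Terry describes (the particular inputs here are just illustrative):

    # Near the root, x**9 underflows to zero long before x itself does.
    print((1e-40)**9 == 0.0)   # True:  the mathematical result, 1e-360, underflows
    print((1e-30)**9 == 0.0)   # False: 1e-270 is still representable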

On Fri, Jan 16, 2015 at 11:09:16PM -0500, Terry Reedy wrote:
On 1/12/2015 12:02 PM, Chris Barker wrote:
That does nothing to solve problems A) and B), and I'm dubious that it provides a "significant figures comparison" (whatever that means, I'm pretty sure it doesn't mean "relative error", which is what you're calculating in a round-about fashion).
Adding the error tolerances together is a dubious thing to do. I don't understand the reasoning behind that. Given two values a and b, there are two definitions of the error between them:

    absolute = abs(a - b)
    relative = abs(a - b)/abs(b)

[Note: I'm sticking to numpy's unconditional use of "b" for the denominator, which is not symmetric. In actuality, I would use min(abs(a), abs(b)) for the denominator.]

In the case where we only specify absolute or relative tolerance, it is obvious what to do: calculate the appropriate error, and if it is less than the given tolerance, return True:

    def approx_equal(a, b, allowed_absolute_error, allowed_relative_error):
        # For simplicity, ignore NANs, INFs, and assume b != 0
        actual_error = abs(a - b)
        if allowed_absolute_error is None:
            # Only perform a check on relative error.
            return actual_error <= allowed_relative_error*abs(b)
        elif allowed_relative_error is None:
            # Only perform a check on absolute error.
            return actual_error <= allowed_absolute_error
        else:
            # We have specified *both* abs and rel error.
            ...

How should we handle the third case? Two obvious ways come to mind: require that *both* individual tests pass:

    return (actual_error <= allowed_relative_error*abs(b)
            and actual_error <= allowed_absolute_error)
    # equivalent to:
    # actual_error <= min(allowed_relative_error*abs(b),
    #                     allowed_absolute_error)

or require that *either* test pass:

    return (actual_error <= allowed_relative_error*abs(b)
            or actual_error <= allowed_absolute_error)
    # equivalent to:
    # actual_error <= max( ... )

But what numpy does is to add the tolerances together, that is, it uses *twice* the average of them, equivalent to this:

    allowed_error = (allowed_absolute_error
                     + allowed_relative_error*abs(b))
    return actual_error <= allowed_error

This means that numpy will claim that two numbers are close even though *both* the absolute and relative error tests fail:

    py> numpy.allclose([1.2], [1.0], 0.0, 0.1)  # Fails absolute error test.
    False
    py> numpy.allclose([1.2], [1.0], 0.1, 0.0)  # Fails relative error test.
    False
    py> numpy.allclose([1.2], [1.0], 0.1, 0.1)  # Passes!
    True

I cannot think of a good justification for that. Either I am missing something, or this goes to show that numpy can mess up even something as simple and straightforward as an error calculation. If I'm right, that's further evidence that getting this "right" and putting it in the standard library is a good thing to do. [...]
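To see the difference between the "max" rule and numpy's "sum" rule on Steven's example, a small sketch with the same numbers:

    a, b = 1.2, 1.0
    atol, rtol = 0.1, 0.1
    err = abs(a - b)                        # ~0.2 (0.19999999999999996 as a float)
    print(err <= max(atol, rtol * abs(b)))  # False: both individual tests fail
    print(err <= atol + rtol * abs(b))      # True:  the numpy-style sum still passes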
It isn't nonsensical, it just needs to be understood in context of relative errors. All non-zero numbers are infinitely far from zero in terms of relative error.
I didn't think that the well-known difficulties in root-finding have anything to do with the usefulness of a standard way to compare numbers for approximate equality. -- Steven

provides a "significant figures comparison" (whatever that means,
Poor choice of words on my part -- that is not a clearly defined term. I meant relative difference comparison, which does, more or less, indicate that two values are the same to within N significant figures. If you set the relative tolerance to 1e-6 then the values are "close" to about the first 6 decimal digits.
It ends up making the absolute tolerance essentially the "minimum tolerance". If the scaled relative tolerance is much larger than the absolute tolerance, it makes no difference. Likewise the other way around for a small relative tolerance. It does change the result when the two are about the same magnitude, but this is all an order of magnitude thing anyway -- a factor of two doesn't really change the result. On a phone right now, but I guess that it was implemented that way to make it easy to vectorize with numpy arrays; if-checks are a pain to vectorize. I think it's an OK way to do it, but it's a very bad idea for the absolute tolerance to default to anything but zero. And really, it makes more sense to specify either absolute or relative, rather than mixing them. The numpy approach goes to heck for values much smaller than one. Essentially you need to know the scale to set the absolute tolerance anyway, killing the point of a relative tolerance.
I cannot think of a good justification for that. Either I am missing something,
See above
or this goes to show that numpy can mess up even something as simple and straightforward as an error calculation.
I'm not going to say that it messed up--but I certainly don't think that we should feel any obligation to do it the same way.
I do think this takes some thought, so it's good to put it in. But this thread also makes it clear that there are a lot of ways to define "close", and which is best is use-case dependent. I think a straightforward relative tolerance would be useful enough in many cases, and maybe an absolute one would be good for completeness. But I can see the argument that if there isn't one available, then folks are forced to think for themselves about what "closeness" means in their case, and be less likely to simply use the one provided when it might not be appropriate. (I need to go back and check some of my tests, for instance; I'm sure I didn't always think about the numpy atol parameter carefully!)
Agreed -- the termination criteria of numerical methods are not the use case for a generic is_close implementation. That clearly requires use-case-specific criteria.

OK, I FINALLY got a chance to look at Steven's code in the statistics module tests. Not much code there; this really isn't that big a deal. It does check for NaN, and inf, and all that, so that's good. It is also symmetric with respect to x and y -- using the maximum of the two to compute the relative error -- I think that's good. (This is essentially the same as Boost's "strong" method -- though implemented a tiny bit differently.) Here is the key definition:

    def approx_equal(x, y, tol=1e-12, rel=1e-7):
        ...
        x is approximately equal to y if the difference between them is less than
        an absolute error tol or a relative error rel, whichever is bigger.
        ...

This is a lot like the numpy code, actually, except it does a max test, rather than adding the absolute and relative tolerances together. I think this is a better way to go than numpy's, but there is little practical difference. However, it suffers from the same issue -- "tol" is essentially a minimum error that is considered acceptable. This is nice, as it will allow zero to be passed in, and if the other input is within tol of zero, it will be considered approximately equal. However, for very small numbers (less than the absolute tolerance), they will always be considered approximately equal:

    In [18]: approx_equal(1.0e-14, 2.0e-14)
    Out[18]: True

off by a factor of 2

    In [19]: approx_equal(1.0e-20, 2.0e-25)
    Out[19]: True

oops! way off! This is with the defaults of course, and all you need to do is set the tol much lower:

    In [20]: approx_equal(1.0e-20, 2.0e-25, tol=1e-25)
    Out[20]: False

This is less fatal than with numpy, as with numpy you are processing a whole array of numbers with the same tolerances, and they may not all be of the same magnitude. But I think it's a trap for users.

My proposal: allow either an absolute or relative tolerance, but don't try to do both in one call. Or, if you really want the ability to do both at once (i.e. set a minimum for the zero case), then:

- make the default absolute tolerance zero -- fewer surprises that way
- document the absolute tolerance as a minimum error (difference), and specifically mention the zero case in the docs.

Otherwise, go with Steven's code, and put it in the math module. Also -- there was some talk of what to do with complex -- I say two complex numbers are approx_equal if approx_equal(z1.real, z2.real) and approx_equal(z1.imag, z2.imag) -- that is a more rigorous test than using the complex abs value of the difference. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
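For readers without the test suite handy, a rough sketch that matches the quoted docstring (an approximation for illustration, not Steven's actual code) reproduces the behaviour shown above:

    def approx_equal_sketch(x, y, tol=1e-12, rel=1e-7):
        # Close if the difference is within the absolute tolerance *or* the
        # relative tolerance, whichever is bigger.
        return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))

    print(approx_equal_sketch(1.0e-14, 2.0e-14))             # True
    print(approx_equal_sketch(1.0e-20, 2.0e-25))             # True
    print(approx_equal_sketch(1.0e-20, 2.0e-25, tol=1e-25))  # False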

On Mon, Jan 12, 2015 at 09:02:17AM -0800, Chris Barker wrote:
I do, and I have already done so! It's an implementation detail of the statistics module (to be specific, its test suite), but it covers both relative and absolute error tolerances and handles infinities and NANs. https://hg.python.org/cpython/file/1b145e8ae4be/Lib/test/test_statistics.py#... The default tolerances are more or less plucked out of thin air and probably should be discussed. Ideally it should also handle ULP comparisons: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-number... Unfortunately a naive ULP comparison has trouble with NANs, INFs, and numbers close to zero, especially if they have opposite signs. The smallest representable denormalised floats larger, and smaller, than zero are 5e-324 and -5e-324. These are the smallest magnitude floats apart from zero, so we might hope that they are considered "close together", but they actually differ by 9223372036854775808 ULP. Ouch. I have some ideas for dealing with that, and if anyone is interested I'm happy to talk about it, but they're not ready for production yet.

I think that Bruce Dawson is right. Floating point comparisons are hard, really hard. I know that I've still got a lot to learn about it. I can think of at least five different ways to compare floats for equality, and they all have their uses:

- exact equality using ==
- absolute error tolerances
- relative error tolerances
- ULP comparisons
- the method unittest uses, using round()

I'm explicitly including == because it is a floating point superstition that one should never under any circumstances compare floats for exact equality. As general advice, "don't use == unless you know what you are doing" is quite reasonable, but it's the "never use" that turns it into superstition. As Bruce Dawson says, "Floating-point numbers aren't cursed", and throwing epsilons into a problem where no epsilon is needed is a bad idea. https://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-...

Aside: I'm reminded of APL, which mandates fuzzy equality (i.e. with a tolerance) of floating point numbers: In an early talk Ken [Iverson] was explaining the advantages of tolerant comparison. A member of the audience asked incredulously, "Surely you don't mean that when A=B and B=C, A may not equal C?" Without skipping a beat, Ken replied, "Any carpenter knows that!" and went on to the next question. - Paul Berry -- Steve
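For concreteness, a sketch of the naive bit-pattern subtraction Steven is warning about (not the code in the statistics test suite):

    import struct

    def naive_ulps_apart(x, y):
        # Reinterpret the raw IEEE 754 bits as signed 64-bit ints and subtract,
        # with no correction for the sign bit.
        ix = struct.unpack('<q', struct.pack('<d', x))[0]
        iy = struct.unpack('<q', struct.pack('<d', y))[0]
        return abs(ix - iy)

    print(naive_ulps_apart(5e-324, -5e-324))   # 9223372036854775808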

On Tue, Jan 13, 2015 at 1:34 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Only with a naive (i.e., wrong :-) implementation. Those two floats differ by precisely 2 units in the last place, and any correct implementation should report that. It's not hard to write code that deals correctly with opposite signs. Here's a simple difference_in_ulps function that correctly reports the number of ulps difference between any two finite floats.
>>> import struct
>>> def to_ulps(x):
...     n = struct.unpack('<q', struct.pack('<d', x))[0]
...     return -(n + 2**63) if n < 0 else n
...
>>> def difference_in_ulps(x, y):
...     return abs(to_ulps(x) - to_ulps(y))
...
>>> difference_in_ulps(-5e-324, 5e-324)
2

This is almost exactly what's in Lib/test/test_math.py already, except that the function there is better documented and uses "~(n + 2**63)" instead of "-(n + 2**63)" in the negative n correction branch, which has the effect of regarding 0.0 and -0.0 as 1 ulp apart. Comparing by ulps was what I needed for testing library-quality functions for the math and cmath modules; I doubt that it's what's needed for most comparison tasks. I'd expect the suggested combination of relative error and absolute error to be more appropriate most of the time. -- Mark

On Tue, Jan 13, 2015 at 09:28:39AM +0000, Mark Dickinson wrote:
Well, I said I was still learning :-) In my defence, I was influenced by Bruce Dawson who said you can't do ulps comparisons if the signs are not the same.
How do you deal with cases where one number is normalised and the other is denormalised? E.g. I think most people would consider these two numbers to be close together:

py> to_ulps(2.2250738585072014e-308) - to_ulps(1.1125369292536007e-308)
2251799813685248

-- Steve

On Jan 13, 2015, at 1:29 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Comparing by ulps was what I needed for testing library-quality functions for the math and cmath modules; I doubt that it's what's needed for most comparison tasks.
That's the conclusion I was coming to. Ulps are likely to be the right way to do it if you're trying to understand/test the accuracy of an algorithm, but not for general "did I get a close enough result". And it would be a lot harder to understand for most of us. As for comparing to zero -- in reading about this, it seems there simply is no general solution -- only the user knows what they want. So the only thing to do is a big warning in the docs about it, and providing an absolute tolerance option. Should that be a separate function or a flag? This is actually a harder problem for numpy, as it's an array function, so you need to have the same function/parameters for every value in the array, some of which may be near zero. I haven't thought it out yet, but maybe we could specify an absolute tolerance near zero, and a relative tolerance elsewhere, both at once. Tricky to document, even if possible.
I'd expect the suggested combination of relative error and absolute error to be more appropriate most of the time.
And most of the time is what we are going for. -Chris

On 01/13/2015 09:53 AM, Chris Barker - NOAA Federal wrote:
Doesn't this problem come up at any boundary comparison, and not just zero? So isn't the issue about any n distance from any floating point number that is less than 1 ulp? And in that regard, comparison to zero is no different than any comparison to any other floating point value? Just trying to follow along, -Ron

Zero is special because you lose the ability to use a relative tolerance. Everything is huge compared to zero.
So isn't the issue about any n distance from any floating point number that is less than 1 ulp?
I'm still a bit fuzzy on ulps, but it seems the goal here is to define a tolerance larger than an ulp. This is for the use case where we expect multiple rounding errors -- many more than one ulp. That's why I think the use case for ulp comparisons is more about assessment of accuracy of algorithms than "did I introduce a big old bug?" or "is this computed value close enough to what I measured?" -Chris

On 01/13/2015 09:25 PM, Chris Barker - NOAA Federal wrote:
On Jan 13, 2015, at 6:58 PM, Ron Adam<ron3200@gmail.com> wrote:
Doesn't this problem come up at any boundary comparison, and not just zero?
Zero is special because you lose the ability to use a relative tolerance. Everything is huge compared to zero.
After I posted I realised that when you compare anything you subtract what you are comparing to, and if it's equal to zero, then it's equal to what you are comparing to. So testing against zero is fundamental to all comparisons -- is this correct? Wouldn't a relative tolerance be set relative to some value that is *not* zero? And then used when comparing any close values, including zero.
I haven't looked into the finer aspects of ulps myself. It seems to me ulps only matter if the exponent parts of two floating point numbers are equal and the value parts are within 1 (or a small few) ulps of each other. Then there may be problems determining if they are equal, or one is greater or less than the other. And only if there is no greater tolerance value set. A couple of thoughts come to mind. I'm not sure if they are relevant though. I think for math algorithms that are hard coded in a program, it isn't much of an issue, as the author would have a feel for the size of a delta if needed, and output an appropriate number of significant digits, and can calculate the error range if needed as well. That probably fits most situations and is what is typically done. It seems to me that automated tracking and/or use of these things may be wanted with equation solvers. The solver would determine the delta and significant digits from its initial data. That sounds like it could get very complex, but maybe making this easier to do is the point? Cheers, Ron

On Jan 13, 2015, at 21:12, Ron Adam <ron3200@gmail.com> wrote:
I think you're missing the meanings of these terms. Absolute tolerance means a fixed tolerance, like 1e-5; relative tolerance means you pick a tolerance that's relative to the values being compared--think of it as a percentage, say, 0.01%. Plenty of values are within +/- 1e-5 of 0. But the only value within +/- 0.01% of 0 is 0 itself. Another way to look at it: x is close to y with absolute tolerance 1e-5 if abs(x-y) < 1e-5. x is close to y with relative tolerance 1e-5 if abs(x-y)/y < 1e-5. So, no value is ever close to 0 within any relative tolerance.
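Spelled out as code (the helper names are just for illustration, and the relative check is written as a multiplication to avoid dividing by zero):

    def close_abs(x, y, tol=1e-5):
        return abs(x - y) < tol

    def close_rel(x, y, tol=1e-5):
        return abs(x - y) < tol * abs(y)

    print(close_abs(1e-6, 0.0))   # True:  1e-6 is within an absolute 1e-5 of 0
    print(close_rel(1e-6, 0.0))   # False: nothing except 0 is relatively close to 0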
No, this is wrong. First, 1.1e1 and 1.0e10 (binary) are only off by 1 ulp even though they have different exponents. Second, there's never any problem determining if two finite numbers are equal or which one is greater. The issue is determining whether two numbers +/- their error bars are too close to call vs. unambiguously greater or lesser. For example, if I get 1.0e1 and 1.1e1 (assuming 1 bit mantissa for ease of discussion), the latter is clearly greater--but if I have 2 ulp of error, or an absolute error of 1e1, or a relative error of 50%, the fact that the latter is greater is irrelevant--each value is within the other value's error range. But if I get 1.0e1 and 1.0e10 with the same error, then I can say that the latter is unambiguously greater. I think Chris is right that ulp comparisons usually only come up in testing an algorithm. You have to actually do an error analysis, and you have to have input data with a precision specified in ulp, or you're not going to get a tolerance in ulp. When you want to verify that you've correctly implemented an algorithm that guarantees to multiply input ulp by no more than 4, you can feed in numbers with +/- 1 ulp error (just typing in decimal numbers does that) and verify that the results are within +/- 4 ulp. But when you have real data, or inherent rounding issues, or an error analysis that's partly made up of rules of thumb and winging it, you're almost always going to end up with absolute or relative error instead. (Or, occasionally, something more complicated that you have to code up manually, like logarithmic relative error.)

On 01/14/2015 12:10 AM, Andrew Barnert wrote:
I see your point about the exponent. To be clear, are we referring to significant digits here, or ... 1.0000...1e1 and 1.0000...0e10 Where the ... is a lot of zeros, up to the max digits Python's floating point can handle on whatever platform it's running on. What I've read indicates ULP usually refers to the limits of the implementation/device. Significant digits has more to do with error of measurements (and estimates), while ULPs are about accuracy limits of the hardware/software's ability to calculate.
You would use the larger of the three, and possibly give a warning if the 2 ulp error is the largest (if the application is set to do so). I'm presuming the 2 ulp is twice the limit of the floating point precision here.

    50%    accuracy of data
    1e1    limit of significant digits/measurement
    2 ulp  twice the floating point unit of least precision
Yes, this was the point I was alluding to earlier.
If the algorithm doesn't track error accumulation, then yes. This is interesting but I'm going to search for some examples of how to use some of this. I'm not sure I can add to the conversation much, but thanks for taking the time to explain some of it. Cheers, Ron

On Jan 14, 2015, at 1:08, Ron Adam <ron3200@gmail.com> wrote:
No, those are pretty far apart. We're referring to 1.1111…1e1 and 1.0000…e10 I think your confusion here is entirely my fault. For simplicity, it's often helpful to look at tiny float representations--e.g., a 4-bit float with 1 sign bit, 1 mantissa bit, and 2 exponent bits (that, if both 1, mean inf/nan), because writing 51 0's tends to obscure what you're looking at. But it's pretty stupid to do that without mentioning that you're doing so, or how it extends to larger representations if non-obvious, and I think I was just that stupid.
What I've read indicates ULP usually refers to the limits of the implementation/device.
Yes, it means "Unit of Least Precision" or "Unit in Least Place". There are a few ways to define this, but one definition is: Ignoring zeroes and denormals, two numbers are 1 ulp apart if they're finite and have the same sign and either (a) they have the same exponent and a mantissa that differs by one, or (b) they have an exponent that differs by one, the smaller one has the max mantissa and the larger the min mantissa. For zeroes, you can either define pos and neg zero as 1 ulp from each other and from the smallest denormal of the same sign, or as 0 ulp from each other and both 1 ulp from the smallest denormal of either sign.
Significant digits has more to do with error of measurements, (and estimates), while ULPs is about accuracy limits of the hardware/software ability to calculate.
And, importantly, to represent your values in the first place. If you have a value that's, say, exactly 0.3 (or, for that matter, 0.3 to 28 significant digits), +/- 1 ulp is larger than your measurement error, but it's the minimum error range you can store.
You use whichever is/are relevant to your error analysis and ignore the others. (Technically I guess you could say you're just using 0 as the error for the two you don't care about, and then it's guaranteed that the one you do care about is largest, but I don't think that's the way you'd normally think about it.) Also, you have to be careful about how that extends to inequality.
I presuming the 2 ulp is twice the limit of the floating point precision here.
Yes, 2 ulp means off by 2 units of least precision. Of course for binary, that actually means off by 1 in the unit of penultimate precision.

On 01/14/2015 04:18 PM, Steven D'Aprano wrote:
Speaking of which: https://pypi.python.org/pypi/bigfloat/0.3.0 I know it's /called/ bigfloat, but it /looks/ like you can specify however many/few bits you want. ;) -- ~Ethan~

On Wednesday, January 14, 2015 4:18 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I think a 4-bit float is a bit *too* mini, 8-bit sounds about right.
Anything up to 5 or 6 bits, you can fit a table of all the values on your monitor at decent font size. 8 bits is a bit too much for that. Which reminds me, I found an old blog post where I used such a 6-bit (1+3+2) format to demonstrate all the details of IEEE binary floats, including exactly such a table. It's now back online (http://stupidpythonideas.blogspot.com/2015/01/ieee-floats-and-python.html?vi...) if anyone's interested. I had a Binary6 class to go with it—which didn't have the complete float interface, but did have a lot of handy optionals like as_tuple and next_plus, as Decimal does—but I can no longer find that, so I stripped out that section of the blog post. But it still shows how to fiddle with the bits manually using struct plus integer operations or a library like bitstring.
(I know, I know, it should be a third-party library, not a built-in :-)
I don't think it would be very useful as a third-party lib either. But an arbitrary-precision IEEE 754-2008 Binary type, with all the optional features that aren't part of float, akin to decimal.Decimal—that might be useful. And while you're at it (you were volunteering, right?), want to go over 754-2008 vs. 854-1987 and see if there's anything worth adding to Decimal? :)

On Wed, Jan 14, 2015 at 1:08 AM, Ron Adam <ron3200@gmail.com> wrote:
Significant digits has more to do with error of measurements, (and estimates),
Right -- and relative tolerance is kinda-sorta like significant digits. I.e. a relative tolerance of 1e-4 is like saying the same to 4 (decimal) significant digits. Which brings up the issue with 0.0 -- how many significant digits does 0.0 have? Does 0.000001234 have the same significant digits as 0.0? It's not really defined. Whereas it's fairly straightforward to say that 0.000001234 and 0.000001233 are the same to 3 significant digits, but differ in the fourth. And note that:

    In [46]: u, v = 0.000001234, 0.000001233
    In [47]: err = abs(u-v)
    In [48]: err
    Out[48]: 9.999999999998634e-10

so the absolute error is less than 1e-9 == pretty "small" -- but is that what we generally want? No.

    In [49]: err <= 1e-3*abs(u)
    Out[49]: True

but the error is less than a relative tolerance of 1e-3 (three sig figs)

    In [50]: err <= 1e-4*abs(v)
    Out[50]: False

and greater than a relative tolerance of 1e-4 (not four sig figs). Again, this all works great when you are away from zero (and maybe need to handle NaN and inf carefully too), but if we simply put this in the stdlib, then folks may use it with a zero value, and not get what they expect. My thinking now: set a "zero_tolerance", which will default to the relative tolerance, but be user-settable. If one of the input values is zero, then use the zero_tolerance as an absolute tolerance; if not, then use a relative tolerance. I think this would make for fewer surprises, and make it easier to use the same function call for a wide range of values, some of which may be zero. What I haven't figured out yet is how (or if) to make sure that the transition is continuous -- do we only use zero_tol if one of the values is exactly zero? Or if one or both of the values is less than zero_tol? It seems that if you say, for instance, that 1e-12 is "close" to zero, then 1e-12 should also be "close" to any value less than 1e-12. But if 1e-12 is "close" to 1e-14 (for example), then 1e-12 should probably be "close" to 1.00000000001 also, but it wouldn't be, if we did an abrupt change to relative tolerance for any value >= the zero_tolerance. So more to think out here -- feel free to chime in. I've been playing with this gist, though it's only a few real lines of code anyway: https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a
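For reference, a rough sketch of the behaviour described above (the names rel_tol and zero_tol are placeholders; the actual experiments are in the linked gist):

    def is_close_sketch(a, b, rel_tol=1e-8, zero_tol=1e-8):
        # Use an absolute test only when one of the inputs is exactly zero;
        # otherwise use a symmetric relative test.
        if a == 0.0 or b == 0.0:
            return abs(a - b) <= zero_tol
        return abs(a - b) <= rel_tol * max(abs(a), abs(b))

    print(is_close_sketch(0.0, 1e-12))    # True:  within zero_tol of zero
    print(is_close_sketch(1e-14, 1e-12))  # False: the relative test applies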
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 01/14/2015 10:44 AM, Chris Barker wrote:
It has as many significant places as your data is measured or estimated to. If your data is entered with four places, then 0.0 is actually 0.0000. And a value displayed as 1.1 that was entered as 1.1000 would still have 4 significant places.
does 0.000001234 have the same significant digits as 0.0 ? -- its not really defined.
Well, if the significant digits/places was 4 places, those would be considered equal. Where 0.000001234 is a value after a calculation is done, and the 0.0 is actually 0.0000. If the significant places used was actually 9, then the 0.0 would be 0.000000000. Python (and most languages) truncates the zeros to the right. So you can't go by what is displayed; it needs to be explicitly set. Then you would use that to determine the error range of two numbers, rounding to significant digits and calculating your +- error. Percentage error ranges are something completely different. You need to calculate that separately, and set it based on your data, and also take into account significant digits. And taking into account ULPs is another thing. If ULPs become larger than significant digits or a relative error, then maybe a warning (or error) needs to be issued. Then you can use an alternative algorithm, or modify the algorithm to avoid the warning or error, and get a valid result. And possibly Python should do none of these by default so they are always explicitly asked-for behaviours.
I think it depends if you are trying to measure the error in your calculation, the error of your answer based on your data accuracy, or the error of the floating point implementation. Each is a different problem. A calculation may use constants that are less precise than what is possible by floating point, but more precise than the data. So in this case, it has a limit above what is introduced by ULPs.
If python worked more like a calculator and you actually set the significant figures, this would be relevant. But since it doesn't, you need to track significant figures separately and use that value, not just use the represented value which may have truncated zero's. Then zero is not different than other numbers.
What do you think of having a calculator mode or module, which would display specified significant digits in the output, but keep the entire number internally?

    import calc
    calc.fix(4)
    calc.print(expr)

And possibly have other general features that can be useful in other more specialised modules.
On the parts I think I know.. OK. ;-)
I'll take a look later today. Cheers, Ron

On 01/14/2015 12:04 PM, Ron Adam wrote:
If 0.000001234 has four significant digits, those digits are the ending 1234, not the first four zeros. The leading zeros would only be significant if there was a leadinger non-zero ;) -- and then the zeros would no longer be leading. -- ~Ethan~

On Jan 14, 2015, at 12:04, Ron Adam <ron3200@gmail.com> wrote:
First, why use a "calc" mode instead of just format('.4')? Also, this only works when the number of significant digits is small enough to fit in 52 bits; beyond that, it's misleading. Most calculators do something like show 8 decimal digits and store 10 decimal digits, which doesn't have this problem. And in fact, I think in any case where format isn't sufficient, the decimal module is what you need. (In fact, even with 4 digits it's misleading; two different values can both print identically but there's no function to check that except to stringify them and compare the strings. Calculators that display 8 digits but use 10 for intermediate results usually do comparisons on the 8 digits.) At any rate, switching to decimal makes it easier to think about and see the rounding errors, and eliminates any additional errors caused by going back and forth to decimal strings (and notice inputs are often decimal strings, with precision represented in decimal significant figures). But ultimately you still have the same problems of tracking the accumulation of error and determining the appropriate relative or absolute (or ulp or log2-relative or whatever) tolerance for equality and inequality comparisons.

On Tue, Jan 13, 2015 at 08:57:38PM -0600, Ron Adam wrote:
On 01/13/2015 09:53 AM, Chris Barker - NOAA Federal wrote:
[...]
No. A quick refresher on error tolerances... Suppose you have a value which should be exactly 0.5, but you calculate it as 0.51. Then the absolute error is 0.51-0.5 = 0.01, and the relative error is 0.01/0.5 = 0.02 (or 2%). But consider two values, 0.0 and 0.1. Then the absolute error is 0.1-0.0 = 0.1, and the relative error is 0.1/0.0 which is infinite. So 0.0 and -0.0 are problematic when dealing with relative errors.
No. 1 ULP (Unit In Last Place) is the smallest possible difference between two floats. A difference of 0 ULP means the two floats are exactly equal. How it works: in mathematics, real numbers are continuous, but floats are not. There are only 2**64 floats in Python (a C double), less if you ignore the NANs and INFs, which means we can conveniently enumerate them from -(2**64) to (2**64-1), based on the internal structure of a float. So if you convert two floats into this enumerated integer value (which is equivalent to doing a type-cast from a C double to a C long) and subtract the two ints, this gives you a measure of how far apart they are. (As Mark mentioned earlier, you have to make allowance for negative floats, also INF and NANs are problematic too.) If two values are exactly equal, their "distance apart" in ULP will be zero. A distance of 1 ULP means they are consecutive floats, they cannot possibly be any closer without being equal. A distance of 2 ULP means there is only a single float separating them, and so on. Note that ULP do not directly correspond to a numeric tolerance. For example, these pairs of values are each 1 ULP apart: 0.0 and 5e-324 1.0 and 1.0000000000000002 1e300 and 1.0000000000000002e+300 So in these three cases, 1 ULP represents numeric differences of: 0.00000000000000000000...00005 0.0000000000000002 2000000000000000000000...000.0 respectively.
Just trying to follow along,
A good resource is Bruce Dawson's blog RandomASCII, if you don't mind the focus on C++. Start here: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-number... -- Steve

On 01/14/2015 04:39 AM, Steven D'Aprano wrote:
I'm not sure why, but it seems like something here is out of place to me. Either it's not something that would come up, or it's something you would expect and handle differently. Or it would be an error, signalling you need to handle it differently.
A difference of 0 ULP means they *may* be exactly equal. The calculation/representation just can't resolve to a finer amount, so two figures less than 1 ULP apart can get the same floating point representation. The two numbers still have a +- error. Possibly +-.5 ULP. ie... potentially 1 ULP apart.
Thanks, :) Ron

On Wed, Jan 14, 2015 at 03:18:17PM -0600, Ron Adam wrote:
On 01/14/2015 04:39 AM, Steven D'Aprano wrote:
[...]
You're going to have to explain what you think is "out of place". Of course doing division by zero is an error. The whole point of this discussion is that you cannot talk about errors relative to zero. At zero, you can still have an absolute error, but you cannot have a relative error. The definitions of the error between two numeric values:

    def abserr(a, b):
        return abs(a - b)

    def relerr(a, b):
        return abs(a - b)/min(abs(a), abs(b))

If either a or b is zero, relerr() fails. This is a mathematical limitation, not a bug to be fixed.
I'm referring specifically to errors between two floats, not between a float and a theoretical exact real number. Sorry if that was not clear. E.g. there are two consecutive floats, 5e-324 and 1e-323, which are 1 ULP apart. But there is no such thing as a float for 8e-324. If you have some pencil and paper calculation which comes to 8e-324, then the error between that real value and either of those two floats is about 0.5 ULP. -- Steve

On 01/14/2015 06:04 PM, Steven D'Aprano wrote:
I would probably write it like this:

    def relerr(actual, expected):
        return abs(actual - expected)/expected

Where expected is a non-zero reference value. It needs a third value; in the above, the expected value is the scaler. And then use it this way to measure a resistor's tolerance:

    if relerr(218.345, 220) <= 0.05:
        print("Good")
    else:
        print("Bad")

Note that you would never compare to an expected value of zero.

    relerr(a - b, expected_feet) < tolerance   # relative feet from b
    relerr(a - 0, expected_feet) < tolerance   # relative feet from zero
    relerr(a - b, ulp)                         # percentage of ulps

What Chris is looking for is a way to get a closeness function that works most of the time. (He posted while I'm writing this.) But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.

    def is_close(a, b, unit, good):
        return (abs(a - b) / unit) <= good

    is_close(218.345, 220, 1, .05)   # ohms
    is_close(a, b, ULP, 2)           # ULPs
    is_close(a, b, AU, .001)         # astronomical units

I don't see any way to generalise those with just a function. By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them. Cheers, Ron

On Jan 14, 2015, at 18:13, Ron Adam <ron3200@gmail.com> wrote:
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
You can use an interval class to represent your measurements with error bars, in which case many problems either go away, or become too obvious to ignore. In particular, for simple cases at least, you get error analysis for free, and you can just check whether the final interval includes the expected value (or overlaps the expected interval, if there are error bars on the expected result too). However, finding an interval math library that handles all of the mathematical operations you want for a particular application may not be easy. You might have to ctypes or wrap up a C or C++ library instead. (I'm just guessing, but I suspect more people have written wrappers for such libs to work with numpy/scipy than with native Python scalars.)
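As a toy illustration of the idea (the class and method names are made up here, and a real interval library would also need the directed rounding this ignores):

    class Interval:
        """A value with error bars, represented as the range [lo, hi]."""
        def __init__(self, lo, hi):
            self.lo, self.hi = lo, hi
        def __add__(self, other):
            return Interval(self.lo + other.lo, self.hi + other.hi)
        def __mul__(self, other):
            products = [self.lo * other.lo, self.lo * other.hi,
                        self.hi * other.lo, self.hi * other.hi]
            return Interval(min(products), max(products))
        def contains(self, x):
            return self.lo <= x <= self.hi

    a = Interval(2.9, 3.1)         # a measurement of "about 3"
    b = Interval(1.9, 2.1)         # a measurement of "about 2"
    print((a * b).contains(6.0))   # True: 6.0 lies inside [5.51, 6.51]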

On Jan 14, 2015, at 18:27, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
However, finding an interval math library that handles all of the mathematical operations you want for a particular application may not be easy. You might have to ctypes or wrap up a C or C++ library instead. (I'm just guessing, but I suspect more people have written wrappers for such libs to work with numpy/scipy than with native Python scalars.)
I found a few things on PyPI that look promising, but not a single one of them installs and runs for me on either 2.7 or 3.4. But I was able to fix up the first one pretty quickly; see https://github.com/abarnert/pyinterval for a fork. You'll need to install crlibm first, but that's trivial on at least Mac/Homebrew and Fedora, so I'm guessing most other *nix systems. One big thing it's missing is pow (and, in fact, all bivariate functions). Of course you can always define it as exp(log(base) * exponent), but IIRC that gives much wider error bars than a proper interval power function.

On Jan 14, 2015, at 6:14 PM, Ron Adam <ron3200@gmail.com> wrote:
Hmm, there is something to be said about making it clear which is "expected" -- then that value clearly provides the scale. And you can document that it can't be zero, and raise an exception if it is.
it needs a third value.
Well, a tolerance, yes.
I don't see the advantage of this at all over a simple function API: is_close(actual, expected, tolerance) (probably with a default tolerance).
Note that you would never compare to an expected value of zero.
Yup --I like how you can make that clear. Though sometimes you really do expect zero. Maybe no need for the same API for that, or an API at all.
relerr(a - b, expected_feet) < tolerance # relative feet from b
I'm still confused -- why bother with the relerr function?
relerr(a - b, ulp) # percentage of ulp's
I don't think you can define an ulp like that.
What Chris is looking for is a way to get a closeness function that works most of the time.
Exactly.
But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.
Yes, you need a tolerance, sure. And the context comes from the values you are comparing. The only problem is what to do with zero. (And maybe really near zero).
I don't get why it helps to introduce units here. I don't think that's the point (this would all be equally valid with unitless quantities). And if you want a scale for relative tolerance that is independent of the values you are checking, that's essentially an absolute tolerance. I'm not sure it's worth burying that simple calculation in a function.
I don't see anyway to generalise those with just a function.
Didn't I just do that? Except for the ulps -- not sure what to do there; an ulp is a different way to define a delta. Unless you mean units-aware, but that's really not what this all is about.
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
Sure, there are various unit/physical quantity objects out there, many numpy-based. But I think that is orthogonal to comparing floating point numbers. And a simple floating point comparison function _may_ belong in the standard library, but not a more complex physical quantity system. -Chris

On 01/14/2015 11:19 PM, Chris Barker - NOAA Federal wrote:
This is where a good but realistic example of the issue would help. Something you may actually see in a specific use case.
    def is_close(a, b, unit, good):
        return (abs(a - b) / unit) <= good

But I also think of these sorts of things as building blocks for those more complex systems. So it may help to consider how they may use these blocks.
I don't get why it helps to introduce units here. I don't think that's point (this would all be equally valid with unit less quantities).
I agree. The returned value I was thinking of should have been:

    (abs(a - b) / b) / units <= good

Or something like that. The units here just scale the result. Without that you need to either scale the inputs, or scale the tolerance. In most cases when you are entering a tolerance, you will also be working with some multiple of units. But it seems that can confuse as much as it helps. In any case, I was just trying to come up with a practical example, and at the same time was thinking about how different units of measure can affect it.
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
Yes, for the most part I agree, but I think such functions would be useful in those more complex systems as well. Ron

On Jan 14, 2015, at 11:24 PM, Ron Adam <ron3200@gmail.com> wrote:
(abs(a - b) / b) / units <= good
If units and good are positive, and b is negative, this fails -- you've got to use abs everywhere. But I'm sure both Steven's code and mine do that.
Without that you need to either scale the inputs, or scale the tolerance.
That's why we want a relative tolerance -- that scales the tolerance for you -- that's the whole point.
In most cases when you are entering a tolerance, you will also be working with some multiple of units. But it seems that can confuse as much as it helps.
Again, relative tolerance takes care of that for you -- two values will have the same relative error whether they are stored in angstroms or meters -- the exponent will be wildly different, but the mantissas will be the same, so the relative error doesn't change. If you want to compare values in meters according to a tolerance in angstroms, then you a) want an absolute tolerance, and b) want a whole units system.
In any case, I was just trying to come up with some practical example, and at the same time was thinking about how different units of measure can effect it.
is_close(computed_value, expected_value, tolerance) really does what you want, provided that the inputs are in the same units, and neither is zero. Tolerance is unitless. I'll take a look at Steven's code -- it sounds like it's very much like where I was trying to go. -Chris
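P.S. Something like this minimal sketch is what I have in mind -- the names and the default are placeholders, and zero / NaN / inf handling is deliberately left out:

def is_close(computed, expected, tolerance=1e-8):
    # Relative comparison: scale the allowed difference by the expected value.
    # Deliberately undefined (always False here) when expected == 0 -- see above.
    return abs(computed - expected) <= tolerance * abs(expected)

is_close(3.141592653, 3.14159265358979)        # True: well within a 1e-8 relative tolerance
is_close(1.0e-20, 1.1e-20, tolerance=1e-3)     # False: a ~9% relative difference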

On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:
On 01/14/2015 06:04 PM, Steven D'Aprano wrote:
[...]
No, the function doesn't require a third value. It just returns the relative error between two values. What you do with that error is up to you:
- compare it to an acceptable amount of error, as you do below
- print it
- add it to a list of error amounts
- etc.

Of course, you could have other functions which do more, like decide whether an error is "small enough", but that is out of scope for a function which only calculates the error.
# using your implementation
if relerr(actual=200.0, expected=-5.0) < 0.05:
    print("oops")

Errors, by definition, are non-negative. You have to take care to ensure that the error value returned is non-negative.

The question of which to use as the denominator is more subtle. Like you, I used to think that you should choose ahead of time which value was expected and which was actual, and divide by the actual. Or should that be the expected? I could never decide which I wanted: error relative to the expected, or error relative to the actual? And then I could never remember which order the two arguments went.

Finally I read Bruce Dawson (I've already linked to his blog three or four times) and realised that he is correct and I was wrong. Error calculations should be symmetrical, so that

    error(a, b) == error(b, a)

regardless of whether you have absolute or relative error. Furthermore, for safety you normally want the larger estimate of error, not the smaller: given the choice between

    (abs(a - b))/abs(a)    versus    (abs(a - b))/abs(b)

you want the *larger* error estimate, which means the *smaller* denominator. That's the conservative way of doing it.

A concrete example: given a=5 and b=7, we have:

    absolute error = 2
    relative error (calculated relative to a) = 0.4
    relative error (calculated relative to b) = 0.286

That is, b is off by 40% relative to a; or a is off by 28.6% relative to b. Or another way to put it, given that a is the "true" value, b is 40% too big; or if you prefer, 28.6% of b is in error. Whew! Percentages are hard! *wink*

The conservative, "safe" way to handle this is to just treat the error function as symmetrical and always report the larger of the two relative errors (excluding the case where the denominator is 0, in which case the relative error is either 100% or it doesn't exist). Worst case, you may reject some values which you should accept, but you will never accept any values that you should reject.
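In code, that conservative symmetric approach might look something like this sketch (zeros, NANs and INFs deliberately excluded -- they need the special handling discussed elsewhere):

def relative_error(a, b):
    # Symmetric: divide by the smaller magnitude, which yields the larger
    # (more conservative) of the two possible relative errors.
    if a == b:
        return 0.0
    return abs(a - b) / min(abs(a), abs(b))

relative_error(5, 7)   # 0.4, and relative_error(7, 5) gives the same answer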
Note that you would never compare to an expected value of zero.
You *cannot* compare to an expected value of zero, but you certainly can be in a situation where you would like to: math.sin(math.pi) should return 0.0, but doesn't, it returns 1.2246063538223773e-16 instead. What is the relative error of the sin function at x = math.pi?
I don't understand what you think these three examples are showing.
What Chris is looking for is a way to get a closeness function that works most of the time. (He posted while I'm writing this.)
I think the function I have in the statistics test suite is that function. I would like to see ULP calculations offered as well, but Mark thinks that's unnecessary and I'm not going to go to the battlements to fight for ULPs.
But I don't see how you can do that without specifying some scaler to give it context, and a tolerance.
By the way, it is spelled "scalar", and all that means is that it is a number, like 23, not a vector or array or matrix.
def is_close(a, b, unit, good):
    return (abs(a - b) / unit) <= good
Take a look at the statistics test suite. I'll be the first to admit that the error tolerances are plucked from thin air, based on what I think are "close enough", but they show how such a function might work:
* you provide two values, and at least one of an absolute error tolerance and a relative error tolerance;
* if the error is less than the error(s) you provided, the test passes, otherwise it fails;
* NANs and INFs are handled appropriately.
Generalise in what way?
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
A full system of <value + unit> arithmetic is a *much* bigger problem than just calculating error estimates correctly, and should be a third-party library before even considering it for the std lib. -- Steve

On Wed, Jan 14, 2015 at 11:29 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I'm on the fence about this -- it seems clear to me that if the user has specified an "expected" value, the tolerance would clearly be based on that magnitude. If nothing else, that is because you would more likely have many computed values for each expected value than the other way around. And I was thinking that calling the arguments something like "actual" and "expected" would make that clear in the API, and would certainly be documentable. But the fact that it was never clear to you, even as you were writing the code, is good evidence that it wouldn't be clear to everyone ;-) Error
calculations should be symmetrical, so that
error(a, b) == error(b, a)
That does make it simpler to think about and reason about, and makes the use-case more universal (or at least appear more universal). "Are these two values close?" is a simple question to ask, if not to answer. regardless of whether you have absolute or relative error. Furthermore,
Which is what the Boost "strong" method does -- rather than compute the max and use that, it computes both and does an "and" check -- but same result. A concrete example: given a=5 and b=7, we have:
The thing is, in general, we use this to test for small errors, with low tolerance. Which value you use to scale only makes a big difference if the values are far apart, in which case the error will be larger than the tolerance anyway. In your above example, if the tolerance is, say, 1%, then it makes no difference which you use -- you are way off anyway. And in the common use cases, comparing a double precision floating point calculation, tolerances are more likely to be around 1e-12, not 1e-2 anyway! So I think that which relative tolerance you use makes little difference in practice, but it might as well be robust and symmetrical. (another option is to use the average of the two values to scale the tolerance, but why bother?)
There isn't one -- that's the whole point -- but there is an absolute error, so that's what you should check. We all agree a relative error involving zero is not defined / possible. So the question is what to do?

1) Raise a ValueError.
2) Let it return "not close" regardless of the other input -- that's mathematically correct, nothing is relatively close to zero.
3) Automagically switch to an absolute tolerance near zero -- the user specifies what it should be.

It seems the implementations (Boost's, for instance) I've seen simply do (2). But if the point of putting this in the standard library is that people will have something that can be used for common use cases without thinking about it, I think maybe (1) or (3) would be better. Probably (3), as raising an exception would make a mess of this if it were inside a comprehension or something.
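Option (3) might look roughly like this sketch -- "zero_tol" is a placeholder name for the user-specified absolute floor, not a settled API:

def is_close(a, b, rel_tol=1e-9, zero_tol=0.0):
    # Relative test scaled by the larger magnitude, with an absolute floor
    # so values at (or very near) zero can still be called "close".
    # zero_tol=0.0 gives pure relative behavior.
    diff = abs(a - b)
    return diff <= rel_tol * max(abs(a), abs(b)) or diff <= zero_tol

is_close(1e-12, 0.0)                  # False: nothing is relatively close to zero
is_close(1e-12, 0.0, zero_tol=1e-9)   # True: within the absolute floor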
I'll take a look -- it does sound like you've already done pretty much what I have in mind.
I suppose it could be added later -- I agree that it could be pretty useful, but that it's also much harder to wrap your brain around, and really for a different use-case.
Is this different than the numpy implementation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html

In that (according to the docs, I haven't looked at the code):

"""
The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
"""

I think it should either be:
- you specify atol or rtol, but only one is used.
or
- some way to transition from a relative tolerance to an absolute one near zero -- I haven't figured out if that can be done smoothly yet.

[Also, it looks like numpy computes the tolerance from the second input, rather than looking at both, resulting in an asymmetric result -- discussed above.]

I've always thought the numpy approach is weird, but now that I think about it, it would be really horrible (with defaults) for small numbers: rtol defaults to 1e-5, atol to 1e-8 -- too big, I think, but not the point here.

    In [23]: a, b = 1.1, 1.2
    In [24]: np.allclose(a, b)
    Out[24]: False     # that's good, they are pretty far apart

    In [27]: a, b = 1.1e15, 1.2e15
    In [28]: np.allclose(a, b)
    Out[28]: False     # same thing for large values -- still good.

    In [25]: a, b = 1.1e-15, 1.2e-15
    In [26]: np.allclose(a, b)
    Out[26]: True      # OOPS! this is NOT what most people would expect!!

    In [30]: np.allclose(a, b, atol=0.0)
    Out[30]: False     # There we go.

But with a default atol as large as 1e-8, this is a really bad idea! I can only imagine whoever wrote this was thinking about really large values, but not really small values... (I think this has been brought up in the numpy community, but I'll make sure)

-Chris
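P.S. The effect is easy to see if you write the documented condition out in plain Python (this is just the formula from the docs, not numpy's actual code):

def numpy_style_close(a, b, rtol=1e-5, atol=1e-8):
    return abs(a - b) <= (atol + rtol * abs(b))

a, b = 1.1e-15, 1.2e-15
abs(a - b)                         # ~1e-16, about a 9% relative difference
1e-5 * abs(b)                      # 1.2e-20 -- the relative term is negligible
numpy_style_close(a, b)            # True: the default atol=1e-8 decides everything
numpy_style_close(a, b, atol=0.0)  # False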
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 01/15/2015 01:29 AM, Steven D'Aprano wrote:
On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:
Ewww the P word. :)
Consider two points that are a constant distance apart, but moving relative to zero. Their closeness doesn't change, but the relative error with respect to each other (and zero) does change. There is an implicit assumption that the number system used and the origin the numbers are measured from are chosen and relate to each other in some expected way. Whenever you supply all the numbers, like in a test, it's not a problem, you just give good numbers.
A percentage of an expected distance. Error of two points compared to a specific distance.

    >>> relerr(5 - -5, 10)
    0.0

I think unless you use decimal, the ulp example will either be zero or some large multiple of ulp.
Take a look at the statistics test suite.
I definitely will. :-)
I meant a function that would work in many places without giving some sort of size and tolerance hints. Given two floating point numbers and nothing else, I don't think you can tell if they represent something that is close without assuming some sort of context. At best, you need to assume the distance from zero and the numbers used are chosen to give a meaningful return value. While that can sometimes work, I don't think you can depend on it.
Yes, I agree. There are a few of them out there already. Cheers, Ron

On Thu, Jan 15, 2015 at 9:42 AM, Ron Adam <ron3200@gmail.com> wrote:
Not sure what "location two points are relative to zero" means. Or if the numbers straddle zero? Good catch -- at least if the numbers are symmetric about zero -- i.e. a == -b -- then they will never be "close" according to our working definition of relative tolerance:

    a = -b
    err == 2*a
    abs_tol = rel_tol * a    # rel_tol is small and positive

abs_tol is always smaller than a, so always smaller than the error, so never "close". In fact, this is true for any numbers that straddle zero:

    err == abs(a - b)
    abs_tol = rel_tol * max(abs(a), abs(b))
    abs_tol <= err    # provided 0 < rel_tol < 1

So I was thinking the issues were near zero, but they actually arise any time:

    a <= 0 <= b  (or b <= 0 <= a)

that is, any time the values straddle or include zero. In this case, you need an absolute tolerance defined. Still figuring out how to use that and provide a smooth transition...

Steven, does your code address this somehow?

-Chris
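P.S. A quick numerical check of that claim (plain Python, pure relative test, no special-casing):

def rel_close(a, b, rel_tol):
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))

# Values that straddle (or include) zero can never pass for rel_tol < 1,
# no matter how "near" they look on an absolute scale:
rel_close(-1e-300, 1e-300, rel_tol=0.999)   # False
rel_close(0.0, 1e-300, rel_tol=0.999)       # False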
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 01/15/2015 01:45 PM, Chris Barker wrote:
A number relative to zero is its value. A number relative to another number is its distance from the other number. If we divide that by one of the numbers, we get the ratio of their distance from each other to their distance from zero. Steven suggests using the closer one as the divisor so you get a bigger result, but I prefer the larger one so I get a result between 0 and 1. A strategy for numbers of different signs may be to swap the sign on the smaller and add the smaller to the larger... err.. that isn't clear at all. How about this?

def is_close(x, y, delta=.001):
    """ Check if x and y are close to each other. """
    if x == y:
        return True       # Can't get smaller than this.
    if x == 0 or y == 0:
        return False      # Nothing is close to zero.
    if abs(x) < abs(y):   # Make x the larger one.
        x, y = y, x
    if x * y < 0:         # Signs differ.
        x = x - 2.0 * y   # Move x and y same distance.
        y = -y
    return (x - y) / float(x) <= delta

Cheers, Ron

On 01/15/2015 03:47 PM, Ron Adam wrote:
Remove the check of one being zero, it isn't needed.

def is_close(x, y, delta=.001):
    """ Check if x and y are close to each other. """
    if x == y:
        return True       # Can't get smaller than this.
    if abs(x) < abs(y):   # Make x the larger one.
        x, y = y, x
    if x * y < 0:         # Signs differ.
        x = x - 2.0 * y   # Move x and y same distance.
        y = -y
    return (x - y) / float(x) <= delta

If one of them is zero, then you get a ratio of one. So a delta of 1 would mean everything is close.

-Ron

On 01/15/2015 04:10 PM, Neil Girdhar wrote:
The problem with this is that if your "actual" is say, 5, then a large enough "estimate" will always be close.
def is_close(x, y, delta=.001):
    """ Check if x and y are close to each other. """
    if x == y:
        return True       # Can't get smaller than this.
    if abs(x) < abs(y):   # Make x the larger one.
        x, y = y, x
    if x * y < 0:         # Signs differ.
        x = x - 2.0 * y   # Move x and y same distance.
        y = -y
    return (x - y) / float(x) <= delta

Do you mean the y value? Some tests.

    is_close(0, 0, delta=0.001) --> True
    is_close(0, 1, delta=0.001) --> False
    is_close(1, 0, delta=0.001) --> False
    is_close(1000, 999, delta=0.001) --> True
    is_close(999, 1000, delta=0.001) --> True
    is_close(100, 95, delta=0.001) --> False
    is_close(95, 100, delta=0.001) --> False
    is_close(inf, 1, delta=0.001) --> False
    is_close(5, 3, delta=0.001) --> False
    is_close(5, 4.999, delta=0.001) --> True

    is_close(0, 0, delta=0.2) --> True
    is_close(0, 1, delta=0.2) --> False
    is_close(1, 0, delta=0.2) --> False
    is_close(1000, 999, delta=0.2) --> True
    is_close(999, 1000, delta=0.2) --> True
    is_close(100, 95, delta=0.2) --> True
    is_close(95, 100, delta=0.2) --> True
    is_close(inf, 1, delta=0.2) --> False
    is_close(5, 3, delta=0.2) --> False
    is_close(5, 4.999, delta=0.2) --> True

Or do you mean this?

    is_close(0, 0, delta=5) --> True
    is_close(0, 1, delta=5) --> True
    is_close(1, 0, delta=5) --> True
    is_close(1000, 999, delta=5) --> True
    is_close(999, 1000, delta=5) --> True
    is_close(100, 95, delta=5) --> True
    is_close(95, 100, delta=5) --> True
    is_close(inf, 1, delta=5) --> False
    is_close(5, 3, delta=5) --> True
    is_close(5, 4.999, delta=5) --> True

But it's probably the best you can do without adding a reference length. A 1 is 100 percent the distance of the larger to 0. So everything except infinity is smaller. A 5 is 500 percent, which is harmless as it's the same as 100 percent. The smaller of the two will be located from the larger amount and zero. It doesn't check for a negative delta, that should probably be an error.

Cheers, Ron

On 01/15/2015 04:10 PM, Neil Girdhar wrote:
The problem with this is that if your "actual" is say, 5, then a large enough "estimate" will always be close.
Actually the problem is with numbers of different signs. The idea of normalising them in some way only works partially, and it doesn't fit mathematically. It seems to me if the signs are different, then relatively speaking, they are always far apart. Ron

You might be interested in my question: http://stackoverflow.com/questions/4028889/floating-point-equality-in-python On Monday, January 12, 2015 at 2:58:30 PM UTC-5, Chris Barker wrote:

On Wed, Jan 14, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
You might be interested in my question:
http://stackoverflow.com/questions/4028889/floating-point-equality-in-python
Nothing new there, I'm afraid -- and no one seemed to have brought up the issue with zero.

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

The point is that this function is already in Python and if you want to do something different, you should have a really good reason to do it differently. If you were to add a function to math, say math.close, it should work like numpy.allclose in my opinion. For reference, numpy does this:

    absolute(a - b) <= (atol + rtol * absolute(b))

where atol is an absolute tolerance and rtol is a relative tolerance (relative to the actual value b). This subsumes most of the proposals here.

Best, Neil

On Wed, Jan 14, 2015 at 7:48 PM, Chris Barker <chris.barker@noaa.gov> wrote:

On Thu, Jan 15, 2015 at 1:52 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
The point is that this function is already in Python
I don't think something being in an external package means that we have to do it the same way in the stdlib -- even a widely used and well regarded package like numpy. And I say this as someone that has "import numpy" in maybe 90% of my python files. Maybe we should be careful to give it a very distinct name, however, to avoid confusion.
and if you want to do something different, you should have a really good reason to do it differently.
I'm not sure I agree, but we do in this case anyway. The truth is, while really smart people wrote numpy, many of the algorithms in there did not go through nearly the level of review currently required for the python standard library.
adding atol in there "takes care of" the near zero and straddling zero issue (I suspect that's why it's done that way), but it is fatally wrong for values much less than 1.0 -- the atol totally overwhelms the rtol. See my post earlier today. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Thu, Jan 15, 2015 at 3:31 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
You can always disable atol by setting atol to zero. I really don't see what's wrong with their implementation.
1) the default should be zero in that case; having a default close to the rtol default is asking for trouble.
2) if the user has both large and small numbers, there IS no appropriate value for a_tol for all of them. They simply should not be mixed in that way.

-Chris
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Thu, Jan 15, 2015 at 6:36 PM, Chris Barker <chris.barker@noaa.gov> wrote:
It's not that close: rtol defaults to 1000 times bigger than atol.
2) if the user has both large and small numbers, there IS no appropriate value for a_tol for all of them.
The relative error is not symmetric, so it's not about having "large and small numbers". Your estimate should not affect the error you tolerate.

On Jan 15, 2015, at 3:47 PM, Neil Girdhar <mistersheik@gmail.com> wrote: On Thu, Jan 15, 2015 at 6:36 PM, Chris Barker <chris.barker@noaa.gov> wrote:
It's not that close: rtol defaults to 1000 times bigger than atol. That's pretty huge if your values are on the order of 1e-100 ;-) Which is my point -- this approach only makes sense for values of order 1 or larger. Then the scaled relative error will be a lot larger, so atol sets a sort of lower bound on the error. But if the magnitude of your values is small, then the scaled value becomes small, and the atol overwhelms it. This kind of "switch to an absolute tolerance when close to zero" behavior is what I've been looking for, but I don't like how numpy does it. -Chris -CHB

On Jan 15, 2015, at 3:31 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

    absolute(a - b) <= (atol + rtol * absolute(b))

Oh, and if the numbers are small, then adding the absolute tolerance changes the tolerance significantly -- so you don't get what you expect there, either.

Chris

You almost always want to use either an absolute tolerance or a relative tolerance. Here's why: Consider your estimated value a and your actual value b. If the estimate is taken to be the mean of a standard Gaussian distribution (the minimum assumptive distribution for a given mean and variance over the reals), then using an absolute tolerance is equivalent to verifying that the probability of observing b is within an interval with sufficient probability. Similarly, if the estimate is taken to be the mean of a standard exponential distribution (the minimum assumptive distribution for a given mean over the positive reals), then using a relative tolerance is equivalent to verifying the same thing. You almost always want one or the other. The symmetric error that people are proposing in this thread has no intuitive meaning to me. Best, Neil On Thu, Jan 15, 2015 at 6:47 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:

On Jan 15, 2015, at 3:58 PM, Neil Girdhar <mistersheik@gmail.com> wrote: You almost always want to use either an absolute tolerance or a relative tolerance. Exactly -- only the user knows what is wanted. The trick is that relative tolerance is the most-of-the-time choice for float compares, but it's not appropriate (or even possible) for zero. So here we are grappling for a way to have sensible behavior at zero with otherwise relative tolerance. It's probably not possible to do so in the general case. So we should probably simply require the user to specify either an absolute or a relative tolerance and be done with it. Nothing will be relatively close to zero except zero, which is fine, but should be mentioned in the docs. It sounds like this is what Steven's code does. I just haven't figured out how to look at it on a phone. -Chris

I'm not sure what Steven's code is, but his hypothesis: "Furthermore, for safety you normally want the larger estimate of error, not the smaller…" does not apply to the usual case where you have an actual value and an estimate. He seems to be describing a symmetric error function, which is not intuitive to me. Best, Neil On Thu, Jan 15, 2015 at 7:23 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:

Neil Girdhar writes:
The symmetric error that people are proposing in this thread has no intuitive meaning to me.
There are many applications where the goal is to match two values, neither of which is the obvious standard (eg, statistical tests comparing populations, or even electrical circuits, where it may be important that two components be matched to within 1%, although the absolute value might be allowed to vary by up to 10%). Symmetric error is appropriate for those applications. Symmetric error may be less appropriate for applications where you want to hit an absolute value, but it's (provably) not too bad. By "provably not too bad" I mean that if you take the word "close" as a qualitative predicate, then although you can make the "distance" explode by taking the "actual" to be an order of magnitude distant in absolute units, you'll still judge it "not close" (just more so, but "more so" is meaningless in this qualitative context). On the other hand, for values that *are* close (with reasonable tolerances) it doesn't much matter which value you choose as the standard: "most" of the time you will get the "right" answer (and as the tolerance gets tighter, "most" tends to a limit of 100%). The generic "are_close()" function should be symmetric. I suppose it might also be useful to have an "is_close_to()" function that is asymmetric.
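Sketched in code, that pair might look like the following (the names match the suggestion above; the bodies are only one plausible reading, with zero and non-finite values ignored):

def is_close_to(actual, standard, rel_tol=1e-9):
    # Asymmetric: error is measured relative to the designated standard.
    return abs(actual - standard) <= rel_tol * abs(standard)

def are_close(a, b, rel_tol=1e-9):
    # Symmetric and conservative: uses the larger of the two relative errors,
    # so are_close(a, b) == are_close(b, a).
    return abs(a - b) <= rel_tol * min(abs(a), abs(b))

# With a loose tolerance the asymmetry is visible; with tight tolerances
# the two judgements almost always agree:
is_close_to(7.0, 5.0, rel_tol=0.3)   # False: off by 40% of the standard
is_close_to(5.0, 7.0, rel_tol=0.3)   # True: off by only ~28.6% of this standard
are_close(5.0, 7.0, rel_tol=0.3)     # False either way round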

On Thu, Jan 15, 2015 at 10:28 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
No, if you're trying to answer the question of whether two things belong to the same population as opposed to another, you should infer the population statistics based on a and b and your estimated overall population statistics, and then calculate cross entropies. Using some symmetric cross relative error has no meaning.
In statistics and machine learning at least many people have argued that the cross entropy error is the most reasonable loss function. When you have an observed value and estimated value, the right way of comparing them is a cross entropy error, and that's what absolute error and relative error are doing. They correspond to cross entropies of the minimum assumptive distributions over the reals and positive reals. I think the numpy.allclose function almost always gives you what you want when you have an actual and an estimated value, which is the more usual case.
I disagree. Since the usual case is to have an observed and an estimated value, the close function should not be symmetric. Either you should have two functions, relative error and absolute error, or you should combine them like numpy did. Best, Neil

Actually, I was wrong about the exponential distribution's KL divergence. It's the relative error (b-a)/b plus another term: log(b/a) -- so I guess I don't see what relative error means except as a heuristic. Anyway, even if your symmetric error makes sense to you, does anyone already use it? If it were up to me, relative error would be (b-a)/b + log(b/a), but since no one uses that, I think it's a bad idea. On Thu, Jan 15, 2015 at 10:46 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 01/16/2015 09:13 AM, Neil Girdhar wrote:
http://en.wikipedia.org/wiki/Approximation_error

In this case, we are only confirming a single value is within an already determined range. In most cases that is enough to confirm the coding of the algorithm is correct, whereas the more stringent methods you are thinking of are used to confirm the behaviour of the algorithm is valid. Cheers, Ron

On 1/12/2015 12:02 PM, Chris Barker wrote:
As near as I can tell, assertAlmostEqual(first, second, places=7, msg=None, delta=None) does one of two possible absolute difference checks:

    round(first - second, places) == 0.0
    abs(first - second) <= delta

There has been discussion of making it more complicated, but I remember that being rejected because other tests require more thought and can be implemented with the above anyway.
assertAlmostEqual((a-b)/d, 0, delta=tol) where d is a, b, or (a+b)/2, as one thinks appropriate.
numpy provides allclose()
According to Neil Girdhar, numpy does

    absolute(a - b) <= (atol + rtol * absolute(b))

which I presume means, in Python,

    abs(a - b) <= atol + rtol * abs(b)

where atol and rtol are assumed >= 0.0
(and isclose() ), which is pretty much what I'm suggesting.
Anyone else think this would be a good idea to add to the stdlib?
I am somewhat skeptical as there is no universal right answer (see below). But I think allclose is about as good as we could get, maybe with some special casing added.

The discussion on the thread seems mostly divorced from the multiple use cases. What do each of a and b represent? Different numbers? or successive approximations of the same number? Are they theoretical 'exact' numbers, approximations of theoretical numbers, or calculations from measurements with error? If the latter, how big is the error? And why are we asking anyway? We are usually asking 'close enough' for some purpose, but purposes differ.

Consider the problem of finding the (unique) 0 crossing (root) of a monotonic function f. One is looking for a value x* such that f(x*) is 'near' 0. (Someone claimed that 'nothing is close to zero'. This is nonsensical both in applied math and everyday life.) A standard approach is to compute successive approximations until one finds such a point. But unthinking application of such a method may not get one the desired result. The following are two examples with opposite problems.

I once had the job of writing code to find the root of a monotonic function whose parameters were calculated from experimental data (multiple counts). Fortunately, it was known that the unique root (a probability) had to be between 0.0 and 1.0 (endpoints excluded). The function was often kinked (L-shaped) so that the slope at the root could be arbitrarily large. That meant that for a small enough absolute tolerance, there might be no float x such that f(x) would be even 'near' 0, let alone equal to 0. And at other values, the slope could be so close to 0 as to make standard Newton iteration useless. I ended up avoiding float comparison by simply doing 20 steps of binary search, starting with 0 and 1 as the initial 'approximations', and stopping. The resulting 7 or 8 decimal digit approximation was more than good enough for realistic data.

On the other hand, consider f(x) = x**9. Because f is so flat at its root (0), f(float) evaluates to -+0.0 for floats in a range of at least [-1e-36, 1e-36]. Is any number in this range 'good enough' as the root? Or is more work needed to find a value near the middle of the '= 0' range?

-- Terry Jan Reedy
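P.S. For concreteness, the "fixed number of bisection steps, no closeness test" idea looks roughly like this (a sketch under stated assumptions, not the original code; it assumes f is increasing on the bracket with a sign change):

def bisect_root(f, lo=0.0, hi=1.0, steps=20):
    # Halve the bracket a fixed number of times and return the midpoint.
    # No tolerance or closeness check anywhere.
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

bisect_root(lambda x: x*x - 0.25)   # ~0.5, pinned down to about 1e-6 after 20 halvings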

On Fri, Jan 16, 2015 at 11:09:16PM -0500, Terry Reedy wrote:
On 1/12/2015 12:02 PM, Chris Barker wrote:
That does nothing to solve problems A) and B), and I'm dubious that it provides a "significant figures comparison" (whatever that means, I'm pretty sure it doesn't mean "relative error", which is what you're calculating in a round-about fashion).
Adding the error tolerances together is a dubious thing to do. I don't understand the reasoning behind that. Given two values a and b, there are two definitions of the error between them:

    absolute = abs(a - b)
    relative = abs(a - b)/abs(b)

[Note: I'm sticking to numpy's unconditional use of "b" for the denominator, which is not symmetric. In actuality, I would use min(abs(a), abs(b)) for the denominator.]

In the case where we only specify absolute or relative tolerance, it is obvious what to do: calculate the appropriate error, and if it is less than the given tolerance, return True:

def approx_equal(a, b, allowed_absolute_error, allowed_relative_error):
    # For simplicity, ignore NANs, INFs, and assume b != 0
    actual_error = abs(a - b)
    if allowed_absolute_error is None:
        # Only perform a check on relative error.
        return actual_error <= allowed_relative_error*abs(b)
    elif allowed_relative_error is None:
        # Only perform a check on absolute error.
        return actual_error <= allowed_absolute_error
    else:
        # We have specified *both* abs and rel error.

How should we handle the third case? Two obvious ways come to mind: require that *both* individual tests pass:

        return (actual_error <= allowed_relative_error*abs(b) and
                actual_error <= allowed_absolute_error)
        # equivalent to:
        # actual_error <= max(allowed_relative_error*abs(b),
        #                     allowed_absolute_error)

or require that *either* test pass:

        return (actual_relative_error <= allowed_relative_error or
                actual_absolute_error <= allowed_absolute_error)
        # equivalent to:
        # actual_error <= min( ... )

But what numpy does is to add the tolerances together, that is, it uses *twice* the average of them, equivalent to this:

        allowed_error = (allowed_absolute_error +
                         allowed_relative_error*abs(b))
        return actual_absolute_error <= allowed_error

This means that numpy will claim that two numbers are close even though *both* the absolute and relative error tests fail:

    py> numpy.allclose([1.2], [1.0], 0.0, 0.1)  # Fails absolute error test.
    False
    py> numpy.allclose([1.2], [1.0], 0.1, 0.0)  # Fails relative error test.
    False
    py> numpy.allclose([1.2], [1.0], 0.1, 0.1)  # Passes!
    True

I cannot think of a good justification for that. Either I am missing something, or this goes to show that numpy can mess up even something as simple and straightforward as an error calculation. If I'm right, that's further evidence that getting this "right" and putting it in the standard library is a good thing to do.

[...]
It isn't nonsensical, it just needs to be understood in context of relative errors. All non-zero numbers are infinitely far from zero in terms of relative error.
I didn't think that the well-known difficulties in root-finding have anything to do with the usefulness of a standard way to compare numbers for approximate equality. -- Steven

provides a "significant figures comparison" (whatever that means,
Poor choice of words on my part -- that is not a clearly defined term. I meant relative difference comparison, which does, more or less, indicate that two values are the same to within N significant figures. If you set the relative tolerance to 1e-6 then the values are "close" to about the first 6 decimal digits.
It ends up making the absolute tolerance essentially the "minimum tolerance". If the scaled relative tolerance is much larger than the absolute tolerance, it makes no difference. Likewise the other way around for small relative tolerance. It does change the result when the two are about the same magnitude, but this is all an order of magnitude thing anyway -- a factor of two doesn't really change the result. On a phone right now, but I guess that it was implemented that way to make it easy to vectorize with numpy arrays. If-checks are a pain to vectorize. I think it's an OK way to do it, but a very bad idea for the absolute tolerance to default to anything but zero. And really, it makes more sense to specify either absolute or relative, rather than mixing them. The numpy approach goes to heck for values much smaller than one. Essentially you need to know the scale to set the absolute tolerance anyway, killing the point of a relative tolerance.
I cannot think of a good justification for that. Either I am missing something,
See above
or this goes to show that numpy can mess up even something as simple and straightforward as an error calculation.
I'm not going to say that it messed up -- but I certainly don't think that we should feel any obligation to do it the same way.
I do think this takes some thought, so good to put it in. But this thread also makes it clear that there are a lot of ways to define "close", and which is best is use-case dependent. I think a straightforward relative tolerance would be useful enough in many cases, and maybe an absolute one would be good for completeness. But I can see the argument that if there isn't one available, then folks are forced to think for themselves about what "closeness" means in their case, and be less likely to simply use the one provided when it might not be appropriate. (I need to go back and check some of my tests, for instance, I'm sure I didn't always think about the numpy atol parameter carefully!)
Agreed -- termination criteria for numerical methods are not the use case for a generic is_close implementation. That clearly requires use-case-specific criteria.

OK, I FINALLY got a chance to look at Steven's code in the statistics module tests. Not much code there, this really isn't that big a deal. It does check for NaN, and inf and all that, so that's good. It is also symmetric with respect to x and y -- using the maximum of the two to compute the relative error -- I think that's good. (This is essentially the same as Boost's "strong" method -- though implemented a tiny bit differently.)

Here is the key definition:

    def approx_equal(x, y, tol=1e-12, rel=1e-7):
        ...
        x is approximately equal to y if the difference between them is less than
        an absolute error tol or a relative error rel, whichever is bigger.
        ...

This is a lot like the numpy code, actually, except it does a max test, rather than adding the absolute and relative tolerances together. I think this is a better way to go than numpy's, but there is little practical difference.

However, it suffers from the same issue -- "tol" is essentially a minimum error that is considered acceptable. This is nice, as it will allow zero to be passed in, and if the other input is within tol of zero, it will be considered approximately equal. However, for very small numbers (less than the absolute tolerance), they will always be considered approximately equal:

    In [18]: approx_equal(1.0e-14, 2.0e-14)
    Out[18]: True      # off by a factor of 2

    In [19]: approx_equal(1.0e-20, 2.0e-25)
    Out[19]: True      # oops! way off!

This is with the defaults of course, and all you need to do is set the tol much lower:

    In [20]: approx_equal(1.0e-20, 2.0e-25, tol=1e-25)
    Out[20]: False

This is less fatal than with numpy, as with numpy you are processing a whole array of numbers with the same tolerances, and they may not be all of the same magnitude. But I think it's a trap for users.

My proposal: Allow either an absolute or relative tolerance, but not try to do both in one call.

or

If you really want the ability to do both at once (i.e. set a minimum for the zero case), then:
- make the default absolute tolerance zero -- fewer surprises that way
- document the absolute tolerance as a minimum error (difference), and specifically mention the zero case in the docs.

Otherwise, go with Steven's code, and put it in the math module.

Also -- there was some talk of what to do with complex -- I say two complex numbers are approx_equal if approx_equal(z1.real, z2.real) and approx_equal(z1.imag, z2.imag) -- that is a more rigorous test than using the complex abs value of the difference.

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
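P.S. The componentwise complex test I mean is just this (a sketch, reusing whatever scalar comparison we settle on -- here a plain symmetric relative test):

def complex_approx_equal(z1, z2, rel_tol=1e-9):
    def close(a, b):
        return abs(a - b) <= rel_tol * max(abs(a), abs(b))
    # Componentwise is stricter: a tiny-but-wrong imaginary part can't hide
    # behind a large real part the way it can in abs(z1 - z2).
    return close(z1.real, z2.real) and close(z1.imag, z2.imag)

complex_approx_equal(1e6 + 1e-12j, 1e6 + 2e-12j)                   # False: imaginary parts differ by 2x
abs((1e6 + 1e-12j) - (1e6 + 2e-12j)) <= 1e-9 * abs(1e6 + 2e-12j)   # True via the abs() approach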
participants (14)
- Alan Cristhian Ruiz
- Andrew Barnert
- Antoine Pitrou
- Chris Barker
- Chris Barker - NOAA Federal
- Ethan Furman
- Guido van Rossum
- Mark Dickinson
- Nathaniel Smith
- Neil Girdhar
- Ron Adam
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy