On Wed, Jan 14, 2015 at 1:08 AM, Ron Adam wrote:
Significant digits have more to do with errors of measurement (and estimates),
while ULPs are about the accuracy limits of the hardware/software's ability to
calculate.

right -- and relative tolerance is kinda-sorta like significant digits, i.e. a
relative tolerance of 1e-4 is like saying "the same to 4 (decimal) significant
digits".

Which brings up the issue with 0.0 -- how many significant digits does 0.0
have? Does 0.000001234 have the same significant digits as 0.0? It's not
really defined. Whereas it's fairly straightforward to say that 0.000001234
and 0.000001233 are the same to 3 significant digits, but differ in the
fourth. And note that:

In [46]: u, v = 0.000001234, 0.000001233
In [47]: err = abs(u-v)
In [48]: err
Out[48]: 9.999999999998634e-10

so the absolute error is less than 1e-9 -- pretty "small" -- but is that what
we generally want? No.

In [49]: err <= 1e-3*abs(u)
Out[49]: True

The error is less than a relative tolerance of 1e-3 (three sig figs), but:

In [50]: err <= 1e-4*abs(v)
Out[50]: False

greater than a relative tolerance of 1e-4 (not the same to four sig figs).

Again, this all works great when you are away from zero (and maybe need to
handle NaN and inf carefully too), but if we simply put this in the stdlib,
then folks may use it with a zero value and not get what they expect.

My thinking now: set a "zero_tolerance", which will default to the relative
tolerance but be user-settable. If one of the input values is zero, then use
zero_tolerance as an absolute tolerance; if not, then use the relative
tolerance. I think this would make for fewer surprises, and make it easier to
use the same function call for a wide range of values, some of which may be
zero.

What I haven't figured out yet is how (or if) to make sure the transition is
continuous -- do we only use zero_tol if one of the values is exactly zero,
or if one or both of the values is less than zero_tol? It seems that if you
say, for instance, that 1e-12 is "close" to zero, then 1e-12 should also be
"close" to any value less than 1e-12. But if 1e-12 is "close" to 1e-14 (for
example), then 1e-12 should probably be "close" to 1.00000000001 also, but it
wouldn't be, if we did an abrupt change to relative tolerance for any value
>= the zero_tolerance.

So more to think out here -- feel free to chime in. I've been playing with
this gist, though it's only a few real lines of code anyway:

https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a
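For concreteness, here is a minimal sketch of that zero_tolerance idea (my
illustration, not the code in the gist -- the names is_close_to, rel_tol and
zero_tol are just placeholders):

    def is_close_to(actual, expected, rel_tol=1e-8, zero_tol=None):
        """Relative-tolerance comparison that falls back to an absolute
        "zero_tolerance" when either input is exactly zero."""
        if zero_tol is None:
            zero_tol = rel_tol  # default: reuse the relative tolerance value
        if actual == 0.0 or expected == 0.0:
            # a relative tolerance is meaningless against zero, so use an
            # absolute tolerance instead
            return abs(actual - expected) <= zero_tol
        return abs(actual - expected) <= rel_tol * abs(expected)

With this version, is_close_to(1e-12, 0.0) is True under the default
zero_tol, but is_close_to(1e-12, 1e-14) falls through to the relative test
and is False -- exactly the discontinuity discussed above.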
Second, there's never any problem determining if two finite numbers are
equal or which one is greater. The issue is determining whether two numbers ± their error bars are too close to call vs. unambiguously greater or lesser. For example, if I get 1.0e1 and 1.1e1 (assuming 1 bit mantissa for ease of discussion), the latter is clearly greater--but if I have 2 ulp of error, or an absolute error of 1e1, or a relative error of 50%, the fact that the latter is greater is irrelevant--each value is within the other value's error range.
You would use the largest of the three, and possibly give a warning if the 2 ulp error is the largest (if the application is set to do so).
I'm presuming the 2 ulp is twice the limit of the floating point precision here.
50%     accuracy of the data
1e1     limit of significant digits/measurement
2 ulp   twice the floating point unit of least precision
But if I get 1.0e1 and 1.0e10 with
the same error, then I can say that the latter is unambiguously greater.
Yes, this was the point I was alluding to earlier.
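As a rough sketch of what "use the largest of the three" might look like
(purely illustrative -- the names, signature and the 53-bit-significand
assumption are mine, not anything proposed for the stdlib):

    import math

    def compare_with_error(a, b, rel_err=0.0, abs_err=0.0, ulp_err=0):
        # approximate one ulp at the magnitude of the larger input
        # (assumes an IEEE-754 double with a 53-bit significand)
        one_ulp = math.ldexp(1.0, math.frexp(max(abs(a), abs(b)))[1] - 53)
        # take the most pessimistic (largest) of the three error measures
        tol = max(rel_err * max(abs(a), abs(b)), abs_err, ulp_err * one_ulp)
        if abs(a - b) <= tol:
            return "too close to call"
        return "a > b" if a > b else "a < b"

With a 50% relative error, compare_with_error(1.0e1, 1.1e1, rel_err=0.5)
comes back "too close to call", while compare_with_error(1.0e1, 1.0e10,
rel_err=0.5) is unambiguously "a < b", matching the example above.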
I think Chris is right that ulp comparisons usually only come up in
testing an algorithm. You have to actually do an error analysis, and you have to have input data with a precision specified in ulp, or you're not going to get a tolerance in ulp. When you want to verify that you've correctly implemented an algorithm that guarantees to multiply input ulp by no more than 4, you can feed in numbers with ± 1 ulp error (just typing in decimal numbers does that) and verify that the results are within ± 4 ulp. But when you have real data, or inherent rounding issues, or an error analysis that's partly made up of rules of thumb and winging it, you're almost always going to end up with absolute or relative error instead. (Or, occasionally, something more complicated that you have to code up manually, like logarithmic relative error.)
If the algorithm doesn't track error accumulation, then yes.
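For that kind of test, the distance between two doubles in ulps can be
computed with the usual bit-twiddling trick -- this is just a sketch from
general knowledge, not any proposed API:

    import struct

    def _ordered_bits(x):
        # reinterpret a double's bits as an integer whose ordering matches
        # the ordering of the floats themselves (negatives are remapped so
        # the mapping is monotonic across zero)
        u = struct.unpack("<Q", struct.pack("<d", x))[0]
        return (1 << 64) - u if u >> 63 else u + (1 << 63)

    def ulp_diff(a, b):
        """Distance between two finite doubles in units of least precision."""
        return abs(_ordered_bits(a) - _ordered_bits(b))

A test of an algorithm that promises no more than 4 ulp of error could then
assert ulp_diff(result, expected) <= 4.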
This is interesting but I'm going to search for some examples of how to use some of this. I'm not sure I can add to the conversation much, but thanks for taking the time to explain some of it.
Cheers, Ron
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov