On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers wrote:
It is partly a plea for some development of numerically accurate functions for computing lat/lon grids from a combination of inputs: bounds, counts, and resolutions.

Can you be more specific about what problems you've run into -- I work with lat-lon grids all the time, and have never had a problem.

float32 degrees gives you about 1 meter accuracy or better, so I can see how losing a few digits might be an issue, though I would argue that you maybe shouldn't use float32 if you are worried about anything close to 1m accuracy... -- or shift to a relative coordinate system of some sort.

I have been playing around with the decimal package a bit lately,

sigh. decimal is so often looked at a solution to a problem it isn't designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype :-)

what you get from decimal is variable precision -- maybe a binary variable precision lib is a better answer -- that would be a good thing to have easy access to in numpy, but in this case, if you want better accuracy in a computation that will end up in float32, just use float64.

and I discovered the concept of "fused multiply-add" operations for improved accuracy. I have come to realize that fma operations could be used to greatly improve the accuracy of linspace() and arange().

arange() is problematic for non-integer use anyway, by its very definition (getting the "end point" correct requires the right step, even without FP error).

and would it really help with linspace? it's computing a delta with one division in fp, then multiplying it by an integer (represented in fp -- why? why not keep that an integer till the multiply?).

In particular, I have been needing improved results for computing latitude/longitude grids, which tend to be done in float32's to save memory (at least, this is true in data I come across).

If you care about saving memory *and* accuracy, wouldn't it make more sense to do your computations in float64, and convert to float32 at the end?

that does seem to be the easy option :-)

Now, to the crux of my problem. It is next to impossible to generate a non-trivial numpy array of coordinates, even in double precision, without hitting significant numerical errors.

I'm confused, the example you posted doesn't have significant errors...

Which has lead me down the path of using the decimal package (which doesn't play very nicely with numpy because of the lack of casting rules for it). Consider the following:
```
\$ cat test_fma.py
from __future__ import print_function
import numpy as np
res = np.float32(0.01)
cnt = 7001
x0 = np.float32(-115.0)
x1 = res * cnt + x0
print("res * cnt + x0 = %.16f" % x1)
x = np.arange(-115.0, -44.99 + (res / 2), 0.01, dtype='float32')
print("len(arange()): %d  arange()[-1]: %16f" % (len(x), x[-1]))
x = np.linspace(-115.0, -44.99, cnt, dtype='float32')
print("linspace()[-1]: %.16f" % x[-1])

\$ python test_fma.py
res * cnt + x0 = -44.9900015648454428
len(arange()): 7002  arange()[-1]:       -44.975044
linspace()[-1]: -44.9900016784667969
```
arange just produces silly results (puts out an extra element... adding half of the resolution is typically mentioned as a solution on mailing lists to get around arange()'s limitations -- I personally don't do this).

The real solution is "don't do that" arange is not the right tool for the job.

Then there is this:

res * cnt + x0 = -44.9900015648454428
linspace()[-1]: -44.9900016784667969

that's as good as you are ever going to get with 32 bit floats...

Though I just noticed something about your numbers -- there should be a nice even base ten delta if you have 7001 gaps -- but linspace produces N points, not N gaps -- so maybe you want:

In [17]: l = np.linspace(-115.0, -44.99, 7002)

In [18]: l[:5]

Out[18]: array([-115.  , -114.99, -114.98, -114.97, -114.96])

In [19]: l[-5:]

Out[19]: array([-45.03, -45.02, -45.01, -45.  , -44.99])

or, in float32 -- not as pretty:

In [20]: l = np.linspace(-115.0, -44.99, 7002, dtype=np.float32)

In [21]: l[:5]

Out[21]:

array([-115.        , -114.98999786, -114.98000336, -114.97000122,

-114.95999908], dtype=float32)

In [22]: l[-5:]

Out[22]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)

but still as good as you get with float32, and exactly the same result as computing in float64 and converting:

In [25]: l = np.linspace(-115.0, -44.99, 7002).astype(np.float32)

In [26]: l[:5]

Out[26]:

array([-115.        , -114.98999786, -114.98000336, -114.97000122,

-114.95999908], dtype=float32)

In [27]: l[-5:]

Out[27]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)

So, does it make any sense to improve arange by utilizing fma() under the hood?

no -- this is simply not the right use-case for arange() anyway.

Also, any plans for making fma() available as a ufunc?

could be nice -- especially if used internally.

Notice that most of my examples required knowing the number of grid points ahead of time. But what if I didn't know that? What if I just have the bounds and the resolution? Then arange() is the natural fit, but as I showed, its accuracy is lacking, and you have to do some sort of hack to do a closed interval.

no -- it's not -- if you have the bounds and the resolution, you have an over-specified problem. That is:

x_min + (n * delta_x) == x_max

If there is ANY error in either delta_x or x_max (or x_min), then you'll get a missmatch. which is why arange is not the answer (you can make the algorithm a bit more accurate, I suppose but there is still fp limited precision -- if you can't exactly represent either delta_x or x_max, then you CAN'T use the arange() definition and expect to work consistently.

The "right" way to do it is to compute N with: round((x_max - x_min) / delta), and then use linspace:

linspace(x_min, x_max, N+1)

(note that it's too bad you need to do N+1 -- if I had to do it over again, I'd use N as the number of "gaps" rather than the number of points -- that's more commonly what people want, if they care at all)

This way, you get a grid with the endpoints as exact as they can be, and the deltas as close to each-other as they can be as well.

maybe you can do a better algorithm in linspace to save an ULP, but it's hard to imagine when that would matter.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov