Can’t arange and linspace operations with floats be done internally

Yes, and they probably should be - they’re done this way as a hack because the api exposed for custom dtypes is here, (example implementation here) - essentially, you give it the first two elements of the array, and ask it to fill in the rest.


On Fri, 9 Feb 2018 at 13:17 Matthew Harrigan <harrigan.matthew@gmail.com> wrote:
I apologize if I'm missing something basic, but why are floats being accumulated in the first place?  Can't arange and linspace operations with floats be done internally similar to `start + np.arange(num_steps) * step_size`?  I.e. always accumulate (really increment) integers to limit errors.

On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root <ben.v.root@gmail.com> wrote:


On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
 It is partly a plea for some development of numerically accurate functions for computing lat/lon grids from a combination of inputs: bounds, counts, and resolutions.

Can you be more specific about what problems you've run into -- I work with lat-lon grids all the time, and have never had a problem.

float32 degrees gives you about 1 meter accuracy or better, so I can see how losing a few digits might be an issue, though I would argue that you maybe shouldn't use float32 if you are worried about anything close to 1m accuracy... -- or shift to a relative coordinate system of some sort.

The issue isn't so much the accuracy of the coordinates themselves. I am only worried about 1km resolution (which is approximately 0.01 degrees at mid-latitudes). My concern is with consistent *construction* of a coordinate grid with even spacing. As it stands right now. If I provide a corner coordinate, a resolution, and the number of pixels, the result is not terrible (indeed, this is the approach used by gdal/rasterio). If I have start/end coordinates and the number of pixels, the result is not bad, either (use linspace). But, if I have start/end coordinates and a resolution, then determining the number of pixels from that is actually tricky to get right in the general case, especially with float32 and large grids, and especially if the bounding box specified isn't exactly divisible by the resolution.
 

I have been playing around with the decimal package a bit lately,

sigh. decimal is so often looked at a solution to a problem it isn't designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype :-)

what you get from decimal is variable precision -- maybe a binary variable precision lib is a better answer -- that would be a good thing to have easy access to in numpy, but in this case, if you want better accuracy in a computation that will end up in float32, just use float64.

I am not concerned about computing distances or anything like that, I am trying to properly construct my grid. I need consistent results regardless of which way the grid is specified (start/end/count, start/res/count, start/end/res). I have found that loading up the grid specs (using in a config file or command-line) using the Decimal class allows me to exactly and consistently represent the grid specification, and gets me most of the way there. But the problems with arange() is frustrating, and I have to have extra logic to go around that and over to linspace() instead.
 

and I discovered the concept of "fused multiply-add" operations for improved accuracy. I have come to realize that fma operations could be used to greatly improve the accuracy of linspace() and arange().

arange() is problematic for non-integer use anyway, by its very definition (getting the "end point" correct requires the right step, even without FP error).

and would it really help with linspace? it's computing a delta with one division in fp, then multiplying it by an integer (represented in fp -- why? why not keep that an integer till the multiply?).

Sorry, that was a left-over from a previous draft of my email after I discovered that linspace's accuracy was on par with fma(). And while arange() has inherent problems, it can still be made better than it is now. In fact, I haven't investigated this, but I did recently discover some unit tests of mine started to fail after a numpy upgrade, and traced it back to a reduction in the accuracy of a usage of arange() with float32s. So, something got worse at some point, which means we could still get accuracy back if we can figure out what changed.
 

In particular, I have been needing improved results for computing latitude/longitude grids, which tend to be done in float32's to save memory (at least, this is true in data I come across).

If you care about saving memory *and* accuracy, wouldn't it make more sense to do your computations in float64, and convert to float32 at the end?

that does seem to be the easy option :-)

Kinda missing the point, isn't it? Isn't that like saying "convert all your data to float64s prior to calling np.mean()"? That's ridiculous. Instead, we made np.mean() upcast the inner-loop operation, and even allow an option to specify what the dtype that should be used for the aggregator.
 
 
Now, to the crux of my problem. It is next to impossible to generate a non-trivial numpy array of coordinates, even in double precision, without hitting significant numerical errors.

I'm confused, the example you posted doesn't have significant errors...

Hmm, "errors" was the wrong word. "Differences between methods" might be more along the lines of what I was thinking. Remember, I am looking for consistency.
 
 
Which has lead me down the path of using the decimal package (which doesn't play very nicely with numpy because of the lack of casting rules for it). Consider the following:
```
$ cat test_fma.py
from __future__ import print_function
import numpy as np
res = np.float32(0.01)
cnt = 7001
x0 = np.float32(-115.0)
x1 = res * cnt + x0
print("res * cnt + x0 = %.16f" % x1)
x = np.arange(-115.0, -44.99 + (res / 2), 0.01, dtype='float32')
print("len(arange()): %d  arange()[-1]: %16f" % (len(x), x[-1]))
x = np.linspace(-115.0, -44.99, cnt, dtype='float32')
print("linspace()[-1]: %.16f" % x[-1])

$ python test_fma.py
res * cnt + x0 = -44.9900015648454428
len(arange()): 7002  arange()[-1]:       -44.975044
linspace()[-1]: -44.9900016784667969
```
arange just produces silly results (puts out an extra element... adding half of the resolution is typically mentioned as a solution on mailing lists to get around arange()'s limitations -- I personally don't do this).

The real solution is "don't do that" arange is not the right tool for the job.

Well, it isn't the right tool because as far as I am concerned, it is useless for anything but integers. Why not fix it to be more suitable for floating point?
 

Then there is this:

res * cnt + x0 = -44.9900015648454428
linspace()[-1]: -44.9900016784667969

that's as good as you are ever going to get with 32 bit floats...

Consistency is the key thing. I am fine with one of those values, so long as that value is what happens no matter which way I specify my grid.
 

Though I just noticed something about your numbers -- there should be a nice even base ten delta if you have 7001 gaps -- but linspace produces N points, not N gaps -- so maybe you want:


In [17]: l = np.linspace(-115.0, -44.99, 7002)


In [18]: l[:5]

Out[18]: array([-115.  , -114.99, -114.98, -114.97, -114.96])


In [19]: l[-5:]

Out[19]: array([-45.03, -45.02, -45.01, -45.  , -44.99])


or, in float32 -- not as pretty:


In [20]: l = np.linspace(-115.0, -44.99, 7002, dtype=np.float32)


In [21]: l[:5]

Out[21]: 

array([-115.        , -114.98999786, -114.98000336, -114.97000122,

       -114.95999908], dtype=float32)


In [22]: l[-5:]

Out[22]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)


but still as good as you get with float32, and exactly the same result as computing in float64 and converting:



In [25]: l = np.linspace(-115.0, -44.99, 7002).astype(np.float32)


In [26]: l[:5]

Out[26]: 

array([-115.        , -114.98999786, -114.98000336, -114.97000122,

       -114.95999908], dtype=float32)


In [27]: l[-5:]

Out[27]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)


Argh! I got myself mixed up between specifying pixel corners versus pixel centers. rasterio has been messing me up on this.
 



So, does it make any sense to improve arange by utilizing fma() under the hood?

no -- this is simply not the right use-case for arange() anyway.

arange() has accuracy problems, so why not fix it?

>>> l4 = np.arange(-115, -44.99, 0.01, dtype=np.float32)
>>> np.median(np.diff(l4))
0.0099945068
>>> np.float32(0.01)
0.0099999998

There is something significantly wrong here if arange(), which takes a resolution parameter, can't seem to produce a sequence with the proper delta.

 
 
Also, any plans for making fma() available as a ufunc?

could be nice -- especially if used internally.
 
Notice that most of my examples required knowing the number of grid points ahead of time. But what if I didn't know that? What if I just have the bounds and the resolution? Then arange() is the natural fit, but as I showed, its accuracy is lacking, and you have to do some sort of hack to do a closed interval.

no -- it's not -- if you have the bounds and the resolution, you have an over-specified problem. That is:

x_min + (n * delta_x) == x_max

If there is ANY error in either delta_x or x_max (or x_min), then you'll get a missmatch. which is why arange is not the answer (you can make the algorithm a bit more accurate, I suppose but there is still fp limited precision -- if you can't exactly represent either delta_x or x_max, then you CAN'T use the arange() definition and expect to work consistently.

The "right" way to do it is to compute N with: round((x_max - x_min) / delta), and then use linspace:

linspace(x_min, x_max, N+1)

(note that it's too bad you need to do N+1 -- if I had to do it over again, I'd use N as the number of "gaps" rather than the number of points -- that's more commonly what people want, if they care at all)

This way, you get a grid with the endpoints as exact as they can be, and the deltas as close to each-other as they can be as well.

maybe you can do a better algorithm in linspace to save an ULP, but it's hard to imagine when that would matter.

Yes, it is overspecified. My problem is that different tools require different specs (ahem... rasterio/gdal), and I have gird specs coming from other sources. And I need to produce data onto the same grid so that tools like xarray won't yell at me when I am trying to do an operation between gridded data that should have the same coordinates, but are off slightly because they were computed differently for whatever reason.

I guess I am crying out for some sort of tool that will help the community stop making the same mistakes. A one-stop shop that'll allow us to specify a grid in a few different ways and still produce the right thing, and even do the inverse... provide a coordinate array and get grids specs in whatever form we want. Maybe even have options for dealing with pixel corner vs. pixel centers, too? There are additional fun problems such as padding out coordinate arrays, which np.pad doesn't really do a great job with.

Cheers!
Ben Root

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion