
A slightly longer comment, although I have to admit that it partially misses the discussion here. Still, I hope it provides some background to your question.

Generating equidistantly spaced grids is simply not always possible. The reason is that the absolute spacing of the representable floating point numbers depends on their magnitude [1]. The true, equidistant grid may not be representable in the precision that is used. Consequently, the spacing between consecutive points in a grid created with linspace can vary:

    >>> import numpy as np
    >>> x = np.linspace(-50000.0, 50000.0, 1000001)  # dx = 0.1
    >>> np.min(np.diff(x))
    0.099999999991268851
    >>> np.max(np.diff(x))
    0.10000000000582077
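To make the magnitude dependence concrete: np.spacing() returns the distance from a number to the next representable one, and this distance grows with the magnitude. A quick illustration of my own (exact repr formatting may differ between NumPy versions):

    >>> np.spacing(1.0)       # one ulp at magnitude 1
    2.220446049250313e-16
    >>> np.spacing(50000.0)   # one ulp at the boundary of the grid above
    7.275957614183426e-12

Note that the deviation seen above, 0.10000000000582077 - 0.1 ~= 5.8e-12, is of the same order as one ulp at the grid boundary.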
This does not necessarily mean that the calculation of the grid points is off by more than one ulp. It simply shows that the absolute value of one ulp changes over the extent of the grid. It follows that even if the grid is calculated with higher precision (e.g. with fma operations) or even with arbitrary precision, we hit a fundamental boundary: the finite precision floating point numbers closest to the "true" grid are simply not equidistantly spaced.

My estimate is that the grid spacing is at most off by ~N ulp. This should be far too small to have any consequences in calculations based on the grid points themselves. However, if you write code that reconstructs the grid spacing from consecutive grid points, e.g. for scaling the value of an integral, you may notice it. I did when writing a unit test for Fourier transforms where I compared to analytical results.

Take-home message: use dx = (x[-1] - x[0]) / (N - 1) instead of dx = x[1] - x[0].

If you - for some reason - want the same grid spacing everywhere, you may choose an appropriate new spacing. What generally seems to work (but not always) is one that fits the floating point spacing at the largest number in the grid. In other words, add the largest number in your grid to the grid spacing and subtract it again:

    >>> dx = 0.1
    >>> N = 1000001
    >>> x0 = -50000.0
    >>> # "round" dx, assuming that the maximal magnitude of the grid is ~N*dx
    >>> dx = (dx + N * dx) - (N * dx)
    >>> dx
    0.10000000000582077
    >>> x = x0 + np.arange(N) * dx
    >>> np.max(np.diff(x)) - np.min(np.diff(x))
    0.0
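To see the take-home message in action on the linspace grid from the beginning (a minimal sketch; dx_local and dx_global are just names I made up for illustration):

    >>> x = np.linspace(-50000.0, 50000.0, 1000001)
    >>> N = len(x)
    >>> dx_local = x[1] - x[0]                # may be off from 0.1 by ~one ulp at the boundary
    >>> dx_global = (x[-1] - x[0]) / (N - 1)  # linspace places both endpoints exactly
    >>> dx_global
    0.1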
Curiously, either by design or by accident, arange() seems to do something similar, as was mentioned by Eric. It creates a new grid spacing by adding and subtracting the starting point of the grid. This often has a similar effect as adding and subtracting N*dx (e.g. if the grid is symmetric around 0.0). Consequently, arange() seems to trade keeping the grid spacing constant for a larger error in the grid size and, consequently, in the end point.

The above discussion somewhat ignored additional errors in the calculation of the grid points. If every grid point is the result of one multiplication and one addition, they should stay within one ulp (for x[i] != 0.0). If the grid is calculated by accumulation, they may be larger.

If you are concerned about how arange() or linspace() calculates the grid, I would recommend simply using

    x = x0 + np.arange(N) * dx

which will yield an accurate representation of the grid you most probably want (within 1 ulp for x[i] != 0.0). If you do not have x0, N, and dx, try to infer them from other information on the grid.

To sum this all up: even if your grid calculation is free of round-off error, the resulting grid of finite precision floating point numbers will only be exactly equidistant for appropriate grid spacings dx and starting values x0. On the other hand, neither this effect nor round-off in the grid calculation should ever play a significant role in your calculations, as the resulting errors are too small.

Some additional comments:

1. Comparison to calculations with decimal can be difficult, as not all simple decimal step sizes are exactly representable as finite binary floating point numbers. E.g., in single precision, 0.100000001490116119384765625 is the number closest to 0.1. To make a good comparison you would have to compare the output of arange() to a grid calculated with infinite precision whose grid spacing is the single precision representation of 0.1, not its decimal value.

2. Calculating the difference in single precision in your C code as below

       float actual = -44.990000000000000;
       float err1 = fabs(x1 - actual);
       float err2 = fabs(x2 - actual);

   yields

       $ ./test_fma
       x1   -44.9899978637695312
       x2   -44.9900016784667969
       err1 3.81470e-06
       err2 0.00000e+00

   If you calculate the ulp error, you notice that the non-FMA solution is off by exactly one ulp, while the FMA solution yields the single precision float closest to the actual solution (error = 0 ulp); a small Python sketch of this computation follows after the references. Losing one ulp should not matter for any application. If it does, use a data type with higher precision, e.g. double.

3. I wrote a Python script that compares grids generated in different ways with a grid calculated with mpmath in high precision [2]. Maybe it is useful for further insights.

Cheers
Nils

PS: I realize that I went a little off-topic here and apologize for that. However, a while ago I spent some time trying to figure this issue out and thought that this may be an occasion to write some of it down.

[1] http://www.exploringbinary.com/the-spacing-of-binary-floating-point-numbers/
[2] https://ncbecker.de/linear_grid.py
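PPS: Here is the small Python sketch of the ulp computation mentioned in comment 2. It is only an illustration, not part of the original C program; the hard-coded values are the ones printed by ./test_fma above, and np.spacing() gives the size of one single precision ulp at that magnitude:

    >>> import numpy as np
    >>> actual = np.float32(-44.99)            # closest single to the exact result
    >>> x1 = np.float32(-44.9899978637695312)  # non-FMA result from the C code
    >>> x2 = np.float32(-44.9900016784667969)  # FMA result from the C code
    >>> one_ulp = np.abs(np.spacing(actual))   # absolute size of one ulp at this magnitude
    >>> print(np.abs(x1 - actual) / one_ulp)   # off by exactly one ulp
    1.0
    >>> print(np.abs(x2 - actual) / one_ulp)   # correctly rounded result
    0.0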