On Fri, Feb 9, 2018 at 1:16 PM, Matthew Harrigan <harrigan.matthew@gmail.com> wrote:
I apologize if I'm missing something basic, but why are floats being accumulated in the first place?  Can't arange and linspace operations with floats be done internally similar to `start + np.arange(num_steps) * step_size`?  I.e. always accumulate (really increment) integers to limit errors.

I haven't looked at the arange() code, but linspace does does not accumulate floats -- which is why it's already almost as good as it can be. As regards to a fused-multiply-add, it does have to do a single multiply_add operation for each value (as per your example code), so we may be able to save a ULP there. 

The problem with arange() is that the definition is poorly specified:

start + (step_num * step) while value < stop.

Even without fp issues, it's weird if (stop - start) / step is not an integer. -- the "final" step will not be the same as the rest.

Say you want a "grid" with fully integer values. if the step is just right, all is easy:

In [72]: np.arange(0, 11, 2)

Out[72]: array([ 0,  2,  4,  6,  8, 10])


(this is assuming you want 10 as the end point.

but then:

In [73]: np.arange(0, 11, 3)

Out[73]: array([0, 3, 6, 9])


but I wanted 10 as an end point. so:

In [74]: np.arange(0, 13, 3)

Out[74]: array([ 0,  3,  6,  9, 12])


hmm, that's not right either. Of course it's not -- you can't get 10 as an end point, 'cause it's not a multiple of the step. With integers, you CAN require that the end point be a multiple of the step, but with fp, you can't required that it be EXACTLY a multiple, because either the end point or the step may not be exactly representable, even if you do the math with no loss of precision. And now you are stuck with the user figuring out for themselves whether the closest fp representation of the end point is slightly larger or smaller than the real value, so the < check will work. NOT good.

This is why arange is simply not the tool to use.

Making a grid, you usually want to specify the end points and the number of steps which is almost what linspace does. Or, _maybe_ you want to specify the step and the number of steps, and accept that the end point may not be exactly what you "expect". There is no built-in function for this in numpy. maybe there should be, but it's pretty easy to write, as you show above.

Anyone that understands FP better than I do:

In the above code, you are multiplying the step by an integer -- is there any precision loss when you do that??

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov