Hi there, One of our users just found a bug in numpy that has to do with casting. Consider the attached example. The difference at the end should be 0 (zero) everywhere. But it's not by default. Casting the data to 'float64' at reading and assiging to the arrays works Defining the arrays "at creation time" as "float32" works But simply reading in the data (flaot32) and putting them into the default arrays (float64) leads to differences! Thanks for looking into this, C.
2009/5/26 Charles سمير Doutriaux <doutriaux1@llnl.gov>:
Hi there,
One of our users just found a bug in numpy that has to do with casting.
Consider the attached example.
The difference at the end should be 0 (zero) everywhere.
But it's not by default.
Casting the data to 'float64' at reading and assiging to the arrays works Defining the arrays "at creation time" as "float32" works
But simply reading in the data (flaot32) and putting them into the default arrays (float64) leads to differences!
That is probably not a good characterization of the code. You are doing a summation of squares of floats two different ways. The data is stored in the file with dtype=float32. You are reading in the data a "row" at a time (with the name "tmp" for future reference). You are keeping an accumulator ("ex2" with dtype=float64) and storing each row into an array shaped like the whole data in the file ("a" with dtype=float64). You calculate the sum of squares one way by "ex2 = ex2 + power(tmp, 2.0)". In this case, the square is computed, then stuffed back into a float32 intermediate result, *then* added to "ex2". You calculate the sum of squares the other way by "numpy.sum(power(a, 2.0))". In this case, the square is computed in full float64 precision, then added up. The difference between the two results is just that of using a low precision intermediate for the square. You have fairly large inputs so you are losing a good number of digits by using a float32 intermediate before the summation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Thanks Robert, I thought it was something like that but couldn't figure it out. C. On May 26, 2009, at 4:50 PM, Robert Kern wrote:
2009/5/26 Charles سمير Doutriaux <doutriaux1@llnl.gov>:
Hi there,
One of our users just found a bug in numpy that has to do with casting.
Consider the attached example.
The difference at the end should be 0 (zero) everywhere.
But it's not by default.
Casting the data to 'float64' at reading and assiging to the arrays works Defining the arrays "at creation time" as "float32" works
But simply reading in the data (flaot32) and putting them into the default arrays (float64) leads to differences!
That is probably not a good characterization of the code. You are doing a summation of squares of floats two different ways. The data is stored in the file with dtype=float32. You are reading in the data a "row" at a time (with the name "tmp" for future reference). You are keeping an accumulator ("ex2" with dtype=float64) and storing each row into an array shaped like the whole data in the file ("a" with dtype=float64).
You calculate the sum of squares one way by "ex2 = ex2 + power(tmp, 2.0)". In this case, the square is computed, then stuffed back into a float32 intermediate result, *then* added to "ex2".
You calculate the sum of squares the other way by "numpy.sum(power(a, 2.0))". In this case, the square is computed in full float64 precision, then added up.
The difference between the two results is just that of using a low precision intermediate for the square. You have fairly large inputs so you are losing a good number of digits by using a float32 intermediate before the summation.
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http:// mail.scipy.org/mailman/listinfo/numpy-discussion
participants (2)
-
Charles سمير Doutriaux
-
Robert Kern