Importance of order when summing values in an array
Hi All, I have encountered a puzzling issue and I am not certain if this is a mistake of my own doing or not. Would someone kindly just look over this issue to make sure I'm not doing something very silly. So, why would the sum of an array have a different value depending on the order I select the indices of the array?
vector[[39, 46, 49, 50, 6, 9, 12, 14, 15, 17, 21]].sum()
8933281.8757099733
vector[[6, 9, 12, 14, 15, 17, 21, 39, 46, 49, 50]].sum()
8933281.8757099714
sum(vector[[39, 46, 49, 50, 6, 9, 12, 14, 15, 17, 21]])
8933281.8757099733
sum(vector[[6, 9, 12, 14, 15, 17, 21, 39, 46, 49, 50]])
8933281.8757099714
Any thoughts? Cheers, Hanni
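A minimal sketch of why the order can matter, in plain Python rather than NumPy: floating-point addition is not associative, so adding the same values in a different order can change the last bits of the result. The values below are made up for illustration (they are not Hanni's data), and `naive_sum` is a hypothetical helper; an explicit loop is used because the built-in `sum` in recent CPython releases may apply compensated summation.

```python
import math


def naive_sum(values):
    """Add the values left to right in ordinary float64 arithmetic."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [1e16, 1.0, -1e16, 1.0]      # illustrative values, not from the thread
reordered = [1.0, 1.0, 1e16, -1e16]   # the same values in a different order

print(naive_sum(values))      # the first 1.0 is absorbed when added to 1e16
print(naive_sum(reordered))   # both 1.0s are combined before meeting 1e16
print(math.fsum(values))      # correctly rounded sum, independent of order
```

`math.fsum` tracks the lost low-order bits, which is why it returns the same value for either ordering.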
The highest accuracy is obtained when you sum a series sorted in ascending order, and the lowest when it is sorted in descending order. In between you might get a variety of rounding errors.
Nadav.
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Hanni Ali
Sent: Tue 09-Dec-08 16:07
To: Discussion of Numerical Python
Subject: [Numpy-discussion] Importance of order when summing values in an array
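The ordering effect can be sketched with a small experiment, assuming non-negative values so that "ascending" also means ascending in magnitude. The data here are invented for the example (one large term and many tiny ones), and `naive_sum` is a hypothetical left-to-right loop standing in for a plain accumulation.

```python
import math


def naive_sum(values):
    """Add the values left to right in ordinary float64 arithmetic."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [1.0] + [1e-16] * 100_000   # made-up data: one large term, many tiny ones
exact = math.fsum(values)            # correctly rounded reference value

ascending = naive_sum(sorted(values))                  # tiny terms accumulate first
descending = naive_sum(sorted(values, reverse=True))   # tiny terms are absorbed one by one

print(abs(ascending - exact))
print(abs(descending - exact))
```

In the descending case each tiny term is smaller than half the spacing between floats near 1.0, so it is rounded away before it can contribute.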
Thank you Nadav.
Also, try increasing the numerical precision, as the result may depend on your platform, especially given that the input values above are ints. NumPy has float128 and int64, which will help minimize rounding error. Bruce
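One way to try this suggestion is to ask NumPy to accumulate in a wider type through the `dtype` argument of `sum`. This is only a sketch: the `vector` below is random stand-in data (the real array is not shown in the thread), and `np.longdouble` is used instead of `float128` because that name is not defined on every platform. On some platforms `np.longdouble` is no wider than float64, in which case nothing is gained.

```python
import numpy as np

rng = np.random.default_rng(0)
vector = rng.uniform(0.0, 1e6, size=51)   # hypothetical stand-in for the real data

idx_a = [39, 46, 49, 50, 6, 9, 12, 14, 15, 17, 21]
idx_b = sorted(idx_a)                     # same indices, ascending order

# Plain float64 accumulation: the two orders may differ in the last bits.
sum_a = vector[idx_a].sum()
sum_b = vector[idx_b].sum()

# Accumulate in the platform's widest float type instead.
wide_a = vector[idx_a].sum(dtype=np.longdouble)
wide_b = vector[idx_b].sum(dtype=np.longdouble)

print(sum_a - sum_b)
print(wide_a - wide_b)
```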
Hi Bruce,
Ahh, but I would have thought the precision for the array operation would be the same no matter which values I wish to sum? The array is in float64 in all cases.
I would not have thought altering the type of the integer values would make any difference, as these indices are all below 5 million.
Perhaps I have misunderstood your suggestion; could you expand?
Cheers,
Hanni
Hi,
The main issue is the number of significant digits that you have, which in your case is not the same as the number of decimals. So while the numerical difference in the results is on the order of 1.86e-09, the actual difference starts at the 16th significant place. This is expected given the number of significant digits of a 64-bit float (15-16). With higher precision like float128 you should get about 34 significant digits, provided that precision is maintained in all steps (i.e., the numbers must be stored as float128 and the summations done in float128 precision).
Note there is a secondary issue of converting numbers between different types, as well as the binary representation of decimal numbers.
Also, rather than just simple summing, there are alternative algorithms, like the Kahan summation algorithm, that can minimize errors.
Bruce
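Both points can be sketched in plain Python. The first part checks that the two sums reported at the top of the thread differ by exactly one unit in the last place of a float64 at that magnitude (`math.ulp` needs Python 3.9 or later). The second part is a textbook Kahan summation; `kahan_sum` and `naive_sum` are hypothetical helper names, the test data are invented, and this is not how NumPy's own `sum` is implemented.

```python
import math

# The two results from the original post.
a = 8933281.8757099733
b = 8933281.8757099714
print(a - b, math.ulp(a))   # the gap between the results vs. the float64 spacing at a


def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan (compensated) summation: carry the rounding error forward."""
    total = 0.0
    compensation = 0.0                   # running estimate of the lost low-order bits
    for v in values:
        y = v - compensation             # correct the incoming term
        t = total + y                    # low-order bits of y may be lost here
        compensation = (t - total) - y   # recover what was lost
        total = t
    return total


values = [1.0] + [1e-16] * 100_000       # made-up data that defeats naive summation
exact = math.fsum(values)
print(abs(naive_sum(values) - exact))
print(abs(kahan_sum(values) - exact))
```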
As far as I know, float128 values are in fact 80 bits (64 mantissa + 16 exponent), so the precision is 18-19 digits (not 34).
Nadav.
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Bruce Southey
Sent: Tue 09-Dec-08 17:46
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Importance of order when summing values in an array
On Tue, Dec 9, 2008 at 09:51, Nadav Horesh <nadavh@visionsense.com> wrote:
As much as I know float128 are in fact 80 bits (64 mantissa + 16 exponent) so the precision is 18-19 digits (not 34)
float128 should be 128 bits wide. If it's not on your platform, please let us know as that is a bug in your build. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Tue, Dec 9, 2008 at 1:40 PM, Robert Kern <robert.kern@gmail.com> wrote:
float128 should be 128 bits wide. If it's not on your platform, please let us know as that is a bug in your build.
I think he means the actual precision is the IEEE extended precision; the number just happens to be stored in larger chunks of memory for alignment purposes. Chuck
On Tue, Dec 9, 2008 at 21:01, Charles R Harris <charlesr.harris@gmail.com> wrote:
I think he means the actual precision is the ieee extended precision, the number just happens to be stored into larger chunks of memory for alignment purposes.
Ah, that's good to know. Yes, float128 on my Intel Mac behaves this way.
In [12]: f = finfo(float128)
In [13]: f.nmant
Out[13]: 63
In [14]: f.nexp
Out[14]: 15
-- Robert Kern
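The same check can be written so that it runs on any platform, assuming `np.longdouble` as a stand-in for `float128` (which is not defined everywhere). The printed numbers depend on the platform and build, so none are shown here.

```python
import numpy as np

info = np.finfo(np.longdouble)
storage_bits = np.dtype(np.longdouble).itemsize * 8   # bits of memory per value

print("storage bits:  ", storage_bits)
print("mantissa bits: ", info.nmant)      # explicitly stored fraction bits
print("exponent bits: ", info.nexp)
print("decimal digits:", info.precision)  # approximate reliable decimal digits
```

Comparing the storage width with the mantissa and exponent widths shows whether some of the stored bytes are padding.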
On Tue, Dec 9, 2008 at 8:10 PM, Robert Kern <robert.kern@gmail.com> wrote:
Ah, that's good to know. Yes, float128 on my Intel Mac behaves this way.
In [12]: f = finfo(float128)
In [13]: f.nmant
Out[13]: 63
In [14]: f.nexp
Out[14]: 15
Yep. That's the reason I worry a bit about what will happen when IEEE quad precision comes out; it really is 128 bits wide, and the normal identifiers won't account for the difference. I expect C will just call them long doubles, and they will get the 'g' letter code just like extended precision does now. Chuck
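A small sketch of the letter codes mentioned here: in NumPy the character code 'g' names the C `long double` type whatever its actual width, while 'd' names the C `double`. The item size printed for 'g' varies by platform.

```python
import numpy as np

long_double = np.dtype("g")   # type code for C long double
double = np.dtype("d")        # type code for C double

print(long_double, long_double.char, long_double.itemsize)
print(double, double.char, double.itemsize)
```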
On my two systems with Intel Core2 Duo, finfo(float128) gives me the error "NameError: name 'float128' is not defined". Why?
Thanks,
Frank
On Wed, Dec 10, 2008 at 11:00 AM, frank wang <f.yw@hotmail.com> wrote:
On my two systems with Intel Core2 Duo, finfo(float128) gives me the error "NameError: name 'float128' is not defined". Why?
You are probably running a 32-bit OS. IEEE extended precision is 80 bits. On 32-bit systems it fits in three 32-bit words and shows up as float96. On 64-bit systems it fits in two 64-bit words and shows up as float128. Chuck
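To avoid the NameError, one option is to test which extended-precision names a given NumPy build defines before using them. This is a sketch: `available` and `storage_bits` are illustrative variable names, and on some platforms neither `float96` nor `float128` exists, in which case `np.longdouble` is the portable spelling.

```python
import numpy as np

# Names that may or may not be defined, depending on platform and build.
available = [name for name in ("float96", "float128") if hasattr(np, name)]

# np.longdouble is always defined; its storage width varies by platform.
storage_bits = np.dtype(np.longdouble).itemsize * 8

print("extended-precision names defined:", available)
print("np.longdouble storage width in bits:", storage_bits)
```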
On Wed, Dec 10, 2008 at 12:07, Charles R Harris <charlesr.harris@gmail.com> wrote:
You probably run a 32 bit OS. IEEE extended precision is 80 bits. On 32 bit systems it fits in three 32 bit words and shows up as float96. On 64 bit systems it fits in two 64 bit words and shows up as float128.
I'm running a 32-bit OS (well, a 32-bit build of Python on OS X) on an Intel Core2 Duo, and I get a float128. -- Robert Kern
On Wed, Dec 10, 2008 at 11:58 AM, Robert Kern <robert.kern@gmail.com> wrote:
I'm running a 32-bit OS (well, a 32-bit build of Python on OS X) on an Intel Core2 Duo, and I get a float128.
Curious. It probably has something to do with the way the FPU is set up when running on a 64-bit system, which is independent of how Python is compiled. Chuck
float128 values are 16 bytes wide but have the structure of x87 80-bit numbers plus an extra 6 bytes for alignment:
From "http://lwn.net/2001/features/OLS/pdf/pdf/x86-64.pdf": "... The x87 stack with 80-bit precision is only used for long double."
And:
e47 = float128(1e-47)
e30 = float128(1e-30)
e50 = float128(1e-50)
(e30-e50) == e30
True
(e30-e47) == e30
False
This shows that float128 has no more than 19 digits of precision.
Nadav.
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Robert Kern
Sent: Tue 09-Dec-08 22:40
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Importance of order when summing values in an array
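Nadav's experiment can be repeated in a portable form, assuming `np.longdouble` in place of `float128`. The outcome of the two comparisons depends on the platform's long double format (80-bit x87, plain float64, or true quad precision), so no expected output is shown; `np.finfo(...).eps` reports the relative spacing that decides it.

```python
import numpy as np

ld = np.longdouble
e30 = ld(1e-30)
e47 = ld(1e-47)   # 1e-17 relative to e30
e50 = ld(1e-50)   # 1e-20 relative to e30

# A True result means the subtracted term was too small to change e30.
print((e30 - e50) == e30)
print((e30 - e47) == e30)

# Relative spacing of the type: perturbations well below this are rounded away.
print(np.finfo(ld).eps)
```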
I found one solution that's pretty simple for easily writing and reading a numpy array to/from a file (see my original message below). Just use the method tolist().
e.g. a complex 2 x 2 array
arr=array([[1.0,3.0-7j],[55.2+4.0j,-95.34]])
ls=arr.tolist()
Then use the repr - eval pairing to write and later read the list from the file, and then convert the list that is read in back to an array:
ls_str=fp.readline()
ls_in= eval(ls_str)
arr_in=array(ls_in) # arr_in is same as arr
Seems to work well. Any comments?
-- Lou Pecora, my views are my own.
--- On Tue, 12/9/08, Lou Pecora wrote:
In looking for simple ways to read and write data (in a text readable format) to and from a file and later restoring the actual data when reading back in, I've found that numpy arrays don't seem to play well with repr and eval. E.g. to write some data (mixed types) to a file I can do this (fp is an open file),
thedata=[3.0,-4.9+2.0j,'another string']
repvars= repr(thedata)+"\n"
fp.write(repvars)
Then to read it back and restore the data each to its original type,
strvars= fp.readline()
sonofdata= eval(strvars)
which gives back the original data list.
BUT when I try this with numpy arrays in the data list I find that repr of an array adds extra end-of-lines and that messes up the simple restoration of the data using eval. Am I missing something simple? I know I've seen people recommend ways to save arrays to files, but I'm wondering what is the most straightforward? I really like the simple, pythonic approach of the repr - eval pairing.
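A self-contained sketch of the tolist() round trip described above. Two things here are assumptions rather than part of Lou's message: an in-memory `io.StringIO` stands in for the open file `fp`, and `ast.literal_eval` is swapped in for `eval` because it only accepts Python literals and so is safer on file contents.

```python
import ast
import io

import numpy as np

arr = np.array([[1.0, 3.0 - 7j], [55.2 + 4.0j, -95.34]])

fp = io.StringIO()                        # in-memory stand-in for an open text file
fp.write(repr(arr.tolist()) + "\n")       # one line of nested Python lists

fp.seek(0)                                # rewind, as if reopening the file
ls_in = ast.literal_eval(fp.readline())   # parse the literal back into lists
arr_in = np.array(ls_in)

print(np.array_equal(arr, arr_in))
```

The round trip goes through nested lists, so the shape is recovered from the nesting and the dtype is inferred again by `np.array`.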
participants (7)
- Bruce Southey
- Charles R Harris
- frank wang
- Hanni Ali
- Lou Pecora
- Nadav Horesh
- Robert Kern