Summation of large float32/float64 arrays
Hi all,
I have a possibly naive question. I don't really understand this particular set of output:
In [1]: import numpy

In [2]: a1 = numpy.random.random((512,512,512)).astype("float32")

In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0)
Out[3]: 67110312.0

In [4]: a1.sum()
Out[4]: 16777216.0
I recognize that the intermediate sums may accumulate error differently than a single call to .sum(), but I guess my concern is that it's accumulating a lot faster than I anticipated. (Interesting to note that a1.sum() returns 0.5*512^3, down to the decimal; is it summing up the mean, which should be ~0.5?) However, with a 256^3 array:
In [1]: import numpy

In [2]: a1 = numpy.random.random((256,256,256)).astype("float32")

In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0)
Out[3]: 8389703.0

In [4]: a1.sum()
Out[4]: 8389245.0
The errors are much more reasonable. Is there an overflow or something that occurs with the 512^3 case? These problems go away completely with a float64 array, but the issue originally showed up when trying to normalize an on-disk float32 array of size 512^3, where the normalization factor was off by a substantial amount (>2x) depending on the mechanism used to sum. My suspicion is that I have a naive misconception about intermediate steps in summations, or that there is a subtlety I'm missing here.
I placed a sample script I used to test this here:
Thanks for any insight anybody can provide,
Matt
On Fri, May 21, 2010 at 15:13, Matthew Turk matthewturk@gmail.com wrote:
The errors are much more reasonable. Is there an overflow or something that occurs with the 512^3?
It's not quite an overflow.
In [1]: from numpy import *

In [2]: x = float32(16777216.0)

In [3]: x + float32(0.9)
Out[3]: 16777216.0
You are accumulating your result in a float32. With the a.sum() approach, you eventually hit a level where the next number to add is always less than the relative epsilon of float32 precision. So the result doesn't change. And will never change again as long as you only add one number at a time. Summing along the other axes creates smaller intermediate sums such that you are usually adding together numbers roughly in the same regime as each other, so you don't lose as much precision.
Use a.sum(dtype=np.float64) to use a float64 accumulator.
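Robert's explanation can be sketched in a few lines (an editorial demonstration, not part of the original thread; note also that modern NumPy versions use pairwise summation in sum(), so the dramatic 2010-era error may not fully reproduce):

```python
import numpy as np

# At 2**24 = 16777216 the spacing between adjacent float32 values is 2.0,
# so adding a number smaller than ~1 to a float32 accumulator at that
# magnitude changes nothing -- the running sum is stuck:
acc = np.float32(2 ** 24)
print(acc + np.float32(0.9) == acc)      # True: the 0.9 is lost entirely

# A float64 accumulator avoids the problem even for float32 data
# (a smaller array than 512**3 here, just to keep the demo cheap):
a = np.random.random((64, 64, 64)).astype("float32")
print(a.sum(dtype=np.float64))           # close to 0.5 * 64**3 = 131072
```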
Hi Robert,
Thank you very much for that explanation; that completely makes sense.
Use a.sum(dtype=np.float64) to use a float64 accumulator.
I didn't know about the dtype argument for accumulators/operators; that did just the trick.
Much obliged,
Matt
On 5/21/2010 4:13 PM, Matthew Turk wrote:
a1 = numpy.random.random((512,512,512)).astype("float32")
This consistently gives me a "MemoryError". I believe I have plenty of physical memory. (That should require about 1.3G during creation, right? I have 8G.) It seems I'm hitting some kind of 1G memory-use limit. How should I think about this?
I can create the initial 64-bit array no problem. However, I cannot create the second 32-bit array, despite having plenty of physical memory. The copy method also fails; even creating a second array of the same size fails unless I first delete `a1`.
I realize this is probably a naive and operating system dependent question. Apologies if it is off topic.
Thanks, Alan Isaac (running 32-bit Python 2.6.5 on 64-bit Vista; NumPy version 1.4.1rc2)
On 5/21/2010 2:30 PM, Alan G Isaac wrote:
a1 = numpy.random.random((512,512,512)).astype("float32")
This consistently gives me a "MemoryError". [...] It seems I'm hitting some kind of 1G memory use limit.
The MemoryError is not unexpected. First, a 32-bit Python interpreter can only use 2 GB of your memory. Second, NumPy arrays need a contiguous, unfragmented range of memory to be available within that 2 GB address space. In this example there is no contiguous 512 MB block left after creating the 1 GB array. In practice it does not seem possible to allocate a single array larger than about 1.3 GB on win32. The solution is to use a 64-bit version of Python and NumPy.
Christoph
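Christoph's figures are easy to verify with back-of-the-envelope arithmetic (an editorial sketch, not part of the original thread):

```python
# Memory footprint of the arrays discussed above.
n = 512 ** 3                      # 134217728 elements

gib = 2 ** 30                     # bytes per GiB
print(n * 8 / gib)                # 1.0 -> the float64 array is exactly 1 GiB
print(n * 4 / gib)                # 0.5 -> the float32 copy needs another 0.5 GiB

# astype("float32") keeps both arrays alive at once (~1.5 GiB total), and
# each must occupy one contiguous block of the 2 GB 32-bit address space.
```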
On 5/23/2010 6:04 PM, Alan G Isaac wrote:
On 5/21/2010 8:34 PM, Christoph Gohlke wrote:
the 32bit Python interpreter can only use 2 GB of your memory
Why?
>>> 2**32/1e9
4.2949672960000003
Because 2 GB is the memory limit for 32-bit processes running in user mode under 64-bit Windows, unless the executable was specifically built with 'IMAGE_FILE_LARGE_ADDRESS_AWARE' set.
See http://msdn.microsoft.com/en-us/library/aa366778%28VS.85%29.aspx
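As an aside, a quick way to check whether a given interpreter is a 32-bit build (and therefore potentially subject to this limit) is to look at its pointer size; a small sketch, not from the original thread:

```python
import struct
import sys

# Pointer size in bits: 32 means the 2 GB user-mode limit can apply.
bits = struct.calcsize("P") * 8
print(bits)                                # 32 or 64

# sys.maxsize tells the same story: 2**31 - 1 on 32-bit builds,
# 2**63 - 1 on 64-bit builds.
print(sys.maxsize == 2 ** (bits - 1) - 1)  # True
```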
On 5/23/2010 9:33 PM, Christoph Gohlke wrote:
2 GB is the memory limit for 32-bit processes running in user mode under 64-bit Windows [...]
Thanks! Alan
participants (4)
- Alan G Isaac
- Christoph Gohlke
- Matthew Turk
- Robert Kern