Catching an out-of-memory error before it happens
I want to write a general exception handler to warn if too much data is being loaded for the RAM size of a machine for a successful numpy array operation to take place. For example, the program multiplies two floating-point arrays A and B which are populated with loadtxt. While the data is being loaded, I want to continuously check that the data volume doesn't pass a threshold that would cause an out-of-memory error during the A*B operation. The known variables are the amount of memory available, the data type (floats in this case), and the numpy array operation to be performed. It seems this requires knowledge of the internal memory requirements of each numpy operation. For the sake of simplicity, the other memory needs of the program can be ignored. Is this possible?
There is no reliable way to predict how much memory an arbitrary numpy operation will need, no. However, in most cases the main memory cost will simply be the need to store the input and output arrays; for large arrays, all other allocations should be negligible.

The most effective way to avoid running out of memory, therefore, is to avoid creating temporary arrays, by using only in-place operations. E.g., if a and b each require N bytes of RAM, then the memory requirements are (roughly):

c = a + b: 3N
c = a + 2*b: 4N
a += b: 2N
np.add(a, b, out=a): 2N
b *= 2; a += b: 2N

Note that simply loading a and b requires 2N memory, so the latter code samples are near-optimal. Of course, some calculations do require the use of temporary storage space...

-n
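Building on that observation, a rough pre-flight check for the original question could look like the sketch below: for an elementwise operation such as A*B, assume the only significant new allocation is one output array, and compare that against the RAM currently free. This is only an illustrative sketch; psutil, the helper name, and the input file names are assumptions, not something from the thread.

    import numpy as np
    import psutil  # assumed third-party dependency, used only to read available RAM

    def check_elementwise_fits(a, b, safety=1.25):
        """Raise before an elementwise op (e.g. a * b) if the result won't fit.

        Assumes the only significant new allocation is one output array with
        the broadcast shape and result dtype; the safety factor leaves some
        headroom for small temporaries and other processes.
        """
        out_shape = np.broadcast(a, b).shape      # shape of the result
        out_dtype = np.result_type(a, b)          # dtype of the result
        needed = int(np.prod(out_shape, dtype=np.int64)) * out_dtype.itemsize
        available = psutil.virtual_memory().available
        if needed * safety > available:
            raise MemoryError(
                f"result needs ~{needed / 1e9:.2f} GB but only "
                f"{available / 1e9:.2f} GB of RAM is available"
            )

    A = np.loadtxt("A.txt")   # hypothetical input files
    B = np.loadtxt("B.txt")
    check_elementwise_fits(A, B)
    C = A * B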
Yeah, numexpr is pretty cool for avoiding temporaries in an easy way: https://github.com/pydata/numexpr
-- Francesc Alted
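To make the temporaries point concrete, here is a minimal sketch of the difference (assuming numexpr is installed; the array size is arbitrary):

    import numpy as np
    import numexpr as ne

    a = np.random.rand(10_000_000)
    b = np.random.rand(10_000_000)

    # Plain numpy materialises 2*b as a full-size temporary before the add,
    # so the peak footprint is roughly 4N (a, b, the temporary, and the result).
    c_np = a + 2 * b

    # numexpr compiles the whole expression and evaluates it in small blocks,
    # so no full-size temporary is needed: the peak stays near 3N (a, b, result).
    c_ne = ne.evaluate("a + 2 * b")

    assert np.allclose(c_np, c_ne)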
Francesc: Thanks. I looked at numexpr a few years back but it didn't support array slicing/indexing. Has that changed?
On 24 January 2014 23:09, Dinesh Vadhia <dineshbvadhia@hotmail.com> wrote:
> Francesc: Thanks. I looked at numexpr a few years back but it didn't support array slicing/indexing. Has that changed?
No, but you can do it yourself:

    import numpy as np
    import numexpr as ne

    big_array = np.empty(20000)
    piece = big_array[30:-50]
    ne.evaluate('sqrt(piece)')

Here, creating "piece" does not increase memory use, as slicing shares the original data (well, actually, it adds a mere 80 bytes, the overhead of an array).
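If you want to convince yourself that a slice really is a view rather than a copy, a quick check along these lines works (np.shares_memory comes from a NumPy newer than this thread; .base has always been there):

    import numpy as np

    big_array = np.empty(20000)
    piece = big_array[30:-50]

    print(piece.base is big_array)             # True: piece is a view onto big_array
    print(np.shares_memory(piece, big_array))  # True: no element data was copied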
> c = a + b: 3N
> c = a + 2*b: 4N

Does Python garbage collect mid-expression? I.e., for

C = (a + 2*b) + b

is that 4N or 5N?

Also note that when memory gets tight, fragmentation can be a problem. I.e., if two size-N arrays were just freed, you still may not be able to allocate a size-2N array. This seems to be worse on Windows, not sure why.

> a += b: 2N
> np.add(a, b, out=a): 2N
> b *= 2; a += b: 2N
>
> Note that simply loading a and b requires 2N memory, so the latter code samples are near-optimal.

And they will run quite a bit faster for large arrays -- pushing that memory around takes time.

-Chris
On 24 Jan 2014 15:57, "Chris Barker - NOAA Federal" <chris.barker@noaa.gov> wrote:
>> c = a + b: 3N
>> c = a + 2*b: 4N
>
> Does Python garbage collect mid-expression? I.e., for
>
> C = (a + 2*b) + b
>
> is that 4N or 5N?
It should be collected as soon as the reference gets dropped, so 4N. (This is the advantage of a greedy refcounting collector.)
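For illustration, the expression evaluates roughly as the sketch below; the t1/t2 names are only for exposition and don't exist in the actual bytecode:

    import numpy as np

    N = 1_000_000
    a = np.ones(N)
    b = np.ones(N)

    # C = (a + 2*b) + b proceeds roughly like this:
    t1 = 2 * b      # temporary #1   -> a, b, t1 alive        (3N)
    t2 = a + t1     # temporary #2   -> a, b, t1, t2 alive    (4N)
    del t1          # refcount hits zero as soon as the inner add is done,
                    # so the peak never reaches 5N
    C = t2 + b      # final result   -> a, b, t2, C alive     (4N)
    del t2          # dropped once the full expression has been evaluated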
> Also note that when memory gets tight, fragmentation can be a problem. I.e., if two size-N arrays were just freed, you still may not be able to allocate a size-2N array. This seems to be worse on Windows, not sure why.
If your arrays are big enough that you're worried that making a stray copy will ENOMEM, then you *shouldn't* have to worry about fragmentation -- malloc will give each array its own virtual mapping, which can be backed by discontinuous physical memory. (I guess it's possible Windows has a somehow shoddy VM system and this isn't true, but that seems unlikely these days?)

Memory fragmentation is more a problem if you're allocating lots of small objects of varying sizes. On 32 bit, virtual address fragmentation could also be a problem, but if you're working with giant data sets then you need 64 bits anyway :-).

-n
On Fri, Jan 24, 2014 at 8:25 AM, Nathaniel Smith <njs@pobox.com> wrote:
> If your arrays are big enough that you're worried that making a stray copy will ENOMEM, then you *shouldn't* have to worry about fragmentation -- malloc will give each array its own virtual mapping, which can be backed by discontinuous physical memory. (I guess it's possible Windows has a somehow shoddy VM system and this isn't true, but that seems unlikely these days?)
All I know is that when I push the limits with memory on a 32 bit Windows system, it often crashed out even though I'd never seen more than about 1GB of memory use by the application -- I would have thought that would be plenty of overhead. I also know that I've reached limits on Windows 32 well before OS X 32, but that may be because, IIUC, Windows 32 only allows 2GB per process, whereas OS X 32 allows 4GB per process.

> Memory fragmentation is more a problem if you're allocating lots of small objects of varying sizes.

It could be that's what I've been doing....

> On 32 bit, virtual address fragmentation could also be a problem, but if you're working with giant data sets then you need 64 bits anyway :-).

Well, "giant" is defined relative to the system capabilities... but yes, if you're pushing the limits of a 32 bit system, the easiest thing to do is go to 64 bits and some more memory!

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice
(206) 526-6329 fax
(206) 526-6317 main reception
Chris.Barker@noaa.gov
Oh, yeah, common confusion. Allowing 2 GiB of address space per process doesn't mean you can actually, practically, use 2 GiB of *memory* per process, especially if you're allocating/deallocating a mix of large and small objects, because address space fragmentation will kill you way before that. The memory is there; there just isn't anywhere to slot it into the process's address space. So you don't need to add more memory, just switch to a 64-bit OS.

On 64-bit you have oodles of address space, so the memory manager can easily slot in large objects far away from small objects, and it's only fragmentation within each small-object arena that hurts. A good malloc will keep this overhead down pretty low though -- certainly less than the factor of two you're thinking about.

-n
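As a practical aside (not from the thread), it is easy to check whether the interpreter you're running is a 32-bit or a 64-bit build, since a 32-bit Python is stuck with a small address space even on a 64-bit OS:

    import struct
    import sys
    import platform

    # Pointer size in bits: 32-bit builds report 32, 64-bit builds report 64.
    print(struct.calcsize("P") * 8, "bit interpreter")

    # Equivalent checks:
    print("64-bit" if sys.maxsize > 2**32 else "32-bit")
    print(platform.architecture()[0])   # e.g. '64bit'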
So, with the example case, the approximate memory cost of an in-place operation would be:

A *= B : 2N

But, if the original A or B is to remain unchanged, then it will be:

C = A * B : 3N ?
Yes.
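To make that concrete, a minimal sketch of the two variants (the array size is arbitrary; with 64-bit floats each array here is about 400 MB):

    import numpy as np

    N_ELEMENTS = 50_000_000               # ~400 MB per float64 array
    A = np.random.rand(N_ELEMENTS)
    B = np.random.rand(N_ELEMENTS)

    # ~3N peak: A, B, and the freshly allocated result C are all live at once.
    C = A * B

    # ~2N peak: the product is written into A's existing buffer; no new array.
    A *= B                                # same as np.multiply(A, B, out=A)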
participants (6)

- Chris Barker
- Chris Barker - NOAA Federal
- Daπid
- Dinesh Vadhia
- Francesc Alted
- Nathaniel Smith