[Numpy-discussion] Catching out-of-memory error before it happens

Nathaniel Smith njs at pobox.com
Fri Jan 24 11:25:37 EST 2014


On 24 Jan 2014 15:57, "Chris Barker - NOAA Federal" <chris.barker at noaa.gov>
wrote:
>
>
>> c = a + b: 3N
>> c = a + 2*b: 4N
>
> Does Python garbage collect mid-expression? I.e.:
>
> c = (a + 2*b) + b
>
> 4N or 5N?

It should be collected as soon as the reference gets dropped, so 4N. (This
is the advantage of a greedy refcounting collector.)
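
To make the accounting concrete, here's a rough sketch (array names follow
the thread; the size is just illustrative, counting float64 buffers of n
elements each):

import numpy as np

n = 10_000_000      # illustrative size; each float64 array is ~80 MB
a = np.ones(n)      # 1N live
b = np.ones(n)      # 2N live

# c = (a + 2*b) + b, step by step:
#   t1 = 2 * b     -> 3N live (a, b, t1)
#   t2 = a + t1    -> 4N live (a, b, t1, t2); t1's refcount hits zero
#                     right here, so it is freed immediately -> back to 3N
#   c  = t2 + b    -> 4N live (a, b, t2, c); t2 freed once c is bound
c = (a + 2*b) + b   # peak is 4N, not 5N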

> Also note that when memory gets tight, fragmentation can be a problem.
> I.e. if two size-n arrays were just freed, you still may not be able to
> allocate a size-2n array. This seems to be worse on Windows, not sure why.

If your arrays are big enough that you're worried that making a stray copy
will ENOMEM, then you *shouldn't* have to worry about fragmentation -
malloc will give each array its own virtual mapping, which can be backed by
discontinuous physical memory. (I guess it's possible Windows somehow has a
shoddy VM system and this isn't true, but that seems unlikely these days?)
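
A rough way to see this in practice (assumes a Linux/glibc-style malloc where
large allocations get their own mmap'd region; the cutoff and the sizes below
are illustrative, not exact):

import numpy as np

n = 50_000_000      # ~400 MB per float64 array; pick sizes to suit your RAM
a = np.empty(n)
b = np.empty(n)
del a, b            # each large buffer had its own mapping, so both
                    # regions go straight back to the OS when freed

# The 2n request doesn't have to squeeze into a contiguous hole left
# behind by a and b -- it just gets a fresh virtual mapping.
c = np.empty(2 * n)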

Memory fragmentation is more a problem if you're allocating lots of small
objects of varying sizes.

On 32 bit, virtual address fragmentation could also be a problem, but if
you're working with giant data sets then you need 64 bits anyway :-).
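
(A quick way to check which build you're on, for what it's worth:)

import sys
import numpy as np

print("pointer bits:", 64 if sys.maxsize > 2**32 else 32)
print("default index type:", np.dtype(np.intp))  # int32 on 32-bit builds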

-n