[AstroPy] FITS_rec objects retain references to 'parent' objects, even when copied

André Luiz de Amorim streetomon at gmail.com
Fri Jun 13 08:59:56 EDT 2014


I had a similar problem: the file descriptor of the FITS file was not
released after the file object went out of scope. Since I was loading data
from a large number of files, I ended up with "too many open files" errors.
I did not investigate memory allocation, but I suspect it has the same cause
as the problem you describe. My guess is that pyfits memory-maps the file, so
as long as any array still references the mapped memory, the FITS object is
not collected. In my code, doing the equivalent of
bb = f[1].data[index].copy() solved the issue.
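
To illustrate the mechanism suspected above, here is a minimal sketch that uses plain numpy.memmap rather than pyfits itself. It only shows the general difference between a view on mapped memory and a detached copy; it makes no claim about how FITS_rec behaves internally. The helper names make_demo_file and pick_rows and the file name table.dat are made up for this example.

```python
import os
import tempfile

import numpy as np


def make_demo_file(shape=(1000, 4)):
    """Write a small float64 array to a temporary file.

    Hypothetical stand-in for a large table on disk.
    """
    data = np.arange(shape[0] * shape[1], dtype=np.float64).reshape(shape)
    path = os.path.join(tempfile.mkdtemp(), "table.dat")
    data.tofile(path)
    return path, data


def pick_rows(path, shape, index, detach):
    """Return rows of the memory-mapped array stored at `path`."""
    mm = np.memmap(path, dtype=np.float64, mode="r", shape=shape)
    if detach:
        # np.array() copies into a plain ndarray that owns its buffer,
        # so the result holds no reference to the mapping.
        return np.array(mm[index])
    # A basic slice is a view: it keeps the mapping alive for as long
    # as the returned array exists.
    return mm[index[0]:index[-1] + 1]
```

With detach=True the returned array reports flags.owndata as True, so the mapping can be collected once pick_rows returns. With detach=False the returned array is a view and the mapping stays referenced until that array is deleted.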

Cheers,

André.



On Wed, Jun 11, 2014 at 11:29 PM, Benjamin Alan Weaver <
benjamin.weaver at nyu.edu> wrote:

> Hello y'all,
>
> Apologies if this is a well known, documented problem.  You can just
> send me the URL if that's the case.
>
> Let's say I create a large numpy array:
>
> def foo():
>    import numpy as np
>    from numpy.random import random
>    a = random((100000,100))
>    index = np.arange(0,100000,5)
>    b = a[index,:] # Index with an integer array ("fancy" indexing) to force a copy
>    assert not np.may_share_memory(a,b) # b is a copy, not a view.
>    return b
>
> b = foo()
>
> After the call to foo(), only the memory associated with b remains.  a
> goes out of scope, its memory is garbage-collected, everything is
> fine.
>
> Now suppose I do something similar with astropy.io.fits (assume that
> large_fits_file is a ~GB FITS file containing a FITS binary table in
> HDU 1):
>
> def bar(large_fits_file):
>    import numpy as np
>    import astropy.io.fits as pyfits
>    f = pyfits.open(large_fits_file)
>    index = np.arange(0,100000,5)
>    bb = f[1].data[index]
>    assert not np.may_share_memory(f[1].data,bb) # bb is a copy, not a view.
>    return bb
>
> bb = bar(large_fits_file)
>
> After the call to bar, all the memory associated with f[1].data
> remains in memory, even though it has gone out of scope.  Only when bb
> goes out of scope (or del bb) does the memory get released, even
> though we have shown that bb is a copy of the data not a view.
> Somehow bb is retaining a reference to the original data.
>
> The attached script uses the memory_profiler package
> (https://pypi.python.org/pypi/memory_profiler) to demonstrate this.
>
> So, what's going on here?
>
> Kia ora koutou,
> Benjamin Alan Weaver
>
> --
> a.k.a. The Dream Weaver
>
> Outside of a dog, a book is man's best friend. Inside of a dog it's
> too dark to read.
>   --Groucho Marx
>