[Numpy-discussion] Should ndarray be a context manager?

Chris Barker chris.barker at noaa.gov
Tue Dec 9 15:15:05 EST 2014


On Tue, Dec 9, 2014 at 7:01 AM, Sturla Molden <sturla.molden at gmail.com>
wrote:

>
> I wonder if ndarray should be a context manager so we can write
> something like this:
>
>
>     with np.zeros(n) as x:
>        [...]
>
>
> The difference should be that __exit__ should free the memory in x (if
> owned by x) and make x a zero size array.
>

My first thought is that you can just do:

x = np.zeros(n)
[... your code here ]
del x

x's ref count will go down, and it will be deleted if there are no other
references to it. If there are other references to it, you really wouldn't
want to delete the memory buffer anyway, would you?

As it happens, CPython's reference counting scheme DOES enforce deletion at
deterministic times.

I suppose you could write a generic context manager that would do the del
for you, but I'm not sure what the point would be.
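
A rough sketch of what such a generic context manager might look like, assuming CPython and using a hypothetical Buffer class and a hypothetical helper named scoped (neither is part of numpy). It also shows the limitation: the manager can drop its own reference, but it cannot unbind the name that the with statement created in the caller's scope.

```python
import weakref
from contextlib import contextmanager


class Buffer:
    """Hypothetical stand-in for an object that owns a large block of memory."""

    def __init__(self, n):
        self.data = bytearray(n)


@contextmanager
def scoped(obj):
    """Hypothetical helper: hand out obj, then drop the manager's reference."""
    try:
        yield obj
    finally:
        del obj  # removes only this generator's local reference


with scoped(Buffer(1024)) as x:
    probe = weakref.ref(x)

# The name x is still bound after the block, so the object is still alive.
alive_after_with = probe() is not None

# An explicit del is still needed to actually release it.
del x
alive_after_del = probe() is not None
```

So the with block by itself buys nothing over a plain del here; the caller still has to drop the last reference.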

Note that if numpy were to do this, then there would need to be machinery
in place to check for null data blocks in a numpy array -- kind of like how
a file object can close the underlying file pointer and not crash if
someone tries to use it again.
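
For comparison, the standard library already has two objects with this kind of "released" state, and both check it on every access. The sketch below uses memoryview (which is itself a context manager) and an in-memory io.BytesIO in place of a real file; it only illustrates the bookkeeping a numpy array would need, it is not a proposal for how numpy should implement it.

```python
import io

# memoryview releases its buffer on __exit__ and refuses further access.
buf = bytearray(b"abcdefgh")
with memoryview(buf) as view:
    first = view[0]              # fine while the view is live

try:
    view[0]
    released_error = None
except ValueError as exc:        # raised because the view has been released
    released_error = str(exc)

# A closed file-like object behaves the same way: an exception, not a crash.
stream = io.BytesIO(b"data")
stream.close()
try:
    stream.read()
    closed_error = None
except ValueError as exc:        # raised because the stream is closed
    closed_error = str(exc)
```

In both cases every operation has to test for the released state first, which is the extra machinery an ndarray with a freed data block would have to carry.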

I guess this comes down to -- why would anyone want/need a numpy array
object with no underlying data?

(although I'm still confused as to why it's so important (in CPython) to
have a file context manager..)

-CHB







> Unlike the current ndarray, which does not have an __exit__ method, this
> would give precise control over when the memory is freed. The timing of
> the memory release would not be dependent on the Python implementation,
> and a reference cycle or reference leak would not accidentally produce a
> memory leak. It would allow us to deterministically decide when the
> memory should be freed, which e.g. is useful when we work with large
> arrays.
>
>
> A problem with this is that the memory in the ndarray would be volatile
> with respect to other Python threads and view arrays. However, there are
> dozens of other ways to produce segfaults or buffer overflows with NumPy
> (cf. stride_tricks or wrapping external buffers).
>
>
> Below is a Cython class that does something similar, but we would need
> to e.g. write something like
>
>      with Heapmem(n * np.double().itemsize) as hm:
>          x = hm.doublearray
>          [...]
>
> instead of just
>
>      with np.zeros(n) as x:
>          [...]
>
>
> Sturla
>
>
> # (C) 2014 Sturla Molden
>
> from cpython cimport PyMem_Malloc, PyMem_Free
> from libc.string cimport memset
> cimport numpy as cnp
> cnp.import_array()
>
>
> cdef class Heapmem:
>
>      cdef:
>          void *_pointer
>          cnp.intp_t _size
>
>      def __cinit__(Heapmem self, Py_ssize_t n):
>          self._pointer = NULL
>          self._size = <cnp.intp_t> n
>
>      def __init__(Heapmem self, Py_ssize_t n):
>          self.allocate()
>
>      def allocate(Heapmem self):
>          if self._pointer != NULL:
>              raise RuntimeError("Memory already allocated")
>          else:
>              self._pointer = PyMem_Malloc(self._size)
>              if (self._pointer == NULL):
>                  raise MemoryError()
>              memset(self._pointer, 0, self._size)
>
>      def __dealloc__(Heapmem self):
>          if self._pointer != NULL:
>              PyMem_Free(self._pointer)
>              self._pointer = NULL
>
>      property pointer:
>          def __get__(Heapmem self):
>              return <cnp.intp_t> self._pointer
>
>      property doublearray:
>          def __get__(Heapmem self):
>              cdef cnp.intp_t n = self._size//sizeof(double)
>              if self._pointer != NULL:
>                  return cnp.PyArray_SimpleNewFromData(1, &n,
>                                   cnp.NPY_DOUBLE, self._pointer)
>              else:
>                  raise RuntimeError("Memory not allocated")
>
>      property chararray:
>          def __get__(Heapmem self):
>              if self._pointer != NULL:
>                  return cnp.PyArray_SimpleNewFromData(1, &self._size,
>                                   cnp.NPY_CHAR, self._pointer)
>              else:
>                  raise RuntimeError("Memory not allocated")
>
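>      def __enter__(Heapmem self):
>          # Fail only if there is no memory to hand out, and return self
>          # so that "with Heapmem(...) as hm" binds the object.
>          if self._pointer == NULL:
>              raise RuntimeError("Memory not allocated")
>          return self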
>
>      def __exit__(Heapmem self, type, value, traceback):
>          if self._pointer != NULL:
>              PyMem_Free(self._pointer)
>              self._pointer = NULL
>
>
>
>
>
>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov