[Numpy-discussion] Should ndarray be a context manager?

Eelco Hoogendoorn hoogendoorn.eelco at gmail.com
Tue Dec 9 11:05:08 EST 2014


My impression is that this level of optimization does not, and should not, fall within the scope of numpy.

-----Original Message-----
From: "Sturla Molden" <sturla.molden at gmail.com>
Sent: 9-12-2014 16:02
To: "numpy-discussion at scipy.org" <numpy-discussion at scipy.org>
Subject: [Numpy-discussion] Should ndarray be a context manager?


I wonder if ndarray should be a context manager so we can write 
something like this:


    with np.zeros(n) as x:
       [...]


The difference should be that __exit__ should free the memory in x (if 
owned by x) and make x a zero size array.

Unlike the current ndarray, which does not have an __exit__ method, this 
would give precise control over when the memory is freed. The timing of 
the memory release would not be dependent on the Python implementation, 
and a reference cycle or reference leak would not accidentally produce a 
memory leak. It would allow us to deterministically decide when the 
memory should be freed, which e.g. is useful when we work with large arrays.
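
As a rough illustration of the semantics described above, something close to it can already be approximated in pure Python. The sketch below is only an assumption-laden stand-in, not the proposed ndarray.__exit__: the helper name scoped_zeros is hypothetical, and it relies on ndarray.resize with refcheck=False to shrink the buffer in place, which is only safe when no views or other threads still use the array.

```python
import contextlib

import numpy as np


@contextlib.contextmanager
def scoped_zeros(n, dtype=float):
    """Hypothetical helper: yield a zeroed array, shrink it to size 0 on exit."""
    x = np.zeros(n, dtype=dtype)
    try:
        yield x
    finally:
        # Only touch the buffer if the array owns it.
        if x.flags.owndata:
            # refcheck=False skips the reference-count check, so the caller
            # is responsible for not using views of x after this point.
            x.resize(0, refcheck=False)


with scoped_zeros(1000) as x:
    x += 1.0
    total = x.sum()

# x is still bound here, but it is now a zero-size array.
size_after = x.size
```

The name x remains usable after the block, but it refers to an empty array, which mirrors the "make x a zero size array" behaviour suggested above.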


A problem with this is that the memory in the ndarray would be volatile 
with respect to other Python threads and view arrays. However, there are 
dozens of other ways to produce segfaults or buffer overflows with NumPy 
(cf. stride_tricks or wrapping external buffers).
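
The concern about view arrays can be made concrete with a small sketch. It only shows that a view shares its base array's buffer and does not own it, which is why freeing the base's memory early would leave the view pointing at released memory; the variable names are illustrative.

```python
import numpy as np

x = np.zeros(8)
v = x[2:6]  # basic slicing returns a view, not a copy

shares = np.shares_memory(x, v)  # both arrays use the same buffer
base_is_x = v.base is x          # the view keeps a reference to x
view_owns = v.flags.owndata      # the view does not own its data

# Writing through the view is visible in the base array.
v[:] = 1.0
x_sum = x.sum()
```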


Below is a Cython class that does something similar, but we would need 
to e.g. write something like

     with Heapmem(n * np.double().itemsize) as hm:
         x = hm.doublearray
         [...]

instead of just

     with np.zeros(n) as x:
         [...]


Sturla


# (C) 2014 Sturla Molden

from cpython cimport PyMem_Malloc, PyMem_Free
from libc.string cimport memset
cimport numpy as cnp
cnp.import_array()  # initialise the NumPy C-API before using PyArray_* calls


cdef class Heapmem:

     cdef:
         void *_pointer
         cnp.intp_t _size

     def __cinit__(Heapmem self, Py_ssize_t n):
         self._pointer = NULL
         self._size = <cnp.intp_t> n

     def __init__(Heapmem self, Py_ssize_t n):
         self.allocate()

     def allocate(Heapmem self):
         if self._pointer != NULL:
             raise RuntimeError("Memory already allocated")
         else:
             self._pointer = PyMem_Malloc(self._size)
             if (self._pointer == NULL):
                 raise MemoryError()
             memset(self._pointer, 0, self._size)

     def __dealloc__(Heapmem self):
         if self._pointer != NULL:
             PyMem_Free(self._pointer)
             self._pointer = NULL

     property pointer:
         def __get__(Heapmem self):
             return <cnp.intp_t> self._pointer

     property doublearray:
         def __get__(Heapmem self):
             cdef cnp.intp_t n = self._size//sizeof(double)
             if self._pointer != NULL:
                 return cnp.PyArray_SimpleNewFromData(1, &n,
                                  cnp.NPY_DOUBLE, self._pointer)
             else:
                 raise RuntimeError("Memory not allocated")

     property chararray:
         def __get__(Heapmem self):
             if self._pointer != NULL:
                 return cnp.PyArray_SimpleNewFromData(1, &self._size,
                                  cnp.NPY_CHAR, self._pointer)
             else:
                 raise RuntimeError("Memory not allocated")

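     def __enter__(Heapmem self):
         # Entering requires that the memory is already allocated
         if self._pointer == NULL:
             raise RuntimeError("Memory not allocated")
         # Return self so that "with Heapmem(...) as hm" binds hm
         return self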

     def __exit__(Heapmem self, type, value, traceback):
         if self._pointer != NULL:
             PyMem_Free(self._pointer)
             self._pointer = NULL




