[Numpy-discussion] Should ndarray be a context manager?
Eelco Hoogendoorn
hoogendoorn.eelco at gmail.com
Tue Dec 9 11:05:08 EST 2014
My impression is that this level of optimization does not, and should not, fall within the scope of numpy.
-----Original Message-----
From: "Sturla Molden" <sturla.molden at gmail.com>
Sent: 9-12-2014 16:02
To: "numpy-discussion at scipy.org" <numpy-discussion at scipy.org>
Subject: [Numpy-discussion] Should ndarray be a context manager?
I wonder if ndarray should be a context manager so we can write
something like this:
    with np.zeros(n) as x:
        [...]
The difference should be that __exit__ should free the memory in x (if
owned by x) and make x a zero size array.
Unlike the current ndarray, which does not have an __exit__ method, this
would give precise control over when the memory is freed. The timing of
the memory release would not be dependent on the Python implementation,
and a reference cycle or reference leak would not accidentally produce a
memory leak. It would let us decide deterministically when the memory is
freed, which is useful, for example, when working with large arrays.
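As a rough illustration of the proposed __exit__ behaviour, the sketch below emulates it from pure Python. The helper name scoped_zeros is hypothetical and not part of NumPy. It assumes that ndarray.resize(0, refcheck=False) is an acceptable stand-in for "free the memory and make x a zero size array"; whether the allocator really hands the memory back at that point is not guaranteed.

```python
import contextlib

import numpy as np


@contextlib.contextmanager
def scoped_zeros(n, dtype=float):
    """Hypothetical helper: yield a zeroed array and shrink it on exit."""
    x = np.zeros(n, dtype=dtype)
    try:
        yield x
    finally:
        # Only an array that owns its buffer can be resized in place.
        if x.flags.owndata:
            # refcheck=False skips the reference-count check, so the
            # buffer is shrunk even though the caller still holds x.
            x.resize(0, refcheck=False)


with scoped_zeros(1000) as x:
    x[:] = 1.0
    total = float(x.sum())

# After the block, x is still bound but is now a zero-size array.
remaining = x.size
```

Because the reference check is skipped, any view taken inside the block is left pointing at a buffer that may have been released, which is the volatility concern raised in the message.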
A problem with this is that the memory in the ndarray would be volatile
with respect to other Python threads and view arrays. However, there are
dozens of other ways to produce segfaults or buffer overflows with NumPy
(cf. stride_tricks or wrapping external buffers).
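To make the "if owned by x" condition and the view problem concrete, here is a small sketch that uses only existing NumPy attributes (flags.owndata, base, and np.shares_memory). The variable names are illustrative.

```python
import numpy as np

x = np.zeros(8)   # allocates and owns its buffer
v = x[::2]        # a view onto the same buffer

owns_x = x.flags.owndata          # x owns the memory
owns_v = v.flags.owndata          # the view does not
view_base_is_x = v.base is x      # the view keeps x alive through .base
shares = np.shares_memory(x, v)   # both arrays address the same bytes
```

An __exit__ that frees x's buffer would have no way to find v, so v would keep pointing at the released memory.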
Below is a Cython class that does something similar, but we would need
to e.g. write something like

    with Heapmem(n * np.double().itemsize) as hm:
        x = hm.doublearray
        [...]

instead of just

    with np.zeros(n) as x:
        [...]

Sturla
# (C) 2014 Sturla Molden

from cpython cimport PyMem_Malloc, PyMem_Free
from libc.string cimport memset
cimport numpy as cnp

# Initialise the NumPy C API before calling any PyArray_* function.
cnp.import_array()


cdef class Heapmem:

    cdef:
        void *_pointer
        cnp.intp_t _size

    def __cinit__(Heapmem self, Py_ssize_t n):
        self._pointer = NULL
        self._size = <cnp.intp_t> n

    def __init__(Heapmem self, Py_ssize_t n):
        self.allocate()

    def allocate(Heapmem self):
        if self._pointer != NULL:
            raise RuntimeError("Memory already allocated")
        else:
            self._pointer = PyMem_Malloc(self._size)
            if (self._pointer == NULL):
                raise MemoryError()
            memset(self._pointer, 0, self._size)

    def __dealloc__(Heapmem self):
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
            self._pointer = NULL

    property pointer:
        def __get__(Heapmem self):
            return <cnp.intp_t> self._pointer

    # The arrays returned below wrap the buffer without owning it; they
    # become invalid once __exit__ or __dealloc__ has freed the memory.
    property doublearray:
        def __get__(Heapmem self):
            cdef cnp.intp_t n = self._size//sizeof(double)
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &n,
                             cnp.NPY_DOUBLE, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    property chararray:
        def __get__(Heapmem self):
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &self._size,
                             cnp.NPY_CHAR, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    def __enter__(self):
        # Entering requires an allocated buffer; return self so that
        # "with Heapmem(...) as hm" binds the instance.
        if self._pointer == NULL:
            raise RuntimeError("Memory not allocated")
        return self

    def __exit__(Heapmem self, type, value, traceback):
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
            self._pointer = NULL