Should ndarray be a context manager?

I wonder if ndarray should be a context manager so we can write something like this:

    with np.zeros(n) as x:
        [...]

The difference should be that __exit__ should free the memory in x (if owned by x) and make x a zero size array.

Unlike the current ndarray, which does not have an __exit__ method, this would give precise control over when the memory is freed. The timing of the memory release would not be dependent on the Python implementation, and a reference cycle or reference leak would not accidentally produce a memory leak. It would allow us to deterministically decide when the memory should be freed, which e.g. is useful when we work with large arrays.

A problem with this is that the memory in the ndarray would be volatile with respect to other Python threads and view arrays. However, there are dozens of other ways to produce segfaults or buffer overflows with NumPy (cf. stride_tricks or wrapping external buffers).

Below is a Cython class that does something similar, but we would need to e.g. write something like

    with Heapmem(n * np.double().itemsize) as hm:
        x = hm.doublearray
        [...]

instead of just

    with np.zeros(n) as x:
        [...]
Sturla

# (C) 2014 Sturla Molden

from cpython cimport PyMem_Malloc, PyMem_Free
from libc.string cimport memset
cimport numpy as cnp
cnp.import_array()

cdef class Heapmem:

    cdef:
        void *_pointer
        cnp.intp_t _size

    def __cinit__(Heapmem self, Py_ssize_t n):
        self._pointer = NULL
        self._size = <cnp.intp_t> n

    def __init__(Heapmem self, Py_ssize_t n):
        self.allocate()

    def allocate(Heapmem self):
        if self._pointer != NULL:
            raise RuntimeError("Memory already allocated")
        else:
            self._pointer = PyMem_Malloc(self._size)
            if self._pointer == NULL:
                raise MemoryError()
            memset(self._pointer, 0, self._size)

    def __dealloc__(Heapmem self):
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
        self._pointer = NULL

    property pointer:
        def __get__(Heapmem self):
            return <cnp.intp_t> self._pointer

    property doublearray:
        def __get__(Heapmem self):
            cdef cnp.intp_t n = self._size // sizeof(double)
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &n,
                                                     cnp.NPY_DOUBLE, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    property chararray:
        def __get__(Heapmem self):
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &self._size,
                                                     cnp.NPY_CHAR, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    def __enter__(Heapmem self):
        if self._pointer == NULL:
            raise RuntimeError("Memory not allocated")
        return self

    def __exit__(Heapmem self, type, value, traceback):
        # free the buffer deterministically; any views into it become invalid
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
        self._pointer = NULL
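For readers without a Cython toolchain, roughly the same ownership-plus-context-manager pattern can be sketched in pure Python on top of an anonymous mmap. This is a hypothetical analogue, not Sturla's code: `PyHeapmem` is an invented name, mmap stands in for PyMem_Malloc, and a memoryview stands in for the array views.

```python
import mmap

class PyHeapmem:
    """Pure-Python sketch of the Heapmem idea above (hypothetical)."""

    def __init__(self, n):
        self._size = n
        self._buf = mmap.mmap(-1, n)   # anonymous, zero-filled mapping

    @property
    def chararray(self):
        if self._buf.closed:
            raise RuntimeError("Memory not allocated")
        return memoryview(self._buf)

    def __enter__(self):
        if self._buf.closed:
            raise RuntimeError("Memory not allocated")
        return self

    def __exit__(self, type, value, traceback):
        if not self._buf.closed:
            # mmap raises BufferError here if views are still exported --
            # exactly the dangling-view hazard this thread worries about
            self._buf.close()

hm = PyHeapmem(16)
with hm as h:
    view = h.chararray
    view[0] = 0xFF
    view.release()             # drop the view before __exit__ frees the buffer
assert hm._buf.closed          # memory released deterministically at block exit
```

Unlike the raw-pointer Cython version, the memoryview export count gives this sketch a built-in safety net: freeing while a view is alive raises rather than leaving a dangling pointer.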

My impression is that this level of optimization does not, and should not, fall within the scope of numpy.

I don't think that makes much sense. Context managers are useful for managing the lifetime of objects owning resources not already managed by the garbage collector, e.g. file descriptors: a gc has no clue that a piece of memory contains a descriptor, and thus never has a reason to release it in time when there is plenty of memory available. Memory, on the other hand, is the resource a gc manages, so it should release objects when memory pressure is high. Also, numpy only supports CPython, so we don't even need to care about that here.

A context manager will also not help you with reference cycles.
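Julian's point about reference cycles can be seen directly in CPython: refcounting alone never reclaims a cycle; only a collector pass does. A minimal stdlib-only sketch:

```python
import gc
import weakref

class Node:
    pass

gc.disable()                     # keep the demonstration deterministic
a, b = Node(), Node()
a.other, b.other = b, a          # a two-object reference cycle
probe = weakref.ref(a)

del a, b
# Refcounting alone cannot reclaim the cycle: the objects are still alive.
assert probe() is not None

gc.collect()                     # only the cycle collector frees them
assert probe() is None
gc.enable()
```

In other words, an `__exit__` that merely dropped a reference would leave a cycle-trapped array alive until the collector happened to run.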

On 09.12.2014 18:55, Sturla Molden wrote:
On 09/12/14 18:39, Julian Taylor wrote:
A context manager will also not help you with reference cycles.
It will, because __exit__ is always executed. Even if the PyArrayObject struct lingers, the data buffer will be released.
An __exit__ method would not delete the buffer, only decrease the reference count of the array. If something else still holds a reference, it stays valid. Otherwise you would end up with a crash when the other object holding a reference tries to access it.

On Tue, Dec 9, 2014 at 5:57 PM, Julian Taylor <jtaylor.debian@googlemail.com> wrote:
An __exit__ method would not delete the buffer, only decrease the reference count of the array. If something else still holds a reference, it stays valid.
I believe that Sturla is proposing that the buffer (the data pointer) will indeed be free()ed and the ndarray object be modified in-place to have an empty shape. Most references won't matter, because they are opaque; e.g. a frame being held by a caught traceback somewhere or whatever. These are frequently the references that keep large arrays alive when we don't want them to, and they are hard to track down. The only place where you will get a crasher is when other ndarray views on the original array are still around, because those are not opaque references.

The main problem I have is that this is much too likely to cause a segfault to be part of the main API for ndarrays. I perhaps wouldn't mind a non-public-API function hidden in numpy somewhere (but not in numpy.* or even numpy.*.*) that did this. The user would put it into a finally: clause, if such things matter to them, instead of using it as a context manager. Like as_strided(), which creates similar potential for crashes, it should not be casually available. This kind of action needs to be an explicit statement

    yes_i_dont_need_this_memory_anymore_references_be_damned()

rather than implicitly hidden behind generic syntax.

--
Robert Kern

On Tue, Dec 9, 2014 at 7:01 AM, Sturla Molden <sturla.molden@gmail.com> wrote:
I wonder if ndarray should be a context manager so we can write something like this:
with np.zeros(n) as x: [...]
The difference should be that __exit__ should free the memory in x (if owned by x) and make x a zero size array.
My first thought is that you can just do:

    x = np.zeros(n)
    [... your code here ...]
    del x

x's ref count will go down, and it will be deleted if there are no other references to it. If there are other references to it, you really wouldn't want to delete the memory buffer anyway, would you? As it happens, CPython's reference counting scheme does enforce deletion at determinate times. I suppose you could write a generic context manager that would do the del for you, but I'm not sure what the point would be.

Note that if numpy were to do this, then there would need to be machinery in place to check for null data blocks in a numpy array -- kind of like how a file object can close the underlying file pointer and not crash if someone tries to use it again. I guess this comes down to -- why would anyone want/need a numpy array object with no underlying data?

(Although I'm still confused as to why it's so important (in CPython) to have a file context manager...)

-CHB
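Chris's description of del can be checked with a weakref probe. A stdlib-only sketch (the `Payload` class is just a stand-in for a large array; the timing is CPython-specific):

```python
import weakref

class Payload:
    """Stand-in for a large array object."""

x = Payload()
probe = weakref.ref(x)

y = x                        # a second reference
del x                        # del only decrements the refcount...
assert probe() is not None   # ...so the object survives via y

del y                        # last reference gone: CPython frees it immediately
assert probe() is None
```

This is exactly the determinism Chris relies on, and exactly what a cycle or a lingering traceback frame would break.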
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959  voice
7600 Sand Point Way NE   (206) 526-6329  fax
Seattle, WA  98115       (206) 526-6317  main reception

Chris.Barker@noaa.gov

On Tue, Dec 9, 2014 at 8:15 PM, Chris Barker <chris.barker@noaa.gov> wrote:
(although I'm still confused as to why it's so important (in cPython) to have a file context manager..)
Because you want the file to close when the exception is raised and not at some indeterminate point thereafter when the traceback stack frames finally get disposed of, which can be an indefinitely long time. -- Robert Kern
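Robert's point about tracebacks pinning objects can be demonstrated with a weakref. A stdlib sketch, where `Resource` is a hypothetical stand-in for a file or buffer:

```python
import gc
import weakref

class Resource:
    """Hypothetical stand-in for something holding a scarce resource."""

def fail():
    res = Resource()
    raise ValueError("boom")

try:
    fail()
except ValueError as exc:
    tb = exc.__traceback__    # keep the traceback past the handler

# The saved traceback keeps fail()'s frame alive, and the frame keeps `res`.
probe = weakref.ref(tb.tb_next.tb_frame.f_locals["res"])
assert probe() is not None

del tb
gc.collect()                  # clear any frame/exception cycles as well
assert probe() is None
```

A `with` block releases the resource at the `raise`, instead of whenever the last reference to the traceback finally dies.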

Chris Barker <chris.barker@noaa.gov> wrote:
my first thought iust that you can just do:
x = np.zeros(n) [... your code here ] del x
x's ref count will go down, and it will be deleted if there are no other references to it.
1. This depends on reference counting. PyPy supports numpy too (albeit with its own code) and does not reference count. 2. del does not delete, it just decrements the refcount; x can still be kept alive. 3. If x is part of a reference cycle, it is reclaimed later on.
If there Are other references to it, you really wouldn't want to delete the memory buffer anyway, would you?
The same goes for file descriptors. For example, consider what happens if you memory map a file, then close the file, but continue to read and write to the mapped address. NumPy allows us to construct these circumstances if we want to.
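With the stdlib mmap module, the mapping's lifetime really is decoupled from the file object's, which is the lifetime separation described above. A sketch (here the mapping stays valid because mmap keeps its own handle; the dangerous analogue would be touching a raw pointer after the mapping itself is torn down):

```python
import mmap
import os
import tempfile

# Create a small file to map.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")
os.close(fd)

f = open(path, "r+b")
m = mmap.mmap(f.fileno(), 0)
f.close()                     # closing the file does NOT tear down the mapping

assert m[:5] == b"hello"
m[:5] = b"HELLO"              # the mapping is still live and writable
data = bytes(m[:])

m.close()
os.unlink(path)

assert data == b"HELLO world"
```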
I suppose you could write a generic context manger that would do the del for you, but I'm not sure what the point would be.
A del is very different from a deallocation that actually disposes of the data buffer, regardless of references to the memory that might still be alive.
I guess this comes down to -- why would anyone want/need a numpy array object with no underlying data?
I don't. The PyArrayObject struct is so small that I don't care about it. But it could reference a huge data buffer, and I might want to get rid of that more deterministically than just waiting for the gc.
(although I'm still confused as to why it's so important (in cPython) to have a file context manager..)
Because we often want to run setup and teardown code deterministically, rather than e.g. having it happen at random from the gc thread when it runs the finalizer. If Python raises an exception, an io file object can be kept alive by the traceback for decades. If Python raises an exception, an acquire/release pair for a threading.Lock can be separated, and the lock ends up in an undefined state further down in your code. In what I suggested, the setup and teardown code would be malloc() and free().

Sturla

On 9 Dec 2014 15:03, "Sturla Molden" <sturla.molden@gmail.com> wrote:
I wonder if ndarray should be a context manager so we can write something like this:
with np.zeros(n) as x: [...]
The difference should be that __exit__ should free the memory in x (if owned by x) and make x a zero size array.
Regardless of whether this functionality is provided as part of numpy, I don't much like the idea of putting __enter__ and __exit__ methods on ndarray itself. It's just very confusing - I had no idea what 'with arr' would mean when I saw the thread subject. It's much clearer and easier to document if one uses a special context manager just for this, like:

    with tmp_zeros(...) as arr:
        ...

This should be pretty trivial to implement. AFAICT you don't need any complicated cython, you just need:

    @contextmanager
    def tmp_zeros(*args, **kwargs):
        arr = np.zeros(*args, **kwargs)
        try:
            yield arr
        finally:
            arr.resize((0,), check_refs=False)

Given how intrinsically dangerous this is, and how easily it can be implemented using numpy's existing public API, I think maybe we should leave this for third-party daredevils instead of implementing it in numpy proper.

-n
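For reference, a runnable version of Nathaniel's sketch. The keyword is actually `refcheck`, as corrected later in the thread, and `refcheck=False` frees the buffer even if references remain, so this is deliberately dangerous:

```python
from contextlib import contextmanager

import numpy as np

@contextmanager
def tmp_zeros(*args, **kwargs):
    """Yield a zeroed array, then free its buffer deterministically."""
    arr = np.zeros(*args, **kwargs)
    try:
        yield arr
    finally:
        # refcheck=False skips the safety check for outstanding references
        arr.resize((0,), refcheck=False)

with tmp_zeros(5) as a:
    total = a.sum()     # use the array only inside the block

assert total == 0.0
assert a.size == 0      # the buffer is gone once the block exits
```

Any view of `a` taken inside the block dangles after exit, which is the crasher discussed elsewhere in the thread.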

Nathaniel Smith <njs@pobox.com> wrote:
    @contextmanager
    def tmp_zeros(*args, **kwargs):
        arr = np.zeros(*args, **kwargs)
        try:
            yield arr
        finally:
            arr.resize((0,), check_refs=False)
That one is interesting. I have actually never used ndarray.resize(). It did not even occur to me that such an abomination existed :-)

Sturla

On Tue, Dec 9, 2014 at 11:03 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
Nathaniel Smith <njs@pobox.com> wrote:
That one is interesting. I have actually never used ndarray.resize(). It did not even occur to me that such an abomination existed :-)
And I thought that it would only work if there were no other references to the array, in which case it gets garbage collected anyway, but I see the nifty check_refs keyword. However:

    In [32]: arr = np.ones((100,100))

    In [33]: arr.resize((0,), check_refs=False)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-33-f0e634534904> in <module>()
    ----> 1 arr.resize((0,), check_refs=False)

    TypeError: 'check_refs' is an invalid keyword argument for this function

    In [34]: np.__version__
    Out[34]: '1.9.1'

Was that just added (or removed?)

-Chris

The second argument is named `refcheck` rather than `check_refs`.

Eric

On 10 December 2014 at 20:36, Chris Barker <chris.barker@noaa.gov> wrote:
Was that just added (or removed?)
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.resize.htm...

The argument is not check_refs, but refcheck.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net

# ------------------------------------------------------------- #
def ask_mailing_list_support(email):

    if mention_platform_and_version() and include_sample_app():
        send_message(email)
    else:
        install_malware()
        erase_hard_drives()
# ------------------------------------------------------------- #

On Wed, Dec 10, 2014 at 11:44 AM, Andrea Gavana <andrea.gavana@gmail.com> wrote:
The argument is not check_refs, but refcheck.
Thanks -- yup, that works. Useful -- but dangerous! I haven't managed to trigger a segfault yet, but it sure looks like I could...

-CHB

On Wed, Dec 10, 2014 at 9:03 PM, Chris Barker <chris.barker@noaa.gov> wrote:
I haven't managed to trigger a segfault yet but it sure looks like I could...
On Linux at least this should work reliably:

    In [1]: a = np.zeros(2 ** 20)

    In [2]: b = a[...]

    In [3]: a.resize((0,), refcheck=False)

    In [4]: b[1000] = 1
    zsh: segmentation fault  ipython

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Chris Barker <chris.barker@noaa.gov> wrote:
I haven't managed to trigger a segfault yet but it sure looks like I could...
You can also trigger random errors. If the array is small, Python's memory manager might keep the memory in the heap for reuse by PyMem_Malloc. And then you can actually modify some random Python object with bogus bits. It can screw things up even if you do not see a segfault.

Sturla

On Wed, Dec 10, 2014 at 8:40 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
You can also trigger random errors. If the array is small, Python's memory mamager might keep the memory in the heap for reuse by PyMem_Malloc. And then you can actually modify some random Python object with bogus bits. It can screw things up even if you do not see a segfault.
I should have said that -- I did see random garbage -- just not an actual segfault -- probably didn't allocate a large enough array for the system to reclaim that memory.

Anyway, the point is that if we wanted this to be a used-more-than-very-rarely, only-in-very-special-cases feature (de-allocating an array's data buffer), then ndarray would need to grow a check for an invalid buffer on access. I have no idea if that would make for a noticeable performance hit.

-Chris
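The invalid-buffer check Chris describes can be prototyped in pure Python with a wrapper that behaves like a closed file object. This is a hypothetical sketch: `GuardedArray` and its methods are invented names, not numpy API.

```python
import numpy as np

class GuardedArray:
    """Hypothetical wrapper: refuses access once the buffer is released."""

    def __init__(self, arr):
        self._arr = arr

    def _check(self):
        if self._arr is None:
            raise ValueError("array buffer has been released")

    def __getitem__(self, idx):
        self._check()
        return self._arr[idx]

    def __setitem__(self, idx, value):
        self._check()
        self._arr[idx] = value

    def release(self):
        """Free the underlying buffer deterministically."""
        self._arr.resize((0,), refcheck=False)
        self._arr = None

g = GuardedArray(np.zeros(4))
g[0] = 1.0
assert g[0] == 1.0

g.release()
try:
    g[0]                      # like reading a closed file: a clean error,
except ValueError:            # not a segfault
    released = True
assert released
```

The check costs a branch per access in Python; doing the equivalent inside ndarray's C fast paths is where the performance question Chris raises comes in.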

Chris Barker <chris.barker@noaa.gov> wrote:
Anyway, the point is that if we wanted this to be a used-more-than-very-rarely in only very special cases feature de-allocating an array's data buffer), then ndarray would need to grow a check for an invalid buffer on access.
One would probably need something like Java JNI, where an array can be "locked" by the C code. But nevertheless, what I suggested is not inherently worse than this C++ code:

    {
        std::vector<double> a(n);
        // do something with a
    }
    // references to the data in std::vector a might still exist

In C this is often known as the "dangling pointer" problem.

Sturla

Nathaniel Smith <njs@pobox.com> wrote:
This should be pretty trivial to implement. AFAICT you don't need any complicated cython
I have a bad habit of thinking in terms of too complicated C instead of just using NumPy.
Given how intrinsically dangerous this is, and how easily it can be implemented using numpy's existing public API, I think maybe we should leave this for third-party daredevils instead of implementing it in numpy proper.
It seems so :-)

Sturla

On Mi, 2014-12-10 at 07:25 +0000, Sturla Molden wrote:
Given how intrinsically dangerous this is, and how easily it can be implemented using numpy's existing public API, I think maybe we should leave this for third-party daredevils instead of implementing it in numpy proper.
It seems so :-)
Completely agree; we may tell the user where the gun is, but we shouldn't put it in their hand and then point it at their feet as well ;).

- Sebastian
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
participants (9)

- Andrea Gavana
- Chris Barker
- Eelco Hoogendoorn
- Eric Moore
- Julian Taylor
- Nathaniel Smith
- Robert Kern
- Sebastian Berg
- Sturla Molden