[Numpy-discussion] Pickling of memory aliasing patterns
Eelco Hoogendoorn
hoogendoorn.eelco at gmail.com
Fri Feb 28 07:00:19 EST 2014
I have been working on a general function caching mechanism, and in doing
so I stumbled upon the following quirk:

    @cached
    def foo(a, b):
        b[0] = 1
        return a[0]

    a = np.zeros(1)
    b = a[:]
    print foo(a, b)    # computes and returns 1
    print foo(a, b)    # gets 1 from cache, as it should

    a = np.zeros(1)    # no aliasing between inputs
    b = np.zeros(1)
    print foo(a, b)    # should compute and return 0, but instead gets 1 from cache

Fundamentally, this is because the memory aliasing patterns that arrays
may have are lost during pickling. This leads me to two questions:
1: Is this desirable behavior?
2: Is this preventable behavior?
It seems to me the answer to the first question is no, and the answer to
the second question is yes.
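
The loss of aliasing can be seen directly with a pickle round trip. This is a minimal sketch, assuming Python 3 and a NumPy recent enough to provide np.shares_memory (neither is what the snippet above was written against):

```python
import pickle
import numpy as np

a = np.zeros(1)
b = a[:]                       # b is a view onto a's memory

# Round-trip both arrays through a single pickle.
a2, b2 = pickle.loads(pickle.dumps((a, b)))

aliased_before = np.shares_memory(a, b)    # the originals overlap
aliased_after = np.shares_memory(a2, b2)   # the copies do not

print(aliased_before, aliased_after)       # True False
```

Each array is serialized as its own block of data, so nothing in the pickle records that the two inputs referred to the same memory.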
Here is what I am using at the moment to generate a correct hash under such
circumstances; but unpickling along these lines should be possible too,
methinks. Or am I missing some subtlety as to why something along these
lines couldn't be the default pickling behavior for numpy arrays?

    class ndarray_own(object):
        def __init__(self, arr):
            self.buffer = np.getbuffer(arr)
            self.dtype = arr.dtype
            self.shape = arr.shape
            self.strides = arr.strides

    class ndarray_view(object):
        def __init__(self, arr):
            self.base = arr.base
            # so we have a view; but where is it?
            # byte offset of the view's data relative to the start of its base
            self.offset = arr.ctypes.data - self.base.ctypes.data
            self.dtype = arr.dtype
            self.shape = arr.shape
            self.strides = arr.strides

    class NumpyDeterministicPickler(DeterministicPickler):
        """
        Special case for numpy.
        In general, external C objects may include internal state which does
        not serialize in a way we want it to;
        ndarray memory aliasing is one of those things.
        """
        def save(self, obj):
            """
            Remap a numpy array to a representation which conserves
            all semantically relevant information concerning memory aliasing.
            Note that this mapping is 'destructive'; we will not get our
            original numpy arrays back after unpickling; not without custom
            deserialization code at least.
            But we don't care, since this is only meant to be used to obtain
            correct keying behavior;
            keys don't need to be deserialized.
            """
            if isinstance(obj, np.ndarray):
                if obj.flags.owndata:
                    obj = ndarray_own(obj)
                else:
                    obj = ndarray_view(obj)
            DeterministicPickler.save(self, obj)
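
For anyone wanting to try the idea without the DeterministicPickler base class (which is not shown here), the following is a rough standalone sketch of the same keying approach for Python 3. The names aliasing_key and _root are hypothetical, and the sketch assumes that every aliased input is ultimately a view of another ndarray; arrays built on other kinds of buffers are simply treated as separate owners.

```python
import hashlib
import numpy as np

def _root(arr):
    """Follow .base links up to the ndarray that owns the memory."""
    while isinstance(arr.base, np.ndarray):
        arr = arr.base
    return arr

def aliasing_key(*arrays):
    """Hash array contents together with their aliasing pattern (hypothetical helper)."""
    roots = {}   # id(owning array) -> index in order of first appearance
    parts = []
    for arr in arrays:
        root = _root(arr)
        index = roots.setdefault(id(root), len(roots))
        # Byte offset of this array's data within its owner's memory.
        offset = arr.ctypes.data - root.ctypes.data
        parts.append((index, offset, arr.dtype.str, arr.shape,
                      arr.strides, arr.tobytes()))
    # repr() of ints, strings, tuples and bytes is deterministic, which
    # sidesteps pickle's identity-based memoization altogether.
    return hashlib.sha256(repr(parts).encode()).hexdigest()

a = np.zeros(1)
b = a[:]
key_aliased = aliasing_key(a, b)

c = np.zeros(1)
d = np.zeros(1)
key_separate = aliasing_key(c, d)

print(key_aliased != key_separate)
```

Two argument lists with equal values but different aliasing then map to different keys, while repeating the same aliasing pattern on fresh arrays reproduces the same key.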