Bug in pickling an ndarray?
I am having trouble pickling (and then unpickling) an ndarray. Upon unpickling, the "base" attribute of the ndarray is set to some very strange string ("base" was None when the ndarray was pickled, so it should remain None).

I have tried on various platforms and versions of numpy, with inconclusive results:

    # tested: Linux (Suse 11.1), numpy 1.5.1      BUG
    #         Linux (Suse 11.0), numpy 1.6.1      OK
    #         Linux (Mint Debian), numpy 1.6.1    BUG
    #         Linux (Mint Debian), numpy 1.6.2    BUG
    #         OSX (Snow Leopard), numpy 1.5.1rc1  BUG
    #         OSX (Snow Leopard), numpy 1.6.2     BUG
    #         Windows 7, numpy 1.4.1              OK

I have attached a script below that can be used to check for the problem; I suppose that this is a bug report, unless I'm doing something terribly wrong or my expectations for the base attribute are off.

---------------- cut here ---------------------------------

    # this little demo shows a problem with the base attribute of an ndarray, when
    # pickling. Before pickling, dset.base is None, but after pickling, it is some
    # strange string.

    import cPickle as pickle
    import numpy
    print numpy.__version__
    #import pickle

    dset = numpy.ones((2,2))

    print "BEFORE PICKLING"
    print dset
    print "base = ",dset.base
    print dset.flags

    # pickle.
    s = pickle.dumps(dset)

    # now unpickle.
    dset = pickle.loads(s)

    print "AFTER PICKLING AND THEN IMMEDIATELY UNPICKLING"
    print dset
    print "base = ",dset.base
    print dset.flags

-- Daniel Hyams dhyams@gmail.com
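For readers on Python 3, where `cPickle` and the `print` statement no longer exist, the script above can be approximated as follows. This is a sketch rather than the original poster's code: the `describe` helper is a hypothetical name introduced here, and what it reports after the round trip may depend on the NumPy version and pickle protocol in use, so no expected output is shown.

```python
import pickle

import numpy as np


def describe(arr):
    """Collect the attributes the original script printed (hypothetical helper)."""
    return {
        "base_type": type(arr.base).__name__,
        "owndata": bool(arr.flags["OWNDATA"]),
    }


dset = np.ones((2, 2))
before = describe(dset)

# Round trip through pickle, as in the original script.
restored = pickle.loads(pickle.dumps(dset))
after = describe(restored)

print(np.__version__)
print("before pickling:  ", before)
print("after unpickling: ", after)
```

Comparing the two dictionaries on different installations is a quick way to reproduce the kind of table given in the report.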
On Sat, Jun 30, 2012 at 9:15 PM, Daniel Hyams <dhyams@gmail.com> wrote:
> I am having trouble pickling (and then unpickling) an ndarray. Upon unpickling, the "base" attribute of the ndarray is set to some very strange string ("base" was None when the ndarray was pickled, so it should remain None).
This sounds like correct behaviour to me. Is it causing you a problem? In general, ndarrays don't preserve things like memory layout or view sharing through pickling, which means that attributes such as .flags and .base may change.

-n
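To illustrate the point about view sharing and memory layout, here is a small sketch (not from the thread) that pickles a strided view and compares it with the original. It assumes a NumPy recent enough to provide `np.shares_memory`; the variable names are illustrative only.

```python
import pickle

import numpy as np

parent = np.arange(10)
view = parent[::2]  # strided view that shares memory with `parent`

restored = pickle.loads(pickle.dumps(view))

# The values survive the round trip, but the relationship to `parent` does not.
print(np.array_equal(restored, view))
print(np.shares_memory(view, parent), np.shares_memory(restored, parent))

# The strided layout is not preserved either.
print(view.flags["C_CONTIGUOUS"], restored.flags["C_CONTIGUOUS"])
```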
Hmmm, I wouldn't think that it is correct behavior; I would think that *any* ndarray arising from pickling would have its .base attribute set to None. If not, then who really owns the data?

It was my understanding that .base should hold a reference to another ndarray that the data is really coming from, or else be None. It certainly shouldn't be some random string, should it?

And yes, it is causing a problem for me, which is why I noticed it. In my application, ndarrays can come from various sources, pickling being one of them. Later in the app, I want to resize the array, which you cannot do if the data is not really owned by that array. I had an explicit check for myarray.base==None, which does not hold when I get the ndarray from a pickle.

-- Daniel Hyams dhyams@gmail.com
This is the expected behavior. It is not a bug.

NumPy arrays after unpickling are views into the string that is created by the pickling machinery, so the base is set. This avoids an additional memcpy, but yes, it does mean that you can't resize the array until you make another copy.

Best regards,

-Travis
On Sat, Jun 30, 2012 at 11:33 PM, Daniel Hyams <dhyams@gmail.com> wrote:
> Hmmm, I wouldn't think that it is correct behavior; I would think that *any* ndarray arising from pickling would have its .base attribute set to None. If not, then who is really the one that owns the data?
>
> It was my understanding that .base should hold a reference to another ndarray that the data is really coming from, or it's None. It certainly shouldn't be some random string, should it?
It can be any object that will keep the data memory alive while the object is kept alive. It does not have to be an ndarray. In this case, the numpy unpickling constructor takes the string object that the underlying pickling machinery has just created and views its memory directly. In order to keep Python from freeing that memory, the string object needs to be kept alive via a reference, so it gets assigned to the .base.
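The same mechanism can be seen without pickle. In the sketch below, `np.frombuffer` wraps a `bytes` object, which merely stands in for the string that the unpickler creates; the names are illustrative and this is not the code path NumPy's unpickling actually takes.

```python
import numpy as np

payload = bytes(range(8))  # stand-in for the string created during unpickling
arr = np.frombuffer(payload, dtype=np.uint8)

# The array borrows the memory of `payload`, so it keeps a reference in .base
# and reports that it does not own its data.
print(type(arr.base).__name__)
print(arr.flags["OWNDATA"], arr.flags["WRITEABLE"])
```

Because `bytes` is immutable, the resulting array is also read-only, which is another reason the borrowed buffer cannot simply be resized in place.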
> And yes, it is causing a problem for me, which is why I noticed it. In my application, ndarrays can come from various sources, pickling being one of them. Later in the app, I was wanting to resize the array, which you cannot do if the data is not really owned by that array...
You also can't resize an array if any *other* array has a view on that array too, so checking for ownership isn't going to help. .resize() will raise an exception if it can't do this; it's better to just attempt it and catch the exception than to look before you leap.
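A minimal sketch of that try-first approach, assuming that falling back to a zero-padded private copy is acceptable for the application. The helper name `resized` is hypothetical. Since `ndarray.resize` also refuses when it detects other references to the array, the in-place branch may fail even for an array that owns its data, so callers should always use the returned array.

```python
import pickle

import numpy as np


def resized(arr, new_shape):
    """Return `arr` resized to `new_shape`, in place when NumPy allows it."""
    try:
        arr.resize(new_shape)  # raises ValueError if NumPy refuses
        return arr
    except ValueError:
        out = arr.copy()  # private copy that owns its own buffer
        # refcheck=False is safe here: nothing else references `out` yet.
        out.resize(new_shape, refcheck=False)
        return out


arr = pickle.loads(pickle.dumps(np.ones((2, 2))))
bigger = resized(arr, (3, 2))
print(bigger.shape, bigger.flags["OWNDATA"])
```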
> I had explicit check for myarray.base==None, which it is not when I get the ndarray from a pickle.
That is not the way to check if an ndarray owns its data. Instead, check a.flags['OWNDATA'].

-- Robert Kern
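As a short illustration of the flag-based check, the sketch below compares an array with a view of it. The `owns_data` helper is a hypothetical name. It also shows why `== None` is a fragile spelling if `.base` is inspected at all: on recent NumPy versions the comparison is applied elementwise when `.base` is itself an ndarray.

```python
import numpy as np


def owns_data(arr):
    """True if `arr` owns the memory it uses (hypothetical helper)."""
    return bool(arr.flags["OWNDATA"])


owner = np.arange(6)
view = owner[::2]

print(owns_data(owner), owns_data(view))
print(view.base is owner)

# Elementwise comparison: the result is an array, not a single bool.
elementwise = (view.base == None)  # noqa: E711
print(type(elementwise).__name__)
```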
Thanks, Travis and Robert, for the clarification; it is much clearer what is going on now. As the demo code shows, a.flags['OWNDATA'] also changes on its way out of the pickle, which makes sense now too. So using that flag instead of checking a.base for None is equivalent, at least in this situation.

So is it a bug, then, that on Windows .base is set to None? (Of course, this may be something that was fixed in later versions of numpy; I was only able to test Windows with numpy 1.4.1.)

I'll just make a copy and discard the original to work around the situation (which is what I had already done, but the inconsistent behavior across versions and platforms made me think it was a bug).

Thanks again for the clear explanation of what is going on.

-- Daniel Hyams dhyams@gmail.com
participants (4)
- Daniel Hyams
- Nathaniel Smith
- Robert Kern
- Travis Oliphant