Hi,

Does anyone know whether it is possible to pickle and unpickle numpy ufuncs? I can't find anything about that on scipy.org or the mailing list archives.

I have several important pieces of code that accept a numpy ufunc as an argument and later apply it to some data, while keeping a copy of the ufunc in an attribute. So far I have been able to pickle this code only by doing some very ugly hacks and workarounds.

The ufuncs from Numeric 24 and earlier did not even deepcopy, which caused us lots of other problems, but deepcopying them works now in numpy 1.0.1. However, they still don't seem to pickle with the regular picklers. Is this deliberately disabled for some reason, or is there some workaround?

Here's an example that illustrates the problem:

I have a class Test defined in test.py:

    class Test(object):
        def __init__(self,a):
            self.a = a

If I try this code at the commandline:

    import numpy
    from test import Test
    t=Test(numpy.multiply)
    import pickle
    s=pickle.dumps(t)

I get "TypeError: can't pickle ufunc objects"

Also, I can't reproduce it in a small example yet, but pickling completes without errors in our larger program that also stores ufuncs, yet I then get a different error on unpickling:

      File "/disk/home/lodestar1/jbednar/topographica/lib/python2.4/pickle.py", line 872, in load
        dispatch[key](self)
      File "/disk/home/lodestar1/jbednar/topographica/lib/python2.4/pickle.py", line 1097, in load_newobj
        obj = cls.__new__(cls, *args)
    TypeError: object.__new__(numpy.ufunc) is not safe, use numpy.ufunc.__new__()

Any ideas?

Thanks,

Jim

_______________________________________________________________________________

Unix> cat test.py
class Test(object):
    def __init__(self,a):
        self.a = a

Unix> bin/python
Python 2.4.4 (#5, Nov 9 2006, 22:58:03)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> from test import Test
>>> t=Test(numpy.multiply)
>>> import pickle
>>> s=pickle.dumps(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/jb/lib/python2.4/pickle.py", line 1386, in dumps
    Pickler(file, protocol, bin).dump(obj)
  File "/home/jb/lib/python2.4/pickle.py", line 231, in dump
    self.save(obj)
  File "/home/jb/lib/python2.4/pickle.py", line 338, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/jb/lib/python2.4/pickle.py", line 433, in save_reduce
    save(state)
  File "/home/jb/lib/python2.4/pickle.py", line 293, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/jb/lib/python2.4/pickle.py", line 663, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/home/jb/lib/python2.4/pickle.py", line 677, in _batch_setitems
    save(v)
  File "/home/jb/lib/python2.4/pickle.py", line 313, in save
    rv = reduce(self.proto)
  File "/home/jb/lib/python2.4/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle ufunc objects
James A. Bednar wrote:
Hi,
Does anyone know whether it is possible to pickle and unpickle numpy ufuncs?
Not directly. Ufuncs are a built-in type and do not have the required __reduce__ method needed to be pickleable. It could be added, but hasn't been.
I can't find anything about that on scipy.org or the mailing list archives. I have several important pieces of code that accept a numpy ufunc as an argument and later apply it to some data, while keeping a copy of the ufunc in an attribute. So far I have been able to pickle this code only by doing some very ugly hacks and workarounds.
Is storing the name considered an ugly hack?
The ufuncs from Numeric 24 and earlier did not even deepcopy, which caused us lots of other problems, but deepcopying them works now in numpy 1.0.1. However, they still don't seem to pickle with the regular picklers. Is this deliberately disabled for some reason, or is there some workaround?
No, nothing has been "disabled." The feature was never added.
Here's an example that illustrates the problem:
I have a class Test defined in test.py:
    class Test(object):
        def __init__(self,a):
            self.a = a
Why don't you store the name of the ufunc instead:

    def __init__(self, a):
        self._a = a.__name__

Then, whenever you are going to use the ufunc you do

    import numpy
    func = getattr(numpy,self._a)

Then, pickle should work.

Alternatively you can write your own __reduce__ function for the Test class.

Direct pickling of ufuncs is not a trivial issue as ufuncs contain code-pointers at their core which cannot really be pickled. It's not a simple problem to overcome in general. We could store the name of the ufunc, but for user-defined ufuncs these might come from a different package and which package the ufunc lives under is not stored as part of the ufunc.

-Travis
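The name-storing idea above can be sketched as a complete class. This is only an illustration, written for Python 3 rather than the Python 2.4 used in the thread. The `module_name` argument and the `a` property are hypothetical additions: since a ufunc does not record which package it lives in, the caller supplies the package name, and it is stored next to the function name. `math.hypot` is used purely as a stand-in for a ufunc so that the snippet runs without numpy installed.

```python
import importlib
import math
import pickle


class Test(object):
    def __init__(self, a, module_name="numpy"):
        # Keep only two strings; both pickle without any special support.
        self._a = a.__name__
        # Hypothetical extra argument: the package the function lives in.
        self._module_name = module_name

    @property
    def a(self):
        # For the default module this is getattr(numpy, self._a).
        module = importlib.import_module(self._module_name)
        return getattr(module, self._a)


# math.hypot stands in for a ufunc so the snippet runs without numpy.
t = Test(math.hypot, module_name="math")
restored = pickle.loads(pickle.dumps(t))

# The function is looked up again by name after unpickling.
print(restored.a(3.0, 4.0))
```

With numpy installed, `Test(numpy.multiply)` would follow the same path through the default `module_name`. A ufunc from another package would need that package's name passed explicitly.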
Travis Oliphant wrote:
Why don't you store the name of the ufunc instead:
    def __init__(self, a):
        self._a = a.__name__
Then, whenever you are going to use the ufunc you do
    import numpy
    func = getattr(numpy,self._a)
Then, pickle should work.
Or you can register pickler/unpickler functions for ufuncs:

    In [24]: import copy_reg

    In [25]: import numpy

    In [26]: def ufunc_pickler(ufunc):
       ....:     return ufunc.__name__
       ....:

    In [28]: def ufunc_unpickler(name):
       ....:     import numpy
       ....:     return getattr(numpy, name)
       ....:

    In [31]: copy_reg.pickle(numpy.ufunc, ufunc_pickler, ufunc_unpickler)

    In [32]: import cPickle

    In [33]: cPickle.dumps(numpy.add)
    Out[33]: 'cnumpy.core.umath\nadd\np1\n.'

    In [34]: cPickle.loads(cPickle.dumps(numpy.add))
    Out[34]: <ufunc 'add'>

Note that this is a hack. It won't work for the ufuncs in scipy.special, for example.

We should look into including __module__ information in ufuncs. This is how regular functions are pickled.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
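For readers on Python 3, where `copy_reg` is named `copyreg` and `cPickle` has been folded into `pickle`, a rough equivalent of the registration above might look like the following. It is a sketch under assumptions, not a recommended implementation: `load_ufunc`, `reduce_ufunc`, `UfuncPickler` and `dumps` are hypothetical names. The reducer is attached to a private pickler's `dispatch_table` instead of the global registry, so any registration that numpy itself may already have made is left alone. Like the original hack, it only finds ufuncs that are reachable by name from the top-level `numpy` namespace.

```python
import copyreg
import io
import pickle

import numpy


def load_ufunc(name):
    """Look the ufunc up again by name in the top-level numpy namespace."""
    return getattr(numpy, name)


def reduce_ufunc(ufunc):
    """Reducer: ask pickle to call load_ufunc(name) when unpickling."""
    return load_ufunc, (ufunc.__name__,)


class UfuncPickler(pickle.Pickler):
    # Private copy of the dispatch table; the global copyreg registry
    # is not modified.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[numpy.ufunc] = reduce_ufunc


def dumps(obj):
    """Pickle obj to bytes with the ufunc-aware pickler."""
    buffer = io.BytesIO()
    UfuncPickler(buffer).dump(obj)
    return buffer.getvalue()


payload = dumps(numpy.add)
restored = pickle.loads(payload)

# Unpickling returns the existing ufunc object, not a copy.
print(restored is numpy.add)
```

Unpickling uses plain `pickle.loads`, because the pickle only records a reference to `load_ufunc` and the name string. `load_ufunc` therefore has to be importable wherever the data is loaded.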
Robert Kern wrote:
Note that this is a hack. It won't work for the ufuncs in scipy.special, for example.
We should look into including __module__ information in ufuncs. This is how regular functions are pickled.
This sounds like a good idea. Would it be possible to add a "dict" object to the end of the ufunc structure without bumping up the Version number of the C-API? I think it would be possible, but I'm interested in other points of view.

If we added the dict object then we could add the __module__ attribute to all ufuncs and use that in pickling. Then, we could add an API function that took a module name and found all ufuncs under that name and set their __module__ attribute correctly.

Thoughts?

-Travis
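For reference, the behaviour of regular functions that this proposal would extend to ufuncs can be seen with the standard library alone. The sketch below is written for Python 3 and uses `json.dumps` only as a convenient example of an ordinary module-level function. It shows that pickle stores a function as a reference made of the function's `__module__` and its name, and resolves that reference again on loading.

```python
import json
import pickle
import pickletools

# A plain Python function exposes the two strings pickle relies on.
print(json.dumps.__module__, json.dumps.__name__)

# Protocol 0 is text-based, so the module and function names are visible
# in the payload; the function's code itself is not stored.
payload = pickle.dumps(json.dumps, protocol=0)
print(payload)

# Disassemble the pickle to list the opcodes it contains.
pickletools.dis(payload)

# Unpickling imports the module and looks the name up again,
# so the very same function object comes back.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

A ufunc with a correct `__module__` attribute could in principle be handled the same way, which is why the missing package information is the sticking point for ufuncs defined outside numpy.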
participants (3)
- James A. Bednar
- Robert Kern
- Travis Oliphant