[Numpy-discussion] Numpy-discussion Digest, Vol 4, Issue 84

James A. Bednar jbednar at inf.ed.ac.uk
Mon Jan 29 20:29:22 EST 2007


|  Date: Mon, 29 Jan 2007 15:55:06 -0700
|  From: Travis Oliphant <oliphant at ee.byu.edu>
|  
|  James A. Bednar wrote:
|  
|  >Hi,
|  >
|  >Does anyone know whether it is possible to pickle and unpickle numpy
|  >ufuncs?
|  >
|  Not directly.   Ufuncs are a built-in type and do not have the required 
|  __reduce__ method needed to be pickleable.   It could be added, but 
|  hasn't been.

Thanks for the quick reply!  Please consider our request that it be
added.  

Meanwhile, we'll do a workaround.  We just wanted to make sure that we
were not overlooking some obvious way we were supposed to be pickling
them, such as through some special pickler hidden somewhere. :-)

|  >  I can't find anything about that on scipy.org or the mailing
|  >list archives.  I have several important pieces of code that accept a
|  >numpy ufunc as an argument and later apply it to some data, while
|  >keeping a copy of the ufunc in an attribute.  So far I have been able
|  >to pickle this code only by doing some very ugly hacks and
|  >workarounds.
|
|  Is storing the name considered an ugly hack?

Yes -- see below.

|  >Here's an example that illustrates the problem:
|  >
|  >
|  >I have a class Test defined in test.py:
|  >
|  >class Test(object):
|  >    def __init__(self,a):
|  >        self.a = a
|
|  Why don't you store the name of the ufunc instead:
|  
|  def __init__(self, a):
|        self._a = a.__name__
|  
|  Then, whenever you are going to use the ufunc you do
|  
|  import numpy
|  func = getattr(numpy,self._a)
|  
|  Then, pickle should work.

For numpy ufuncs alone this would work, but in general we want to
support any ufunc-like operator, including ones that our API users
define, not just those defined in the numpy namespace.  Specifically,
we have numerous classes that have a user-selectable "operator"
attribute that is *typically* set to a numpy ufunc like add or max.
But a user can pass anything he or she wants as an operator, such as
return_first(), where that's defined as:

  class return_first(object):
      @staticmethod
      def reduce(x):
          return x[0]

Anything that supports reduce() (as ufuncs do) will work for our
purposes, but if we put in specific hacks for numpy, then our users
will be limited to only the numpy ufuncs.  Of course, we can add more
hacks to limit the hacks to being applied only for ufuncs, but we've
done something vaguely like this and it's pretty complicated.  We just
want to be able to treat ufuncs like any other function object in
Python...
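
To make the duck-typing point concrete, here is a minimal sketch of how such an "operator" attribute gets used. return_first is the class from the text above; running_sum and Combiner are hypothetical names invented for this illustration only and are not taken from any real code base.

```python
from functools import reduce as fold
import operator


class return_first(object):
    """Operator from the message: reducing a sequence yields its first element."""
    @staticmethod
    def reduce(x):
        return x[0]


class running_sum(object):
    """Hypothetical second user-defined operator, for comparison."""
    @staticmethod
    def reduce(x):
        return fold(operator.add, x)


class Combiner(object):
    """Hypothetical class holding a user-selectable operator attribute."""
    def __init__(self, op):
        # Any object exposing reduce() is accepted; no numpy-specific checks.
        self.operator = op

    def combine(self, data):
        return self.operator.reduce(data)


first = Combiner(return_first).combine([3, 1, 2])
total = Combiner(running_sum).combine([3, 1, 2])
```

Combiner never inspects the type of its operator, which is why a workaround keyed on numpy.ufunc alone would not cover every case.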

|  Alternatively you can write your own __reduce__ function for the Test 
|  class.

We can do that too, but we actually store such operators in numerous
locations in different classes, so that too turns out to be
complicated.  

|  Direct pickling of ufuncs is not a trivial issue as ufuncs contain
|  code-pointers at their core which cannot really be pickled.  It's
|  not a simple problem to overcome in general.  We could store the
|  name of the ufunc, but for user-defined ufuncs these might come
|  from a different package and which package the ufunc lives under is
|  not stored as part of the ufunc.

From a technical perspective it's worked fine for us to simply pickle
the name and then reconstruct the ufunc from the name when
unpickling.  It's just the practical side of how to do that relatively
cleanly and without having to explain it all to our API users that's
been a problem.  And Robert's suggestion below looks like it will help
clean that up.
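
As a rough sketch of the pickle-the-name idea, the pair of helpers below turns an operator into a (module, name) reference and back. The helper names operator_to_ref and ref_to_operator are hypothetical. The module name is passed in explicitly on the assumption, taken from Travis's remark, that the ufunc itself does not record which package it lives in. The syntax is Python 3.

```python
import importlib
import pickle

import numpy


def operator_to_ref(op, module_name):
    """Return a picklable (module, name) pair for a module-level operator."""
    name = op.__name__
    module = importlib.import_module(module_name)
    # Fail at save time rather than load time if the name does not lead back to op.
    if getattr(module, name, None) is not op:
        raise ValueError("%s.%s does not refer to %r" % (module_name, name, op))
    return (module_name, name)


def ref_to_operator(ref):
    """Look the operator up again from its (module, name) pair."""
    module_name, name = ref
    return getattr(importlib.import_module(module_name), name)


ref = operator_to_ref(numpy.add, "numpy")
payload = pickle.dumps(ref)          # only two strings are pickled
restored = ref_to_operator(pickle.loads(payload))
```

The same pair of calls would work for a user-defined operator, provided it lives at module level in an importable module and the caller knows that module's name.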

|  From: Robert Kern <robert.kern at gmail.com>
|  Subject: Re: [Numpy-discussion] Pickling ufuncs?
|  
|  Travis Oliphant wrote:
|  > Why don't you store the name of the ufunc instead:
|  
|  Or you can register pickler/unpickler functions for ufuncs:
|  
|  
|  In [24]: import copy_reg
|  
|  In [25]: import numpy
|  
|  In [26]: def ufunc_pickler(ufunc):
|     ....:     return ufunc.__name__
|     ....:
|  
|  In [28]: def ufunc_unpickler(name):
|     ....:     import numpy
|     ....:     return getattr(numpy, name)
|     ....:
|  
|  In [31]: copy_reg.pickle(numpy.ufunc, ufunc_pickler, ufunc_unpickler)
|  
|  In [32]: import cPickle
|  
|  In [33]: cPickle.dumps(numpy.add)
|  Out[33]: 'cnumpy.core.umath\nadd\np1\n.'
|  
|  In [34]: cPickle.loads(cPickle.dumps(numpy.add))
|  Out[34]: <ufunc 'add'>
|  
|  
|  Note that this is a hack. It won't work for the ufuncs in scipy.special, for
|  example.

That sounds like a significantly cleaner hack than the ones we've been
doing, so I think we'll try that out.  Thanks!

(I think we actually tried this years ago with Numeric, but from the
above example it sounds worth trying again in numpy.)
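
For readers on Python 3, where copy_reg was renamed copyreg and cPickle was folded into pickle, a registration in the spirit of Robert's example might look like the sketch below. Several things here are assumptions rather than part of the thread: reduce_ufunc is a hypothetical name, pkgutil.resolve_name (Python 3.9 or later) is used as the reconstructor simply because it is an importable standard-library function, and the lookup is hard-wired to the top-level numpy namespace, so it shares the limitation Robert mentions for scipy.special. Recent NumPy releases can already pickle their own ufuncs, so this is illustrative only and would override that behaviour for the current process.

```python
import copyreg
import pickle
import pkgutil

import numpy


def reduce_ufunc(ufunc):
    """Reduce a ufunc to a (callable, args) pair that finds it again by name."""
    # Assumption: the ufunc is reachable as numpy.<name>.
    return pkgutil.resolve_name, ("numpy:" + ufunc.__name__,)


# Register the reducer for every object whose type is numpy.ufunc.
copyreg.pickle(numpy.ufunc, reduce_ufunc)

restored = pickle.loads(pickle.dumps(numpy.add))

# The registration also applies to ufuncs nested inside other objects.
settings = pickle.loads(pickle.dumps({"operator": numpy.maximum}))
```

Because the registration is per type, classes that keep an operator in an attribute need no __reduce__ of their own for the ufunc case.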

|  We should look into including __module__ information in ufuncs. This is how
|  regular functions are pickled.

I vote for that!

Thanks,

Jim


