[Python-Dev] pickling of large arrays

Ralf W. Grosse-Kunstleve rwgk@yahoo.com
Thu, 20 Feb 2003 07:05:29 -0800 (PST)


--- Scott Gilbert <xscottg@yahoo.com> wrote:
> Have your __reduce__ method return a 4-tuple (function, arguments, state,
> listitems) with:
>  
>   function = a constructing function that takes the length of your array in
> bytes, and the type of the data in the array
> 
>   arguments = a 2-tuple specifying the bytes and type
> 
>   state = None
> 
>   listitems = an iterator that returns small chunks of memory at a time. 

Hey, this is great! I am beginning to see the light.

> I've been gone for a while, is this PEP going to be included in the final
> version of 2.3?

My little prototype below works with Python 2.3a2!

This is almost perfect. In C++ we can have two overloads.
Highly simplified:

template <typename T>
class array {
public:
  void append(T const& value); // the regular append
  void append(std::string const& value); // for unpickling
};

This works for all T (e.g. int, float, etc.)
... except T == std::string, where both overloads have the same signature.

That is why I find it unfortunate that append() is re-used for
unpickling. How about:

  If the object has a (say) __unpickle_append__ method, the unpickler
  uses it instead of append or extend.
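
To make the proposed lookup order concrete, here is a minimal sketch. It is
not how the pickle module behaves: __unpickle_append__ is only the name
suggested above, and feed_item and str_array are hypothetical names made up
for illustration. The comma-separated chunk encoding is likewise just an
assumption borrowed from the prototype in this message.

```python
# Sketch only: pickle does not look for __unpickle_append__;
# this merely illustrates the proposed dispatch.

def feed_item(obj, item):
    """Hypothetical helper: hand one listitems entry to obj."""
    hook = getattr(obj, "__unpickle_append__", None)
    if hook is not None:
        hook(item)        # dedicated unpickling path, if the object has one
    else:
        obj.append(item)  # otherwise fall back to the regular append


class str_array(object):
    """Array of strings: the case where one append() cannot serve both roles."""

    def __init__(self, elems=()):
        self.elems = list(elems)

    def append(self, value):
        # Regular append: value is a single element, stored as-is.
        self.elems.append(value)

    def __unpickle_append__(self, chunk):
        # Unpickling path: chunk is a comma-separated batch of elements
        # (assumed encoding, for illustration only).
        self.elems.extend(chunk.split(","))


a = str_array()
a.append("p,q")        # one element that happens to contain a comma
feed_item(a, "r,s")    # a chunk, decoded into two elements
print(a.elems)         # ['p,q', 'r', 's']
```

With a separate hook, the regular append() keeps its usual meaning even when
the element type and the chunk type are both strings.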

Ralf


import pickle

class int_array(object):

  def __init__(self, elems):
    self.elems = list(elems)

  def __reduce__(self):
    return (int_array_factory,
            (len(self.elems),),
            None,
            int_array_iter(self.elems))

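  def append(self, value):
    # Called by the unpickler with one chunk from int_array_iter,
    # e.g. "0,1,2,3"; this is not a regular single-element append.
    values = [int(x) for x in value.split(",")]
    self.elems.extend(values)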

class int_array_iter(object):

  def __init__(self, elems, buf_size=4):
    self.elems = elems
    self.buf_size = buf_size
    self.i = 0

  def __iter__(self):
    return self

  def next(self):
    if self.i >= len(self.elems): raise StopIteration
    # Return the next buf_size elements as one comma-separated string.
    chunk = self.elems[self.i:self.i + self.buf_size]
    self.i += len(chunk)
    return ",".join([str(x) for x in chunk])

def int_array_factory(size):
  print "reserve:", size
  return int_array([])

f = int_array(range(11))
print "f.elems:", f.elems
s = pickle.dumps(f)
print s
g = pickle.loads(s)
print "g.elems:", g.elems

