This question is related to PEP 307, "Extensions to the pickle protocol", http://www.python.org/peps/pep-0307.html .

Apparently the new Pickle "protocol 2" provides a mechanism for avoiding large temporaries, but only for lists and dicts (section "Pickling of large lists and dicts" near the end). I am wondering if the new protocol could also help us to eliminate large temporaries when pickling Boost.Python extension classes.

We wrote an open source C++ array library with Boost.Python bindings. For pickling we use the __getstate__, __setstate__ protocol. As it stands pickling involves converting the arrays to Python strings, similar to what is done in Numpy. There are two mechanisms:

1. "single buffered": For numeric types (int, long, double, etc.) a Python string is allocated based on an upper estimate for the required size (PyString_FromStringAndSize). The entire numeric array is converted directly to that string. Finally the Python string is resized (_PyString_Resize). With this mechanism there are 2 copies of the array in memory:
   - the original array and
   - the Python string.

2. "double buffered": For some user-defined element types it is very difficult to estimate an upper limit for the size of the string representation. Therefore the array is first converted to a dynamically growing C++ std::string, which is then copied to a Python string. With this mechanism there are 3 copies of the array in memory:
   - the original array,
   - the std::string, and
   - the Python string.

For very large arrays the memory overhead can be a limiting factor. Could the new protocol 2 help us in some way?

Thank you in advance,
Ralf
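(For orientation, a minimal pure-Python sketch of the string-based __getstate__/__setstate__ scheme described above; the class name and the comma-separated encoding are illustrative stand-ins for the C++ conversion code, not the library's actual implementation:)

    import pickle

    class double_array(object):
        # illustrative stand-in for the C++ array type
        def __init__(self, values=()):
            self.values = list(values)

        def __getstate__(self):
            # one big string is built next to the original data, i.e. a
            # second full copy lives in memory while pickling
            return ",".join([repr(v) for v in self.values])

        def __setstate__(self, state):
            self.values = [float(x) for x in state.split(",")]

    a = double_array([1.0, 2.5, 3.25])
    b = pickle.loads(pickle.dumps(a))
    print b.values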
--- "Ralf W. Grosse-Kunstleve"
This question is related to PEP 307, "Extensions to the pickle protocol", http://www.python.org/peps/pep-0307.html .
Apparently the new Pickle "protocol 2" provides a mechanism for avoiding large temporaries, but only for lists and dicts (section "Pickling of large lists and dicts" near the end). I am wondering if the new protocol could also help us to eliminate large temporaries when pickling Boost.Python extension classes.
We wrote an open source C++ array library with Boost.Python bindings. For pickling we use the __getstate__, __setstate__ protocol. As it stands pickling involves converting the arrays to Python strings, similar to what is done in Numpy. There are two mechanisms:
1. "single buffered":
For numeric types (int, long, double, etc.) a Python string is allocated based on an upper estimate for the required size (PyString_FromStringAndSize). The entire numeric array is converted directly to that string. Finally the Python string is resized (_PyString_Resize). With this mechanism there are 2 copies of the array in memory:
- the original array and
- the Python string.
2. "double buffered":
For some user-defined element types it is very difficult to estimate an upper limit for the size of the string representation. Therefore the array is first converted to a dynamically growing C++ std::string, which is then copied to a Python string. With this mechanism there are 3 copies of the array in memory:
- the original array,
- the std::string, and
- the Python string.
For very large arrays the memory overhead can be a limiting factor. Could the new protocol 2 help us in some way?
I hadn't seen this PEP yet. Very nice. I don't know how to solve your second case with it, but it looks like you could solve your first case with very little overhead.

Have your __reduce__ method return a 4-tuple (function, arguments, state, listitems) with:

- function = a constructing function that takes the length of your array in bytes, and the type of the data in the array
- arguments = a 2-tuple specifying the bytes and type
- state = None
- listitems = an iterator that returns small chunks of memory at a time. For instance, have it generate your array with 1K or 8K strings at a time.

This strategy should avoid having to "double buffer" your array when pickling. The overhead would be the 1K or 8K buffer that would presumably be reused multiple times while pickling a large array. Your multi-megabyte original array would never have to be copied all at once.

For unpickling, you'd have your constructor function allocate all the space in one go, and make the semantics of your append() or extend() method do the right thing.

I've been gone for a while, is this PEP going to be included in the final version of 2.3?
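(A minimal sketch of the shape Scott describes, assuming a hypothetical raw_array type and an 8K chunk size; the names are illustrative, and a real implementation would hand out slices of the underlying C++ buffer instead of a Python string:)

    import pickle

    CHUNK = 8192  # one small, reusable buffer's worth per chunk

    def chunks(data):
        # generator yielding CHUNK-sized string slices
        for i in xrange(0, len(data), CHUNK):
            yield data[i:i+CHUNK]

    class raw_array(object):
        # hypothetical stand-in for the C++ array type
        def __init__(self, nbytes=0, typecode="d"):
            self.typecode = typecode
            self.data = ""        # the real type would reserve nbytes here

        def append(self, chunk):       # proto 0 unpickling: one chunk at a time
            self.data += chunk

        def extend(self, chunk_list):  # proto 1/2 unpickling: a batch of chunks
            for chunk in chunk_list:
                self.data += chunk

        def __reduce__(self):
            return (raw_array,                        # constructing function
                    (len(self.data), self.typecode),  # arguments
                    None,                             # state (nothing extra)
                    chunks(self.data))                # listitems iterator

    a = raw_array()
    a.data = "x" * 100000
    b = pickle.loads(pickle.dumps(a))
    assert b.data == a.data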
--- Scott Gilbert
I've been gone for a while, is this PEP going to be included in the final version of 2.3?
Don't answer that, I just saw the notice for 2.3a2... :-)
--- Scott Gilbert
Have your __reduce__ method return a 4-tuple (function, arguments, state, listitems) with:
- function = a constructing function that takes the length of your array in bytes, and the type of the data in the array
- arguments = a 2-tuple specifying the bytes and type
- state = None
- listitems = an iterator that returns small chunks of memory at a time.
Hey, this is great! I am beginning to see the light.
I've been gone for a while, is this PEP going to be included in the final version of 2.3?
My little prototype below works with Python 2.3a2! This is almost perfect. In C++ we can have two overloads. Highly simplified:

    template <typename T>
    class array
    {
        void append(T const& value);           // the regular append
        void append(std::string const& value); // for unpickling
    };

This will work for all T (e.g. int, float, etc.) ... except T == std::string. This leads me to find it unfortunate that append() is re-used for unpickling. How about: If the object has a (say) __unpickle_append__ method, this is used by the unpickler instead of append or extend.

Ralf

    import pickle

    class int_array(object):

        def __init__(self, elems):
            self.elems = list(elems)

        def __reduce__(self):
            return (int_array_factory,
                    (len(self.elems),),
                    None,
                    int_array_iter(self.elems))

        def append(self, value):
            # value arrives as a comma-separated chunk of digits
            values = [int(x) for x in value.split(",")]
            self.elems.extend(values)

    class int_array_iter(object):

        def __init__(self, elems, buf_size=4):
            self.elems = elems
            self.buf_size = buf_size
            self.i = 0

        def __iter__(self):
            return self

        def next(self):
            if (self.i >= len(self.elems)):
                raise StopIteration
            result = ""
            for i in xrange(self.buf_size):
                result += str(self.elems[self.i]) + ","
                self.i += 1
                if (self.i == len(self.elems)):
                    break
            return result[:-1]

    def int_array_factory(size):
        print "reserve:", size
        return int_array([])

    f = int_array(range(11))
    print "f.elems:", f.elems
    s = pickle.dumps(f)
    print s
    g = pickle.loads(s)
    print "g.elems:", g.elems
--- "Ralf W. Grosse-Kunstleve"
This leads me to find it unfortunate that append() is re-used for unpickling. How about:
If the object has a (say) __unpickle_append__ method, this is used by the unpickler instead of append or extend.
I agree with you here, but I'll take what I can get. I would vote for __extend__(x) if the ballot came up and I were permitted to vote.
[Ralf W. Grosse-Kunstleve]
... My little prototype below works with Python 2.3a2!
This is almost perfect. In C++ we can have two overloads. Highly simplified:
template <typename T> class array<T> { void append(T const& value); // the regular append void append(std::string const& value); // for unpickling };
This will work for all T (e.g. int, float, etc.) ... except T == std::string.
This leads me to find it unfortunate that append() is re-used for unpickling.
append() has always been used for unpickling (well, since pickle came into existence; "always" is an overstatement <wink>).
How about:
If the object has a (say) __unpickle_append__ method, this is used by the unpickler instead of append or extend.
This is the implementation of the APPEND opcode (from pickle.py):

    def load_append(self):
        stack = self.stack
        value = stack.pop()
        list = stack[-1]
        list.append(value)

It's called once per list element, and clogging it up with hasattr()/getattr() calls would greatly increase its cost (as is, it does very little, especially in cPickle, where all those now-usually-trivial operations go at C speed). So it's unlikely you're going to get a change in what the proto 0 APPEND (which calls append()) and proto 1 APPENDS (which calls extend()) opcodes do. Adding brand new opcodes is still possible, but I doubt it's possible for Guido or me to do the work.
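(To make the cost argument concrete, here is a purely hypothetical variant of load_append -- not anything in pickle.py -- showing the per-element lookup the __unpickle_append__ proposal would add:)

    # hypothetical sketch only; illustrates the extra work per element
    def load_append(self):
        stack = self.stack
        value = stack.pop()
        obj = stack[-1]
        meth = getattr(obj, "__unpickle_append__", None)  # one lookup per element
        if meth is not None:
            meth(value)
        else:
            obj.append(value)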
...

    f = int_array(range(11))
    print "f.elems:", f.elems
    s = pickle.dumps(f)
Note that this is creating a text-mode ("protocol 0") pickle, which is less efficient than proto 1, which in turn is less efficient than proto 2. That's your choice, just want to be sure you're aware you're making a choice. For backward compatibility, proto 0 has to remain the default.
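(For illustration, the protocol is chosen via the second argument to pickle.dumps; a note on what that implies for the int_array prototype above:)

    s0 = pickle.dumps(f)     # protocol 0: text mode; unpickling uses APPEND/append()
    s2 = pickle.dumps(f, 2)  # protocol 2 (new in 2.3): uses APPENDS/extend()
    # note: loading s2 calls int_array.extend(), which the prototype above
    # does not define, so a proto 1/2 pickle also needs an extend() method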
--- Tim Peters
So it's unlikely you're going to get a change in what the proto 0 APPEND (which calls append()) and proto 1 APPENDS (which calls extend()) opcodes do. Adding brand new opcodes is still possible, but I doubt it's possible for Guido or me to do the work.
Sounds like a chance for plan B: a thin wrapper around a Python string, called "pickle_string" in the revised prototype below. Ruling out the possibility that users want to create C++ arrays of Python pickle_strings, this will work for all types. Creating a trivial type like pickle_string is a snap with Boost.Python. But I am guessing that the Numarray people must hate it, unless they have another simple work-around.

I believe the brand new opcode is the better long-term solution for the community. I'd be willing to invest a little time if others are also interested and willing to help.

Ralf

    import pickle

    class pickle_string(object):

        def __init__(self, size):
            self.size = size
            # here we allocate the string
            self.string = ""
            # this will be a reference to the Python string object;
            # the C++ implementation will write directly into the
            # string contained here

        def __getstate__(self):
            return self.string  # return a reference

        def __setstate__(self, state):
            self.string = state  # copy a reference

    class int_array(object):

        def __init__(self, elems):
            self.elems = list(elems)

        def __reduce__(self):
            return (int_array_factory,
                    (len(self.elems),),
                    None,
                    int_array_iter(self.elems))

        def append(self, value):
            if (isinstance(value, pickle_string)):
                values = [int(x) for x in value.string.split(",")]
                self.elems.extend(values)
            else:
                self.elems.append(value)

    class int_array_iter(object):

        def __init__(self, elems, buf_size=4):
            self.elems = elems
            self.buf_size = buf_size
            self.i = 0

        def __iter__(self):
            return self

        def next(self):
            if (self.i >= len(self.elems)):
                raise StopIteration
            result = pickle_string(123)  # arbitrary size in this toy prototype
            for i in xrange(self.buf_size):
                result.string += str(self.elems[self.i]) + ","
                self.i += 1
                if (self.i == len(self.elems)):
                    break
            result.string = result.string[:-1]
            return result

    def int_array_factory(size):
        print "reserve:", size
        return int_array([])

    f = int_array(range(11))
    f.append(13)
    print "f.elems:", f.elems
    s = pickle.dumps(f)
    print s
    g = pickle.loads(s)
    print "g.elems:", g.elems
This question is related to PEP 307, "Extensions to the pickle protocol", http://www.python.org/peps/pep-0307.html .
Apparently the new Pickle "protocol 2" provides a mechanism for avoiding large temporaries, but only for lists and dicts (section "Pickling of large lists and dicts" near the end). I am wondering if the new protocol could also help us to eliminate large temporaries when pickling Boost.Python extension classes.
We wrote an open source C++ array library with Boost.Python bindings. For pickling we use the __getstate__, __setstate__ protocol. As it stands pickling involves converting the arrays to Python strings, similar to what is done in Numpy. There are two mechanisms:
1. "single buffered":
For numeric types (int, long, double, etc.) a Python string is allocated based on an upper estimate for the required size (PyString_FromStringAndSize). The entire numeric array is converted directly to that string. Finally the Python string is resized (_PyString_Resize). With this mechanism there are 2 copies of the array in memory:
- the original array and
- the Python string.
2. "double buffered":
For some user-defined element types it is very difficult to estimate an upper limit for the size of the string representation. Therefore the array is first converted to a dynamically growing C++ std::string, which is then copied to a Python string. With this mechanism there are 3 copies of the array in memory:
- the original array,
- the std::string, and
- the Python string.
For very large arrays the memory overhead can be a limiting factor. Could the new protocol 2 help us in some way?
Probably, if you can switch from __getstate__ to __reduce__. __reduce__ can return a tuple of up to 5 items now; the last two are iterators (one for list-ish types, one for dict-ish types). If you return an iterator that iterates over all the pieces of your array, the array will be reconstituted at the other end using repeated calls to obj.extend() or obj.append(). There's no need to derive from list for this to work; all you need is methods extend() and append().

--Guido van Rossum (home page: http://www.python.org/~guido/)
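(The dict-ish iterator Guido mentions works analogously; a minimal hypothetical sketch, where the unpickler rebuilds the object via obj[key] = value:)

    import pickle

    class chunked_dict(object):
        # hypothetical mapping-like class, for illustration only
        def __init__(self):
            self.data = {}

        def __setitem__(self, key, value):
            # called by the unpickler for each (key, value) pair
            self.data[key] = value

        def __reduce__(self):
            # 5-tuple: callable, args, state, listitems, dictitems
            return (chunked_dict, (), None, None, iter(self.data.items()))

    d = chunked_dict()
    d.data.update({"a": 1, "b": 2})
    e = pickle.loads(pickle.dumps(d, 2))
    print e.data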
participants (4)
- Guido van Rossum
- Ralf W. Grosse-Kunstleve
- Scott Gilbert
- Tim Peters