why is bytearray treated so inefficiently by pickle?

Irmen de Jong irmen at -NOSPAM-xs4all.nl
Tue Dec 6 14:44:02 EST 2011


On 06-12-11 20:27, John Ladasky wrote:
> On a related note, pickling of arrays of float64 objects, as generated
> by the numpy package for example, is wildly inefficient with memory.
> Half a million float64s require about 4 megabytes, but the pickle
> file I generated from a numpy.ndarray of this size was 42 megabytes.
>
> I know that numpy has its own pickle protocol, and that it's supposed
> to help with this problem.  Still, if this is a general problem with
> Python and pickling numbers, it might be worth solving in the
> language itself.

Python provides ample ways for custom types to influence how they're
pickled (__getstate__/__setstate__, __reduce__).

Are numpy's arrays pickled similarly to Python's own array types? In
that case, when using Python 2.x, they're pickled very inefficiently 
indeed (every element is encoded with its own token). In Python 3.x, 
array pickling is very efficient because it stores the machine type 
representation in the pickle.


Irmen


