[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str

Mon May 6 03:15:22 CEST 2013

Raymond Hettinger added the comment:

Reopening this one because there is a size issue, not just speed.

My clients are bumping into this issue repeatedly.  There is a reasonable expectation that pickling a bytearray will result in a pickle about the same size as the bytearray (not a 50% to 100% expansion depending on the content).  Likewise, the size shouldn't double when switching from protocol 0 to the presumably more efficient protocol 2:

    >>> # Example using Python 2.7.4 on Mac OS X 10.8
    >>> from pickle import dumps
    >>> print len(dumps(bytearray([200] * 10000), 0))
    10055
    >>> print len(dumps(bytearray([200] * 10000), 2))
    20052
    >>> print len(dumps(bytearray([100] * 10000), 2))
    10052
    >>> print len(dumps(bytearray([100, 200] * 5000), 2))
    15052

An attractive feature of bytearrays is their compact representation of data.  An attractive feature of the binary pickle protocol is improved compactness and speed.  Currently, pickling a bytearray with the binary protocol isn't living up to those expectations.
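
One plausible reading of the sizes above, offered as a sketch rather than a confirmed diagnosis: if the bytearray contents are stored as a latin-1 decoded unicode string and protocol 2 writes that string as UTF-8, then every byte >= 128 costs two bytes in the pickle.  The helper name utf8_payload_size is hypothetical and only illustrates the arithmetic; it does not call pickle itself.

```python
def utf8_payload_size(values):
    """Return the UTF-8 size of a bytearray's contents after a latin-1 decode.

    Assumption (not confirmed here): this mirrors how the payload ends up
    stored in a protocol 2 pickle of a bytearray.
    """
    text = bytearray(values).decode('latin-1')   # one code point per byte
    return len(text.encode('utf-8'))             # code points >= 128 take 2 bytes


all_high = utf8_payload_size([200] * 10000)      # every byte >= 128
all_low = utf8_payload_size([100] * 10000)       # every byte < 128
mixed = utf8_payload_size([100, 200] * 5000)     # half high, half low
```

Under that assumption the payloads come to 20000, 10000 and 15000 bytes, which lines up with the protocol 2 measurements above once roughly 52 bytes of pickle overhead are added.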

----------
assignee:  -> rhettinger
nosy: +rhettinger
status: closed -> open
type: performance -> behavior
versions: +Python 2.7 -Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13503>
_______________________________________