[Python-Dev] Encoding variable-length integers/counts in pickle

Antoine Pitrou solipsis at pitrou.net
Tue Jul 10 06:10:08 EDT 2018


On Tue, 10 Jul 2018 02:53:47 +0100
MRAB <python at mrabarnett.plus.com> wrote:
> In the regex module I use an encoding scheme when pickling pattern 
> objects which is based on the way MIDI encodes variable-length integers, 
> and I think it might have a use in a pickle protocol.
> 
> In the basic format, an integer is split up into 7-bit chunks, each 
> chunk put into a byte, and the most-significant bit of the byte used to 
> signal whether the value continues into the following byte.
> 
> An integer must be encoded into the minimum number of bytes, so an 
> encoded sequence of bytes would never start with 0x80.
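
For readers unfamiliar with the scheme quoted above, here is a minimal
sketch of it. It is not the regex module's actual code: the names
encode_vlq and decode_vlq are hypothetical, and it assumes non-negative
integers with the most-significant 7-bit chunk first, as in MIDI.

```python
def encode_vlq(n: int) -> bytes:
    """Encode a non-negative int as a MIDI-style variable-length quantity."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    chunks = [n & 0x7F]                   # last byte: continuation bit clear
    n >>= 7
    while n:
        chunks.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(chunks))        # most-significant chunk first


def decode_vlq(data: bytes) -> int:
    """Decode exactly one value, rejecting non-minimal encodings."""
    if not data:
        raise ValueError("empty input")
    if data[0] == 0x80:
        # A leading zero chunk with the continuation bit set is redundant.
        raise ValueError("non-minimal encoding")
    value = 0
    for i, byte in enumerate(data):
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:               # continuation bit clear: last byte
            if i != len(data) - 1:
                raise ValueError("trailing bytes after final chunk")
            return value
    raise ValueError("truncated encoding")
```

Under these assumptions, values up to 127 take one byte, values up to
16383 take two, and so on.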

The problem with variable-length encoding is that you need more
individual reads to fetch a datum.  The whole point of pickle framing is
to replace many small reads with a few large reads, and variable-length
encoding is adverse to that goal.
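
To illustrate the read-count concern, here is a rough sketch, not taken
from the pickle implementation. CountingReader, read_vlq and read_fixed
are hypothetical names, and the 4-byte little-endian field is just one
example of a fixed-width layout. It assumes the decoder pulls one byte
at a time from the stream, since it cannot know the length in advance.

```python
import io
import struct


class CountingReader:
    """Wrap a bytes buffer and count how many read() calls are made."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)
        self.reads = 0

    def read(self, n: int) -> bytes:
        self.reads += 1
        return self._buf.read(n)


def read_vlq(stream) -> int:
    """Read a MIDI-style variable-length integer, one byte per read()."""
    value = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a variable-length integer")
        byte = chunk[0]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:       # continuation bit clear: done
            return value


def read_fixed(stream) -> int:
    """Read a fixed-width 4-byte little-endian unsigned integer in one read()."""
    (value,) = struct.unpack("<I", stream.read(4))
    return value


vlq_stream = CountingReader(b"\x81\x80\x00")            # 16384 as three chunks
fixed_stream = CountingReader(struct.pack("<I", 16384))
vlq_value = read_vlq(vlq_stream)
fixed_value = read_fixed(fixed_stream)
```

Both calls decode 16384, but the variable-length path issues three
read() calls where the fixed-width path issues one.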

Regards

Antoine.



