[New-bugs-announce] [issue19256] Optimize marshal format and add version token.

Kristján Valur Jónsson report at bugs.python.org
Mon Oct 14 00:30:08 CEST 2013

New submission from Kristján Valur Jónsson:

Issue #19219 added new tokens making the marshal format smaller and faster.
This patch adds two new tokens:
TYPE_SHORT_REF for which the ref index is a byte and
TYPE_VERSION for which the operand is the protocol version.

The former helps because it catches common singletons such as 0, 1, () and so on, which typically show up early in a marshal stream.  They then need only two bytes to encode.
This shrinks the code for the decimal.py module from 172K to 162K.
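
As a rough illustration of where the saving comes from (a toy sketch, not code from the patch): assume a regular reference is one opcode byte plus a four-byte index, while a short reference is one opcode byte plus a one-byte index. The opcode values and the helper names encode_ref and decode_ref below are hypothetical.

```python
import struct

# Hypothetical opcode bytes, chosen for illustration only; the values
# used by the patch are not stated in this message.
TYPE_REF = b"r"
TYPE_SHORT_REF = b"R"


def encode_ref(index):
    """Encode a back-reference, using the one-byte form when the index fits."""
    if index < 0:
        raise ValueError("ref index must be non-negative")
    if index <= 0xFF:
        # Opcode + 1-byte index: two bytes in total.
        return TYPE_SHORT_REF + struct.pack("<B", index)
    # Opcode + 4-byte little-endian index: five bytes in total.
    return TYPE_REF + struct.pack("<i", index)


def decode_ref(data):
    """Return the ref index stored in an encoded reference."""
    opcode, payload = data[:1], data[1:]
    if opcode == TYPE_SHORT_REF:
        return struct.unpack("<B", payload)[0]
    if opcode == TYPE_REF:
        return struct.unpack("<i", payload)[0]
    raise ValueError("not a reference opcode: %r" % opcode)
```

Under these assumptions, objects registered early in the stream (index below 256) cost two bytes per later use instead of five.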

The second can help break backwards compatibility requirements in the future.  The format version (if 4 or larger) is now put into the stream, so that future formats can reassign opcodes if needed.
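
The idea can be sketched roughly as follows. This is a toy model, not the patch itself: the opcode value, the one-byte operand size, and the helper names write_stream and read_version are all assumptions made for illustration.

```python
# Hypothetical opcode for the version token; the real value is not
# given in this message.
TYPE_VERSION = b"v"


def write_stream(payload, version):
    """Prefix the payload with a version token for versions 4 and up."""
    if version >= 4:
        # Assumes the version fits in a single byte operand.
        return TYPE_VERSION + bytes([version]) + payload
    # Older formats carry no token in this toy model.
    return payload


def read_version(data):
    """Return (version, payload); version is None when no token is present."""
    if data[:1] == TYPE_VERSION:
        if len(data) < 2:
            raise ValueError("truncated version token")
        return data[1], data[2:]
    return None, data
```

A reader that sees the token first can pick the opcode table for that version before decoding anything else, which is what would allow later formats to reuse opcodes.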

I don't reassign the version number, leaving it at the new value of 4.  This change is still backwards compatible with the previous '4' so there should be no problem.

For size / performance comparison, try:
python.exe -m timeit -s "import decimal; c=compile(open(decimal.__file__).read(), decimal.__file__, 'exec'); import marshal; d=marshal.dumps(c); print(len(d))" "marshal.loads(d)"
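
The same measurement can be written as a short script, which may be easier to run under two different builds and compare. This is a minimal sketch; the function name marshal_stats and the default repeat count are arbitrary choices, not part of the patch.

```python
import decimal
import marshal
import timeit


def marshal_stats(path, number=100):
    """Return (marshaled size in bytes, average seconds per marshal.loads)."""
    with open(path, encoding="utf-8") as f:
        code = compile(f.read(), path, "exec")
    data = marshal.dumps(code)
    seconds = timeit.timeit(lambda: marshal.loads(data), number=number)
    return len(data), seconds / number


if __name__ == "__main__":
    size, per_load = marshal_stats(decimal.__file__)
    print("marshaled size: %d bytes" % size)
    print("marshal.loads: %.1f usec per call" % (per_load * 1e6))
```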

files: marshal.patch
keywords: patch
messages: 199818
nosy: haypo, kristjan.jonsson, pitrou
priority: normal
severity: normal
status: open
title: Optimize marshal format and add version token.
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file32102/marshal.patch

Python tracker <report at bugs.python.org>
