[issue20416] Marshal: special case int and float, don't use references

STINNER Victor report at bugs.python.org
Tue Jan 28 12:34:43 CET 2014


STINNER Victor added the comment:

> Did you test with numerous shared ints and floats? [1000] * 1000000 and [1000.0] * 1000000? AFAIK these were important use cases for adding versions 3 and 4.

Here are new benchmarks on Python 3.4 with:

    Integers: [1000] * 1000000
    Floats: [1000.0] * 1000000

Integers, without the patch:

    dumps v3: 62.8 ms
    data size v3: 4882.8 kB
    loads v3: 10.7 ms

Integers, with the patch:

    dumps v3: 18.6 ms (-70%)
    data size v3: 4882.8 kB (same size)
    loads v3: 27.7 ms (+158%)

Floats, without the patch:

    dumps v3: 62.5 ms
    data size v3: 4882.8 kB
    loads v3: 11.0 ms

Floats, with the patch:

    dumps v3: 29.3 ms (-53%)
    data size v3: 8789.1 kB (+80%)
    loads v3: 25.5 ms (+132%)
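
Measurements of this kind can be reproduced with a small script along the following lines. This is only a sketch, not the script used for the numbers above: the helper names `bench` and `report`, the best-of-5 timing, and the use of `time.perf_counter` are assumptions made for illustration.

```python
import marshal
import time


def bench(obj, version=3, repeat=5):
    """Return (best dumps time in s, best loads time in s, size in bytes).

    Hypothetical helper: the timing method is an assumption, not the
    one used for the figures quoted in this message.
    """
    dump_times, load_times = [], []
    for _ in range(repeat):
        t0 = time.perf_counter()
        data = marshal.dumps(obj, version)
        t1 = time.perf_counter()
        result = marshal.loads(data)
        t2 = time.perf_counter()
        dump_times.append(t1 - t0)
        load_times.append(t2 - t1)
    # Sanity check: the round trip must preserve the value.
    assert result == obj
    return min(dump_times), min(load_times), len(data)


def report(label, obj, version=3):
    """Print timings in ms and size in kB for one test object."""
    dumps_s, loads_s, size = bench(obj, version)
    print("%s:" % label)
    print("    dumps v%d: %.1f ms" % (version, dumps_s * 1e3))
    print("    data size v%d: %.1f kB" % (version, size / 1024))
    print("    loads v%d: %.1f ms" % (version, loads_s * 1e3))


if __name__ == "__main__":
    n = 1000000
    report("Integers", [1000] * n)
    report("Floats", [1000.0] * n)
```

Absolute timings depend on the machine and the Python build. The sizes above look consistent with about 5 bytes per element when an element is written as a reference or as a 32-bit int (one type byte plus 4 bytes), and about 9 bytes when each float is written inline (one type byte plus 8 bytes).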

Version 3 was added by:
---
changeset:   82816:01372117a5b4
user:        Kristján Valur Jónsson <sweskman at gmail.com>
date:        Tue Mar 19 18:02:10 2013 -0700
files:       Doc/library/marshal.rst Include/marshal.h Lib/test/test_marshal.py Misc/NEWS Python/marshal.c
description:
Issue #16475: Support object instancing, recursion and interned strings in marshal
---

Issue #16475 is about "sharing string constants, common tuples, even common code objects", not about sharing numbers.

For real data, here are interesting numbers:
http://bugs.python.org/issue16475#msg176013

Integers represent only 4.8% of the serialized data, and only 8.2% of these integers can be shared. (Floats represent 0.29%.) By contrast, strings represent 58%, and 57% can be shared.
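
To put these percentages side by side, here is a rough back-of-the-envelope calculation. It assumes that the "57%" is a fraction of the strings, in the same way that the "8.2%" is a fraction of the integers, and that both shares are measured against the same total; the variable names are illustrative only.

```python
# Figures quoted above (from msg176013), expressed as fractions.
int_fraction, int_shareable = 0.048, 0.082
str_fraction, str_shareable = 0.58, 0.57

# Share of all serialized data that could benefit from references,
# under the assumption that both percentages use the same base.
int_gain = int_fraction * int_shareable
str_gain = str_fraction * str_shareable

print("shareable integers: {:.2%} of the data".format(int_gain))
print("shareable strings:  {:.2%} of the data".format(str_gain))
```

Under these assumptions, shareable integers account for roughly 0.4% of the data, against roughly 33% for shareable strings.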

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20416>
_______________________________________

