[issue1792] o(n*n) marshal.dumps performance for largish objects with patch

Aaron Watters report at bugs.python.org
Fri Jan 11 15:21:26 CET 2008


New submission from Aaron Watters:

Much to my surprise I found that one of
my applications seemed to be running slow as a result
of marshal.dumps.  I think the culprit is the w_more(...)
function, which grows the marshal buffer in 1k units.
This means that a marshal of size 100k will have 100
reallocations and string copies.  Other parts of Python
(and java etc) have a proportional reallocation strategy
which reallocates a new size based on the existing size.
This mean a 100k marshal requires just 5 or so
reallocations and string copies (n log n versus n**2
asymptotic performance).

I humbly submit the following patch (based on python 2.6a0
source).  I think it is a strict improvement on the existing
code, but I've been wrong before (twice ;)).
  -- Aaron Watters

----------
components: Interpreter Core
files: marshal.diff
messages: 59710
nosy: aaron_watters
severity: normal
status: open
title: o(n*n) marshal.dumps performance for largish objects with patch
type: resource usage
versions: Python 2.6
Added file: http://bugs.python.org/file9122/marshal.diff

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1792>
__________________________________


More information about the Python-bugs-list mailing list