marshal.dumps quadratic growth and marshal.dump not allowing file-like objects

Peter Otten __peter__ at web.de
Sun Jun 15 11:47:03 CEST 2008


bkustel at gmail.com wrote:

> I'm stuck on a problem where I want to use marshal for serialization
> (yes, yes, I know (c)Pickle is normally recommended here). I favor
> marshal for speed for the types of data I use.
> 
> However, it seems that marshal.dumps() has a quadratic performance
> issue for large objects, which I assume is because it grows its
> memory buffer in constant increments. This causes a nasty slowdown for
> marshaling large objects. I thought I would get around this by passing
> a cStringIO.StringIO object to marshal.dump() instead but I quickly
> learned this is not supported (only true file objects are supported).
> 
> Any ideas about how to get around the marshal quadratic issue? Any
> hope for a fix for that on the horizon? Thanks for any information.

Here's how marshal resizes the string:

        newsize = size + size + 1024;
        if (newsize > 32*1024*1024) {
                newsize = size + 1024*1024;
        }
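
To get a feel for what that policy costs, here is a rough Python model of
it. This is only a sketch, not the interpreter's code: simulate_growth is a
made-up helper, the 50-byte starting size is an assumption, and it charges
a full copy of the old buffer on every resize, which is the worst case for
realloc.

```python
MB = 1024 * 1024


def simulate_growth(target_size, start_size=50):
    """Model the resize policy quoted above.

    Returns (number of resizes, bytes copied in the worst case) needed
    to grow a buffer from start_size to at least target_size.
    start_size=50 is an assumption made for this sketch.
    """
    size = start_size
    resizes = 0
    copied = 0
    while size < target_size:
        # Same arithmetic as the C excerpt: double plus 1KB ...
        newsize = size + size + 1024
        if newsize > 32 * MB:
            # ... unless that would exceed 32MB, then add a flat 1MB.
            newsize = size + MB
        # Worst case: the old contents are copied on every resize.
        copied += size
        size = newsize
        resizes += 1
    return resizes, copied


if __name__ == "__main__":
    for target in (16 * MB, 256 * MB, 1024 * MB, 2048 * MB):
        resizes, copied = simulate_growth(target)
        print("%5d MB: %5d resizes, %d bytes copied" % (target // MB, resizes, copied))
```

In this model, going from a 1GB to a 2GB result roughly quadruples the
bytes copied, while below 32MB the total copied stays within a small
multiple of the final size.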

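So the buffer roughly doubles until it would pass 32MB and after that grows
by only 1MB per resize, which is where the quadratic behaviour comes from.
Maybe you can split your large objects and marshal multiple objects to keep
the size of each below the 32MB limit.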
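
A minimal sketch of that splitting idea, assuming the large object is a
list that can be sliced. dump_chunks, load_chunks and the 8-byte length
prefix are inventions for this example, not part of marshal. Since it only
calls marshal.dumps() and write(), it also works with file-like objects
such as an in-memory buffer. Choosing chunk_len so that each chunk stays
under 32MB is left to the caller.

```python
import io
import marshal
import struct

# 8-byte little-endian length prefix in front of every chunk.
# This framing is an assumption of the sketch, not a marshal feature.
_LEN = struct.Struct("<Q")


def dump_chunks(items, fileobj, chunk_len=10000):
    """Marshal a list in slices of chunk_len items, one record per slice."""
    for start in range(0, len(items), chunk_len):
        blob = marshal.dumps(items[start:start + chunk_len])
        fileobj.write(_LEN.pack(len(blob)))
        fileobj.write(blob)


def load_chunks(fileobj):
    """Read back every record written by dump_chunks and rejoin the list."""
    items = []
    while True:
        header = fileobj.read(_LEN.size)
        if not header:
            break  # clean end of file
        (blob_len,) = _LEN.unpack(header)
        items.extend(marshal.loads(fileobj.read(blob_len)))
    return items


if __name__ == "__main__":
    data = list(range(25000))
    buf = io.BytesIO()
    dump_chunks(data, buf)
    buf.seek(0)
    print(load_chunks(buf) == data)
```

The length prefix lets the reader use marshal.loads() on exactly one chunk
at a time, so it never has to rely on marshal.load() with a non-file
object.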

Peter


