marshal.dumps quadratic growth and marshal.dump not allowing file-like objects

John Machin sjmachin at
Sun Jun 15 12:16:51 CEST 2008

On Jun 15, 7:47 pm, Peter Otten <__pete... at> wrote:
> bkus... at wrote:
> > I'm stuck on a problem where I want to use marshal for serialization
> > (yes, yes, I know (c)Pickle is normally recommended here). I favor
> > marshal for speed for the types of data I use.
> > However, it seems that marshal.dumps() has a quadratic performance
> > issue for large objects, which I assume is because it grows its
> > memory buffer in constant increments. This causes a nasty slowdown for
> > marshaling large objects. I thought I would get around this by passing
> > a cStringIO.StringIO object to marshal.dump() instead but I quickly
> > learned this is not supported (only true file objects are supported).
> > Any ideas about how to get around the marshal quadratic issue? Any
> > hope for a fix for that on the horizon? Thanks for any information.
> Here's how marshal resizes the string:
>         newsize = size + size + 1024;
>         if (newsize > 32*1024*1024) {
>                 newsize = size + 1024*1024;
>         }
> Maybe you can split your large objects and marshal multiple objects to keep
> the size below the 32MB limit.

But that change went into the svn trunk on 11-May-2008; perhaps the OP
is using a production release, which would still have the previous
version: merely "newsize = size + 1024;".

Do people really generate 32MB pyc files, or is stopping doubling at
32MB just a safety valve in case someone/something runs amok?
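
Going back to Peter's suggestion of splitting the object: if the data is a
big list, one way to do it might look like the sketch below. It writes a
chunk count followed by each chunk as a separate marshal value, so no single
value gets very large. The names dump_chunked and load_chunked and the
count-first layout are made up for illustration; on the Python versions
discussed here, f would need to be a real file object opened in binary mode,
as the OP found.

```python
import marshal


def dump_chunked(items, f, chunk_size):
    # Write the number of chunks first, then each slice of `items` as its
    # own marshal value. `f` is a file opened for binary writing.
    n_chunks = (len(items) + chunk_size - 1) // chunk_size
    marshal.dump(n_chunks, f)
    for start in range(0, len(items), chunk_size):
        marshal.dump(items[start:start + chunk_size], f)


def load_chunked(f):
    # Read the chunk count, then read that many values back and
    # concatenate them into a single list.
    n_chunks = marshal.load(f)
    items = []
    for _ in range(n_chunks):
        items.extend(marshal.load(f))
    return items
```

The chunk size is a tuning knob: small enough that each marshalled chunk
stays well under the size where resizing hurts, large enough that the
per-call overhead doesn't dominate.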

