[Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

Antoine Pitrou solipsis at pitrou.net
Tue Jan 28 12:11:43 CET 2014


On Tue, 28 Jan 2014 11:22:40 +0100
Victor Stinner <victor.stinner at gmail.com> wrote:
> 2014-01-28 "Martin v. Löwis" <martin at v.loewis.de>:
> > Debugging reveals that it is actually the many integer objects which
> > trigger the sharing code. So a much simplified example of Victor's
> > benchmarking code can use
> >
> > data = [0]*10000000
> >
> > The difference between version 2 and version 3 here is that v2 marshals
> > a lot of "0" integers, whereas version 3 marshals a single one, and then
> > a lot of references to this integer.
> 
> Since the output size looks to be the same, it may be interesting to
> special-case small integers, or even integers and floats in general.
> Handling references to these numbers probably takes more CPU, whereas
> the gain in file size is probably minor.
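
To make the size comparison concrete, here is a minimal sketch that dumps
Martin's simplified list with both protocol versions and prints the byte
counts. The helper name dumped_size and the smaller list length are my own
choices for illustration, not part of the original benchmark.

```python
import marshal

def dumped_size(data, version):
    """Return the number of bytes marshal.dumps() produces for data."""
    return len(marshal.dumps(data, version))

# A shorter list than Martin's example, so the sketch runs quickly.
n = 100000
data = [0] * n

# Version 2 writes every 0 as a full integer record; version 3 writes the
# first 0 once and then a back-reference for each later occurrence.
size_v2 = dumped_size(data, 2)
size_v3 = dumped_size(data, 3)

print("version 2:", size_v2, "bytes")
print("version 3:", size_v3, "bytes")
```

If the two numbers come out equal, that matches Victor's observation: for
small integers a reference record appears to cost as many bytes as the
integer itself.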

Please remember file size is only one factor. Another factor is runtime
size after unmarshalling.
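
A rough way to see the runtime effect is to count how many distinct objects
exist after loading. The sketch below uses a float rather than 0, on the
assumption that CPython's small-integer cache would hide the difference for
0; the helper name distinct_objects_after_load is hypothetical.

```python
import marshal

def distinct_objects_after_load(data, version):
    """Dump and reload data, then count distinct element objects by identity."""
    loaded = marshal.loads(marshal.dumps(data, version))
    # All elements stay alive inside 'loaded', so equal ids mean shared objects.
    return len({id(item) for item in loaded})

n = 100000
value = 1.5
data = [value] * n  # one float object referenced n times

objects_v2 = distinct_objects_after_load(data, 2)
objects_v3 = distinct_objects_after_load(data, 3)

print("version 2: distinct float objects after load:", objects_v2)
print("version 3: distinct float objects after load:", objects_v3)
```

Fewer distinct objects after loading means less memory held by the
unmarshalled data, independently of how large the dumped bytes were.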

For the typical case of pyc files, dump times are not very important.
Load times are.
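
Anyone who wants to compare the two costs separately could time dumps() and
loads() on their own, for instance with timeit. This is only a sketch on a
synthetic list, not a measurement of real pyc files; the function name
time_dump_and_load and the repeat count are assumptions of mine.

```python
import marshal
import timeit

def time_dump_and_load(data, version, repeat=5):
    """Return (best dump time, best load time) in seconds for one version."""
    payload = marshal.dumps(data, version)
    dump_time = min(timeit.repeat(
        lambda: marshal.dumps(data, version), number=1, repeat=repeat))
    load_time = min(timeit.repeat(
        lambda: marshal.loads(payload), number=1, repeat=repeat))
    return dump_time, load_time

data = [0] * 100000

for version in (2, 3):
    dump_time, load_time = time_dump_and_load(data, version)
    print("version %d: dump %.6f s, load %.6f s"
          % (version, dump_time, load_time))
```

Taking the minimum over several runs reduces noise from other processes;
the absolute numbers will still vary between machines and Python builds.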

Regards

Antoine.



