performance of pickling & large lists :-(

Martin v. Loewis martin at v.loewis.de
Tue Aug 6 15:21:28 EDT 2002


Andreas.Leitgeb at siemens.at (Andreas Leitgeb) writes:

> Is pickling more complicated internally, than I expected it to be ?

Most likely. Pickle must keep a memo of all objects pickled so far, so
that if the identically-same object is pickled again, only a reference
to the earlier object is pickled.
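
You can see the memo at work with a tiny example:

    import pickle

    shared = [1, 2, 3]
    data = (shared, shared)   # the same list object, referenced twice

    restored = pickle.loads(pickle.dumps(data))

    # Only the first occurrence was fully serialized; the second became
    # a memo reference, so the shared identity survives the round trip:
    assert restored[0] is restored[1]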

This memo is implemented as a dictionary that uses the id of each
object as its key. You can't use the object itself as a key, since it
might not be hashable. So each pickled object requires the
construction of an id object, an int representing the address of the
object.

It turns out that pickle also allocates a tuple as the value of each
memo entry. I believe that tuple could be avoided, reducing the memory
needs of pickle somewhat.
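
Roughly, the per-object bookkeeping looks like this (a simplified
sketch, not pickle.py's exact code):

    def memoize(memo, obj, position):
        # id(obj) is hashable even when obj itself is not, but computing
        # it builds a fresh int object for every object pickled.
        key = id(obj)
        # The value is the tuple mentioned above; the reference to obj
        # that it carries also keeps the object alive, so its address
        # (and thus its id) cannot be reused while pickling runs.
        memo[key] = (position, obj)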

> Another problem:
>   When I create a few objects (just enough to see Python's memory
> consumption go up in "top") and then release them ("del"), I see
> memory consumption go down (almost) to the old size.
>   But when I create a really large list and del it, the memory
> footprint remains large.
> 
> Is this a bug or a feature?

Neither. This is just how your C library implements malloc(3).

In some cases, releasing memory (via free(3)) returns it to the
operating system. In other cases, this is not possible, e.g. when
returning the freed block would leave a hole in the heap. The block is
then just put on a free list for later reuse, and top still shows it
as consumed.
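
You can watch this from within Python. The sketch below is
Linux-specific (it assumes /proc/self/status exists), and whether the
number drops after the del depends entirely on your allocator:

    def rss_kb():
        # Current resident set size in KB; Linux-specific.
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1])

    big = [0] * (5 * 10 ** 6)               # a really large list
    print('after alloc: %d KB' % rss_kb())
    del big
    print('after del:   %d KB' % rss_kb())  # may stay large: blocks that
                                            # free(3) cannot return just
                                            # go on malloc's free list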

HTH,
Martin



