Large list in memory slows Python
bthiell at cfa.harvard.edu
Wed Jan 4 09:57:56 EST 2012
On Tue, Jan 3, 2012 at 5:59 PM, Peter Otten <__peter__ at web.de> wrote:
> Benoit Thiell wrote:
>> I am experiencing a puzzling problem with both Python 2.4 and Python
>> 2.6 on CentOS 5. I'm looking for an explanation of the problem and
>> possible solutions. Here is what I did:
>> Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
>> IPython 0.8.4 -- An enhanced Interactive Python.
>> In : def test():
>> ...: return [(i,) for i in range(10**6)]
>> In : %time x = test()
>> CPU times: user 0.82 s, sys: 0.04 s, total: 0.86 s
>> Wall time: 0.86 s
>> In : big_list = range(50 * 10**6)
>> In : %time y = test()
>> CPU times: user 9.11 s, sys: 0.03 s, total: 9.14 s
>> Wall time: 9.15 s
>> As you can see, after creating a list of 50 million integers, creating
>> the same list of 1 million tuples takes about 10 times longer than the
>> first time.
>> I ran these tests on a machine with 144GB of memory and it is not
>> swapping. Before creating the big list of integers, IPython used 111MB
>> of memory; after the creation, it used 1664MB.
> In older Pythons the heuristic used to decide when to run the cyclic garbage
> collector is not well suited to the creation of many objects in a row.
> Try switching it off temporarily with
>
> import gc
> gc.disable()
> # create many objects that are here to stay
> gc.enable()
>
> You may also incorporate that into your test function:
>
> def test():
>     gc.disable()
>     try:
>         return [(i,) for i in range(10**6)]
>     finally:
>         gc.enable()
Thanks Peter, this is very helpful. Modifying my test according to
your directions produced much more consistent results.
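For anyone hitting the same slowdown, here is a minimal, self-contained sketch of the workaround Peter describes: time the same allocation with and without the cyclic collector running, re-enabling it no matter what. The `build` function stands in for the `test` function in the thread; the exact timings will vary by machine and Python version.

```python
import gc
import time

def build(n=10**6):
    """Allocate n single-element tuples, each a gc-tracked container."""
    return [(i,) for i in range(n)]

# Baseline run with the cyclic collector active.
t0 = time.perf_counter()
x = build()
with_gc = time.perf_counter() - t0

# Same allocation with collection paused; re-enable it even on error.
gc.disable()
try:
    t0 = time.perf_counter()
    y = build()
finally:
    gc.enable()
no_gc = time.perf_counter() - t0

print(f"with gc: {with_gc:.3f}s  without gc: {no_gc:.3f}s")
```

On CPython 3.7 and later, calling `gc.freeze()` after building the long-lived list is another option (it moves existing objects out of the collector's reach), and raising the generation-0 threshold with `gc.set_threshold()` reduces how often collection runs in the first place.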