[portland] Python dictionary performance as memory usage increases

Ryan Roser ryanroser at gmail.com
Fri Aug 5 19:42:33 CEST 2011


Hi,

I'm trying to improve the performance of some of my code.  I've noticed that
one of the bottlenecks involves making a large dictionary where the values
are lists.  Making a large dictionary is fast, repeatedly creating lists is
fast, but things slow down if I set the lists as values for the dictionary.
 Interestingly, this slowdown only occurs if there is already data in memory
in Python, and things get increasingly slow as the amount of memory used
increases.

I have a toy example demonstrating the behavior below. Do you know why this
is happening?  Is there a problem with my test?  Does Python do something
special when storing lists as values in dictionaries?  Is there a workaround
or an alternative data structure that doesn't exhibit slowdown as Python's
memory usage increases?

Thanks for the help,

Ryan



####################################
##  A test script
####################################
import time
import random

x = range(100000)
def test():
    # Creating a dictionary with an entry for each element in x
    # is fast, and so is repeatedly creating a list
    start = time.time()
    d = dict()
    for i in x:
        tmp = []
        tmp.append('something')
        d[i] = 1
    print 'dict w/o lists:', time.time() - start

    # but assigning the list to the dictionary gets very slow
    # if memory is not empty
    start = time.time()
    d = dict()
    for i in x:
        tmp = []
        tmp.append('something')
        d[i] = tmp
    print 'dict w lists:  ', time.time() - start

print 'runtimes with memory empty'
test()
print 'loading data'
data = [random.random() for i in xrange(30000000)] # ~1gb of mem
print 'runtimes with memory occupied'
test()
####################################


Results:

$ python2.4 tester.py
runtimes with memory empty
dict w/o lists: 0.0506901741028
dict w lists:   0.0766770839691
loading data
runtimes with memory occupied
dict w/o lists: 0.0391671657562
dict w lists:   2.18966984749

$ python2.6 tester.py
runtimes with memory empty
dict w/o lists: 0.0479600429535
dict w lists:   0.0784649848938
loading data
runtimes with memory occupied
dict w/o lists: 0.0361380577087
dict w lists:   2.49754095078

$ python2.7 tester.py
runtimes with memory empty
dict w/o lists: 0.0464890003204
dict w lists:   0.0735650062561
loading data
runtimes with memory occupied
dict w/o lists: 0.0356121063232
dict w lists:   2.49307012558


######## Python versions and machine info #########
Machine has 32 gb of ram, 8 cores

Python 2.4.3 (#1, Sep  3 2009, 15:37:37)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2

ActivePython 2.6.5.14 (ActiveState Software Inc.) based on
Python 2.6.5 (r265:79063, Jul  5 2010, 10:31:13)
[GCC 4.0.0 20050519 (Red Hat 4.0.0-8)] on linux2

Python 2.7.1 (r271:86832, May 25 2011, 13:34:05)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2

$ uname -a
Linux research-team10 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009
x86_64 x86_64 x86_64 GNU/Linux
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/portland/attachments/20110805/95a97ef2/attachment.html>


More information about the Portland mailing list