Performance: sets vs dicts.

Peter Otten __peter__ at web.de
Mon Aug 30 02:36:06 EDT 2010


Seth Rees wrote:

> On 08/29/10 14:43, Peter Otten wrote:
>> John Nagle wrote:
>>
>>>      Is the "in" test faster for a dict or a set?
>>> Is "frozenset" faster than "set"?  Use case is
>>> for things like applying "in" on a list of 500 or so words
>>> while checking a large body of text.
>>
>> As Arnaud suspects: no significant difference:
>>
>> $ python dictperf.py
>> dict -->  0.210289001465
>> set -->  0.202902793884
>> frozenset -->  0.198950052261
>>
>> $ cat dictperf.py
>> import random
>> import timeit
>>
>> with open("/usr/share/dict/words") as instream:
>>      words = [line.strip() for line in instream]
>>
>> #random.seed(42)
>> sample = random.sample(words, 501)
>>
>> n = sample.pop()
>> y = random.choice(sample)
>>
>> d = dict.fromkeys(sample)
>> s = set(sample)
>> f = frozenset(sample)
>>
>>
>> for lookup in d, s, f:
>>      print type(lookup).__name__, "-->", timeit.timeit(
>>          "n in lookup; y in lookup",
>>          "from __main__ import lookup, n, y")
>>
>> Peter
> What about lists versus tuples?

You trade O(1) for O(N) with both, a bad idea for all but the smallest 
lists/tuples.

$ python dictperf.py
dict --> 0.221934080124
set --> 0.220952987671
frozenset --> 0.225043058395
list --> 21.5041530132
tuple --> 20.8655071259

By the way, the script will run on your computer, too ;)

Peter




More information about the Python-list mailing list