# optimizing large dictionaries

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jan 16 00:43:20 CET 2009

On Thu, 15 Jan 2009 14:49:29 -0800, bearophileHUGS wrote:

> Matimus, your suggestions are all good.
>
> Try-except is slower than:
> if x in adict: ... else: ...

Not according to my tests.

```python
>>> def tryexcept(D, key):
...     try:
...         return D[key]
...     except KeyError:
...         return None
...
>>> def ifinelse(D, key):
...     if key in D:
...         return D[key]
...     else:
...         return None
...
>>> from timeit import Timer
>>> setup = "from __main__ import tryexcept, ifinelse; D = {2: 'c'}"
>>>
>>> t1 = Timer('tryexcept(D, 2)', setup)
>>> t2 = Timer('ifinelse(D, 2)', setup)
>>>
>>> min(t1.repeat())  # try...except
0.61699414253234863
>>> min(t2.repeat())  # if...else
0.71856999397277832
```

Perhaps what you meant to say is that try...except is slower *if the key
is not found*. And in that case, there's a truly massive difference:

```python
>>> t3 = Timer('tryexcept(D, 5)', setup)
>>> t4 = Timer('ifinelse(D, 5)', setup)
>>>
>>> min(t3.repeat())  # try...except
5.3846139907836914
>>> min(t4.repeat())  # if...else
0.5983281135559082
```

Which solution is faster on average will depend on the ratio of keys
found to keys not found. On my PC, based on the above results,
try...except pays off only when roughly 98% or more of lookups succeed.
But if you can arrange matters so that nearly all keys are found, then
try...except can be faster.

There's another solution nobody has mentioned (as far as I can see): the
get method. In my test I call get() from inside a function, rather than
directly, so that the timing results include the function call overhead
for all three strategies.

```python
>>> def get(D, key):
...     return D.get(key)
...
>>>
>>> setup = "from __main__ import get; D = {2: 'c'}"
>>> t5 = Timer('get(D, 2)', setup)
>>> t6 = Timer('get(D, 5)', setup)
>>>
>>> min(t5.repeat())  # found key
0.88362312316894531
>>> min(t6.repeat())  # missing key
0.87782382965087891
```

--
Steven
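To put a number on the break-even point, one can take the four timings measured above as per-strategy costs and solve a weighted average for the hit rate at which the two strategies tie (a quick back-of-the-envelope sketch; the variable names are mine, not from the thread):

```python
# Average cost of each strategy as a function of the hit rate p,
# using the four timeit results measured above.
try_hit, try_miss = 0.61699, 5.38461   # tryexcept(): key found / key missing
if_hit, if_miss = 0.71857, 0.59833     # ifinelse(): key found / key missing

# Solve try_hit*p + try_miss*(1 - p) == if_hit*p + if_miss*(1 - p) for p,
# the fraction of lookups that find their key.
p = (try_miss - if_miss) / ((try_miss - if_miss) + (if_hit - try_hit))
print(f"try...except is faster once the hit rate exceeds {p:.1%}")
```

On these numbers the crossover lands just under 98%, which matches the observation that try...except only wins when nearly all keys are found.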
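One caveat with the get() wrapper as written: returning None is ambiguous when None can itself be a stored value. dict.get accepts an explicit default for exactly that case (a minimal sketch, reusing the same test dict as the timings above):

```python
# dict.get takes an optional second argument, so a fallback other than
# None can be supplied without any extra branching.
D = {2: 'c'}
assert D.get(2) == 'c'       # key found: the stored value
assert D.get(5) is None      # key missing: default default of None
assert D.get(5, 'x') == 'x'  # key missing: explicit fallback
```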