try vs. has_key()
Christian Tismer
tismer at appliedbiometrics.com
Fri Apr 23 09:16:25 EDT 1999
Aahz Maruch wrote:
>
> I've seen roughly half the people here doing
>
> try:
>     dict[key].append(foo)
> except:
>     dict[key] = [foo]
>
> with the other half doing
>
> if dict.has_key(key):
>     dict[key].append(foo)
> else:
>     dict[key] = [foo]
>
> Can people explain their preferences?
Well, besides the other good answers, there's more to say.
From a design point of view, both versions have their place.
The has_key version reflects the expectation of many missing keys,
where a missing key is not exceptional. The try..except version
suggests that you will usually find the key, while a non-existent
key is exceptional.
I think the dict.get variant is just a more efficient
compromise somewhere in between.
It is also a matter of taste.
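As a sketch of that middle ground (using a made-up dict `d` and today's Python syntax), dict.get folds the test and the fetch into one lookup:

```python
d = {"spam": [1]}

# Present key: one lookup replaces has_key() plus indexing.
lis = d.get("spam")
if lis is not None:
    lis.append(2)
else:
    d["spam"] = [2]

# Missing key: get() quietly returns None instead of raising KeyError.
lis = d.get("eggs")
if lis is not None:
    lis.append(3)
else:
    d["eggs"] = [3]

print(d)
```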
But on execution speed, there are the measurements in Barry's paper,
and he is still right, although I would need measurements for
dict.get as well.
Note that a loop running over dict.get calls is still a bit
slower than indexing the dict, due to some internal method
access overhead. This applies even when the dict.get method
is assigned to a local variable. Indexing is the fastest
way to access a dict element when you expect no key errors.
I have no exact measurements yet, but will provide some soon.
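A quick way to check that overhead today (using the standard timeit module, which postdates this post; the data is made up) is:

```python
import timeit

# A dict with no missing keys, so plain indexing never raises.
d = {}
for i in range(1000):
    d[i] = i * 2
keys = list(d.keys())

def by_index():
    return [d[k] for k in keys]

def by_get():
    return [d.get(k) for k in keys]

def by_bound_get():
    g = d.get  # bind the method once, outside the loop
    return [g(k) for k in keys]

# All three see the same values; indexing is typically the fastest.
assert by_index() == by_get() == by_bound_get()

for fn in (by_index, by_get, by_bound_get):
    print("%-13s %.4f s" % (fn.__name__, timeit.timeit(fn, number=200)))
```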
To optimize things if you can expect a very small number of
key failures, a different approach can help to get the
last cycles squeezed out of the code:
If it is possible for your algorithm, you can put the
try..except clause outside of some loop.
This saves a statement and the exception clause setup as well.
Your code must be changed quite heavily, since it has to
restart the loop where it failed, but there are a number of
cases where I need this kind of optimization.
Here is an example which doesn't try to look well-designed,
but to run as fast as possible. I recommend providing
a second, more readable version of the function for
documentation.
Assuming you have code which runs in a loop and tries to collect
new elements as your code does, here are a good, "nearly fast"
version, one with "get", and a faster, ugly one:
def count_many(dict, itemlist):
    for key, value in itemlist:
        if dict.has_key(key):
            dict[key].append(value)
        else:
            dict[key] = [value]
def count_many_get(dict, itemlist):
    for key, value in itemlist:
        lis = dict.get(key)
        # a truth test is safe here: the stored lists are never empty
        if lis:
            lis.append(value)
        else:
            dict[key] = [value]
def count_many_fast(dict, itemlist):
    while 1:
        try:
            k = 1
            for key, value in itemlist:
                dict[key].append(value); k = k + 1
            break
        except KeyError:
            # the key/value pair that failed is handled here,
            # then the loop restarts just past it
            dict[key] = [value]
            itemlist = itemlist[k:]
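The `timing` helper used in the results below is not defined in the post; a plausible stand-in matching its call signature (the name and return shape are guesses) is:

```python
import time

def timing(func, args, repeat=1):
    """Call func(*args) `repeat` times; return (elapsed_seconds, last_result)."""
    start = time.time()
    result = None
    for _ in range(repeat):
        result = func(*args)
    return (time.time() - start, result)
```

With this, `timing(count_many, (d, list), 1000)[0]` would be the total seconds for 1000 runs.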
results = """
# given a list of 1000 tuples, and appending
# this 1000 times, we measure
>>> d={} ; print timing(count_many, (d, list) ,1000)[0]
17.8
>>> d={} ; print timing(count_many_get, (d, list) ,1000)[0]
16.15
>>> d={} ; print timing(count_many_fast, (d, list) ,1000)[0]
13.73
>>>
"""
ciao - chris
--
Christian Tismer :^) <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home