Strategy for determing difference between 2 very large dictionaries

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Dec 24 08:46:04 CET 2008


En Wed, 24 Dec 2008 05:16:36 -0200, <python at bdurham.com> escribió:

> I'm looking for suggestions on the best ('Pythonic') way to
> determine the difference between 2 very large dictionaries
> containing simple key/value pairs.
> By difference, I mean a list of keys that are present in the
> first dictionary, but not the second. And vice versa. And a list
> of keys in common between the 2 dictionaries whose values are
> different.
> The 2 strategies I'm considering are:
> 1. Brute force: Iterate through first dictionary's keys and
> determine which keys it has that are missing from the second
> dictionary. If keys match, then verify that the 2 dictionaries
> have identical values for the same key. Repeat this process for
> the second dictionary.
> 2. Use sets: Create sets from each dictionary's list of keys and
> use Python's set methods to generate a list of keys present in
> one dictionary but not the other (for both dictionaries) as well
> as a set of keys the 2 dictionaries have in common.

I cannot think of any advantage of the first approach - so I'd use sets.

k1 = set(dict1.iterkeys())
k2 = set(dict2.iterkeys())
k1 - k2 # keys in dict1 not in dict2
k2 - k1 # keys in dict2 not in dict1
k1 & k2 # keys in both

> Using the set
> of keys in common, compare values across dictionaries to
> determine which keys have different values (can this last step be
> done via a simple list comprehension?)

Yes; but isn't a dict comprehension more adequate?

[key: (dict1[key], dict2[key]) for key in common_keys if  
dict1[key]!=dict2[key]}

(where common_keys=k1&k2 as above)

-- 
Gabriel Genellina




More information about the Python-list mailing list