Strategy for determing difference between 2 very large dictionaries
python at bdurham.com
python at bdurham.com
Wed Dec 24 09:23:00 CET 2008
Thank you very much for your feedback!
> k1 = set(dict1.iterkeys())
I noticed you suggested .iterkeys() vs. .keys(). Is there any advantage
to using an iterator vs. a list as the basis for creating a set? I
understand that an iterator makes sense if you're working with a large
set of items one at a time, but if you're creating a non-filtered
collection, I don't see the advantage of using an iterator or a list.
I'm sure I'm missing a subtle point here :)
>> can this last step be done via a simple list comprehension?
> Yes; but isn't a dict comprehension more adequate?
> [key: (dict1[key], dict2[key]) for key in common_keys if
Cool!! I'm relatively new to Python and totally missed the ability to
work with dictionary comprehensions. Yes, your dictionary comprehension
technique is much better than the list comprehension approach I was
struggling with. Your dictionary comprehension statement describes
exactly what I wanted to write.
----- Original message -----
From: "Gabriel Genellina" <gagsl-py2 at yahoo.com.ar>
To: python-list at python.org
Date: Wed, 24 Dec 2008 05:46:04 -0200
Subject: Re: Strategy for determing difference between 2 very large
En Wed, 24 Dec 2008 05:16:36 -0200, <python at bdurham.com> escribió:
> I'm looking for suggestions on the best ('Pythonic') way to
> determine the difference between 2 very large dictionaries
> containing simple key/value pairs.
> By difference, I mean a list of keys that are present in the
> first dictionary, but not the second. And vice versa. And a list
> of keys in common between the 2 dictionaries whose values are
> The 2 strategies I'm considering are:
> 1. Brute force: Iterate through first dictionary's keys and
> determine which keys it has that are missing from the second
> dictionary. If keys match, then verify that the 2 dictionaries
> have identical values for the same key. Repeat this process for
> the second dictionary.
> 2. Use sets: Create sets from each dictionary's list of keys and
> use Python's set methods to generate a list of keys present in
> one dictionary but not the other (for both dictionaries) as well
> as a set of keys the 2 dictionaries have in common.
I cannot think of any advantage of the first approach - so I'd use sets.
k1 = set(dict1.iterkeys())
k2 = set(dict2.iterkeys())
k1 - k2 # keys in dict1 not in dict2
k2 - k1 # keys in dict2 not in dict1
k1 & k2 # keys in both
> Using the set
> of keys in common, compare values across dictionaries to
> determine which keys have different values (can this last step be
> done via a simple list comprehension?)
Yes; but isn't a dict comprehension more adequate?
[key: (dict1[key], dict2[key]) for key in common_keys if
(where common_keys=k1&k2 as above)
More information about the Python-list