dictionary help
Simon Forman
sajmikins at gmail.com
Tue Aug 11 12:10:10 EDT 2009
On Aug 11, 11:51 am, MRAB <pyt... at mrabarnett.plus.com> wrote:
> Krishna Pacifici wrote:
> > Thanks for the help.
>
> > Actually this is part of a much larger project, but I have unfortunately
> > pigeon-holed myself into needing to do these things without a whole lot
> > of flexibility.
>
> > To give a specific example I have the following dictionary where I need
> > to remove values that are duplicated with other values and remove values
> > that are duplicates of the keys, but still retain it as a dictionary.
> > Each value is itself a class with many attributes that I need to call
> > later on in the program, but I cannot have duplicates because it would
> > mess up some estimation part of my model.
>
> > d =
> > {36: [35, 37, 26, 46], 75: [74, 76, 65, 85], 21: [20, 22, 11, 31], 22:
> > [21, 23, 12, 32], 26: [25, 27, 16, 36], 30: [20, 31, 40]}
>
> > So I want a new dictionary that would get rid of the duplicate values of
> > 21, 22, 36 and 20 and give me back a dictionary that looked like this:
>
> > new_d=
> > {36: [35, 37, 26, 46], 75: [74, 76, 65, 85], 21: [20, 11, 31], 22: [23,
> > 12, 32], 26: [25, 27, 16], 30: [40]}
>
> > I understand that a dictionary may not be the best approach, but like I
> > said I have sort of pigeon-holed myself by the way that I am simulating
> > my data and the estimation model that I am using. Any suggestions or
> > comments about the above problem would be greatly appreciated.
>
> >>> d = {36: [35, 37, 26, 46], 75: [74, 76, 65, 85], 21: [20, 22, 11,
> 31], 22: [21, 23, 12, 32], 26: [25, 27, 16, 36], 30: [20, 31, 40]}
> >>> new_d = {}
> >>> seen = set(d.keys())
> >>> for k, v in d.items():
> ...     new_d[k] = [x for x in v if x not in seen]
> ...     seen |= set(new_d[k])
> ...
> >>> new_d
> {36: [35, 37, 46], 75: [74, 76, 65, 85], 21: [20, 11, 31], 22: [23, 12,
> 32], 26: [25, 27, 16], 30: [40]}
Ha ha, MRAB beat me to it:
d = {
    36: [35, 37, 26, 46],
    75: [74, 76, 65, 85],
    21: [20, 22, 11, 31],
    22: [21, 23, 12, 32],
    26: [25, 27, 16, 36],
    30: [20, 31, 40],
}

new_d = {  # Given, and apparently incorrect.
    36: [35, 37, 26, 46],  # 26 is a key and should be gone.
    75: [74, 76, 65, 85],
    21: [20, 11, 31],
    22: [23, 12, 32],
    26: [25, 27, 16],
    30: [40],
}

expected = {
    36: [35, 37, 46],
    75: [74, 76, 65, 85],
    21: [20, 11, 31],
    22: [23, 12, 32],
    26: [25, 27, 16],
    30: [40],
}

def removeDuplicates(D):
    '''
    Remove values that are duplicated with other values
    and remove values that are duplicates of the keys.

    Assumes that values in the lists are already unique within
    each list.  I.e. duplicates are only in the keys or in other
    lists.

    This function works "in place" on D, so it doesn't return
    anything.  Caller must keep a reference to D.
    '''
    seen = set(D)  # Get a set of the keys.

    # items() runs on both Python 2 and 3.  Only values of existing
    # keys are reassigned, so the dict never changes size mid-loop.
    for key, values_list in D.items():

        # Filter out values that have already been seen.
        filtered_values = [
            value
            for value in values_list
            if value not in seen
        ]

        # Remember newly seen values.
        seen.update(filtered_values)

        D[key] = filtered_values

## Example:
##
## >>> d == expected
## False
## >>> removeDuplicates(d)
## >>> d == expected
## True
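
If you still need the original d afterwards, here is a rough sketch of a variant that builds a new dict instead of working in place. The name remove_duplicates_copy and the key_order parameter are made up for illustration; they are not from MRAB's code or the function above. Which list gets to keep a shared value (20 and 31 here) depends on the order the keys are visited, so this version takes the order explicitly and falls back to sorted keys. It also drops repeats inside a single list, which the function above assumes don't occur.

```python
def remove_duplicates_copy(d, key_order=None):
    """Return a new dict with duplicate values removed; d is left untouched.

    key_order is a hypothetical parameter: the order in which keys are
    visited.  The first list to mention a value keeps it.
    """
    if key_order is None:
        key_order = sorted(d)  # assumption: sorted order is acceptable
    seen = set(d)  # every key counts as already seen
    result = {}
    for key in key_order:
        kept = []
        for value in d[key]:
            if value not in seen:
                seen.add(value)  # also catches repeats within one list
                kept.append(value)
        result[key] = kept
    return result


d = {
    36: [35, 37, 26, 46],
    75: [74, 76, 65, 85],
    21: [20, 22, 11, 31],
    22: [21, 23, 12, 32],
    26: [25, 27, 16, 36],
    30: [20, 31, 40],
}

new_d = remove_duplicates_copy(d, key_order=[36, 75, 21, 22, 26, 30])
```

Passing a different key_order changes who keeps 20 and 31: visit 30 before 21 and key 30 keeps both, leaving 21 with just [11].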