unique values of a Dictionary list (removing duplicate elements of a list)
Peter Otten
__peter__ at web.de
Fri May 21 09:46:55 EDT 2010
Chad Kellerman wrote:
> On Fri, May 21, 2010 at 8:07 AM, Chad Kellerman <sunckell at gmail.com>
> wrote:
>
>>
>>
>> On Fri, May 21, 2010 at 7:50 AM, Peter Otten <__peter__ at web.de> wrote:
>>
>>> Chad Kellerman wrote:
>>>
>>> > Python users,
>>> > I am parsing an AIX trace file and creating a dictionary containing
>>> > keys (PIDs) and values (a list of TIDs), with PIDs being unique
>>> > process ids and TIDs being a list of thread ids. My function
>>> > populates the keys so that they are unique, but my list contains
>>> > duplicates.
>>> >
>>> > Can someone point me in the right direction so that my dictionary
>>> > value does not contain duplicate elements?
>>> >
>>> >
>>> > here is what I got.
>>> >
>>> > --------------<portion of code that is relevant>------------------
>>> >
>>> > pidtids = {}
>>> >
>>> > # --- function to add pid and tid to a dictionary
>>> > def addpidtids(pid,tid):
>>> >     pidtids.setdefault(pid,[]).append(tid)
>>>
>>> Use a set instead of a list (and maybe a defaultdict):
>>>
>>> from collections import defaultdict
>>>
>>> pidtids = defaultdict(set)
>>>
>>> def addpidtids(pid, tid):
>>>     pidtids[pid].add(tid)
>>>
>>> Peter
>>>
>>
>> Thanks. I guess I should have posted this in my original question.
>>
>> I'm on 2.4.3; it looks like defaultdict is new in 2.5.
>>
>> I'll see if I can upgrade.
>>
>> Thanks again.
>>
>
>
> Instead of upgrading (it would probably be faster to use techniques
> available in 2.4.3):
>
> Couldn't I check to see if the pid exists (has_key, I believe) and then
> check if the tid is a value in the list for that key, prior to
> passing it to the function?
>
> Or would that be too 'expensive'?
No.

    pidtids = {}

    def addpidtids(pid, tid):
        if pid in pidtids:
            pidtids[pid].add(tid)
        else:
            pidtids[pid] = set((tid,))

should be faster than

    def addpidtids(pid, tid):
        pidtids.setdefault(pid, set()).add(tid)

and both should work in Python 2.4.
Peter
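
For readers who want to try the two variants side by side, here is a minimal sketch. It assumes the dictionary is passed in as an argument rather than kept as a module-level global, and the function names and the sample (pid, tid) pairs are made up for illustration; they are not from the original trace file. The likely reason for the speed difference is that setdefault() has to build a fresh empty set on every call, even when the pid is already present.

```python
def add_with_membership_test(pidtids, pid, tid):
    # Only build a new set the first time a pid is seen.
    if pid in pidtids:
        pidtids[pid].add(tid)
    else:
        pidtids[pid] = set((tid,))


def add_with_setdefault(pidtids, pid, tid):
    # set() is evaluated on every call, even when pid already exists.
    pidtids.setdefault(pid, set()).add(tid)


def build(add, pairs):
    # Feed every (pid, tid) pair through the chosen add function.
    pidtids = {}
    for pid, tid in pairs:
        add(pidtids, pid, tid)
    return pidtids


# Hypothetical pairs with repeats, standing in for parsed trace lines.
SAMPLE = [(100, 1), (100, 2), (100, 1), (200, 7), (200, 7), (100, 2)]

by_test = build(add_with_membership_test, SAMPLE)
by_setdefault = build(add_with_setdefault, SAMPLE)

# Both variants produce the same mapping, with duplicate tids dropped.
assert by_test == by_setdefault
print(sorted(by_test[100]))  # [1, 2]
print(sorted(by_test[200]))  # [7]
```

Note that a set does not remember the order in which the tids were added. If that order matters, a list with an explicit "if tid not in ..." check would be needed instead, at the cost of a linear scan on each insert.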