[Tutor] Equivalent of Set in PtO

Peter Otten __peter__ at web.de
Tue Apr 26 10:10:42 CEST 2011


Becky Mcquilling wrote:

> I have a code snippet that I have used to count the duplicates in a list
> as such:
> 
> from sets import Set
> 
> def countDups(duplicateList):
>   uniqueSet = Set(item for item in duplicateList)
>   return[(item, duplicateList.count(item)) for item in uniqueSet]
> 
> 
> lst = ['word', 'word', 'new', 'new', 'new']
> print countDups(lst)
> 
> The result is: [('new', 3), ('word', 2)], which is what is expected.  This
> was using python version 2.7.  I want to do the same thing in Python 3.1,
> but I'm not sure what has replaced Set in the newer version, can someone
> give me an idea here?

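The sets module is gone in Python 3; the built-in set type (available since 
Python 2.4) takes its place, so set(duplicateList) works without any import.

Note that your countDups() function has to iterate over duplicateList 
len(set(duplicateList))+1 times: once to build the set, and then once more, 
implicitly, in the count() method for every item in the set. If you use a 
dictionary instead, you can find the word frequencies in a single pass: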

>>> lst = ['word', 'word', 'new', 'new', 'new']
>>> freq = {}
>>> for item in lst:
...     freq[item] = freq.get(item, 0) + 1
...
>>> freq.items()
dict_items([('new', 3), ('word', 2)])
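
To put the two approaches side by side, here is a minimal sketch for 
Python 3. The function names count_dups_with_set and count_dups_single_pass 
are made up for this illustration; the first is the original countDups() 
ported to the built-in set type, the second wraps the dictionary loop shown 
above. Neither version guarantees the order of the pairs, so the results 
are sorted before printing.

```python
def count_dups_with_set(items):
    # The built-in set replaces sets.Set. This still calls count()
    # once per unique item, so the list is scanned repeatedly.
    unique = set(items)
    return [(item, items.count(item)) for item in unique]


def count_dups_single_pass(items):
    # One pass over the list: bump a per-item counter in a dict.
    freq = {}
    for item in items:
        freq[item] = freq.get(item, 0) + 1
    return list(freq.items())


lst = ['word', 'word', 'new', 'new', 'new']
# Sort so the comparison does not depend on set or dict ordering.
print(sorted(count_dups_with_set(lst)))
print(sorted(count_dups_single_pass(lst)))
```

Both calls should print the same pairs; the difference only shows up in 
running time once the list has many distinct items.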

There is also a ready-to-use class, collections.Counter (new in Python 2.7 
and 3.1), that implements this efficient approach:

>>> from collections import Counter
>>> Counter(lst).items()
dict_items([('new', 3), ('word', 2)])
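
If you want the result as a plain list again, as countDups() returned it, 
a small wrapper around Counter might look like the sketch below. The name 
count_dups is only an assumption for this example; most_common() is a 
Counter method that returns the (item, count) pairs sorted by descending 
count.

```python
from collections import Counter


def count_dups(items):
    """Return (item, count) pairs, most frequent item first."""
    return Counter(items).most_common()


lst = ['word', 'word', 'new', 'new', 'new']
print(count_dups(lst))  # [('new', 3), ('word', 2)]
```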



