Find duplicates in a list and count them ...
Albert Hopkins
marduk at letterboxes.org
Thu Mar 26 15:54:22 EDT 2009
On Thu, 2009-03-26 at 12:22 -0700, Paul.Scipione at aps.com wrote:
> Hello,
>
> I'm a newbie to Python. I have a list which contains integers (about
> 80,000). I want to find a quick way to get the numbers that occur in
> the list more than once, and how many times that number is duplicated
> in the list. I've done this right now by looping through the list,
> getting a number, querying the list to find out how many times the
> number exists, then writing it to a new list. On this many records it
> takes a couple of minutes. What I am looking for is something in
> python that can grab this info without looping through a list.
>
Why not build a histogram?
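$ cat test.py
from random import randint

# Build 80,000 random integers; randint(0, 10) includes both endpoints.
l = list()
for i in xrange(80000):
    l.append(randint(0, 10))

# Count occurrences of each value in a single pass over the list.
hist = dict()
for i in l:
    hist[i] = hist.get(i, 0) + 1

# Print the counts for 0 through 9 (range(10) leaves out the value 10).
for i in range(10):
    print "%s: %s" % (i, hist.get(i, 0))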
$ time python test.py
0: 7275
1: 7339
2: 7303
3: 7348
4: 7206
5: 7323
6: 7230
7: 7348
8: 7166
9: 7180
real 0m0.533s
user 0m0.518s
sys 0m0.011s
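
The same single-pass counting idea can be narrowed to what the question actually asks for: only the values that occur more than once, with their counts. What follows is a minimal sketch, not a definitive implementation. It assumes a Python version that ships collections.Counter (2.7 or 3.1 and later, so newer than this post), and the function name find_duplicates and the sample data are made up for illustration.

```python
from collections import Counter


def find_duplicates(numbers):
    """Return {value: count} for every value occurring more than once."""
    # Counter tallies every element in one pass over the list.
    counts = Counter(numbers)
    # Keep only the values that were seen at least twice.
    return {value: n for value, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample data, standing in for the 80,000-item list.
    sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
    for value, n in sorted(find_duplicates(sample).items()):
        print("%s: %s" % (value, n))
```

This still loops over the list once, but each update is a dictionary operation rather than a rescan of the whole list, which is presumably what makes the approach described in the question slow.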