# Counter for items in lists in lists?

Alex Martelli aleaxit at yahoo.com
Sat Sep 25 11:11:52 CEST 2004

```Charlotte Henkle <charlotte at fgm.com> wrote:

> Hello;
>
> I'm pondering how to count the number of times an item appears in
> total in a nested list.  For example:
>
> myList=[[a,b,c,d],[a,f,g,h],[a,b,x,y]]
>
> I'd like to know that a appeared three times, and b appeared twice,
> and the rest appeard only once.

Two problems enmeshed, here: how to count items of various values if you
have a way to iterate over the values; and how to iterate on a nested
list.  You may code them together, but factoring them as two reusable
functions is preferable -- you'll end up with more reusable building
blocks than by coding them as one monolithic function.  Python is good
at allowing such factorings.  There may be a third sub-problem, about
how best to _present_ the counts -- hard to say from your statement
about the problem, whether that's a problem, but, see later.

You're not telling us what is the type of the items.  If they're strings
or numbers, basically anything that can be a key to a dictionary, the
counting problem is best solved by building a dictionary:

def itemcount(sequence):
d = {}
for item in sequence: d[item] = 1 + d.get(item, 0)
return d

this kind of function is known as an 'accumulator': given a sequence it
loops over it computing/building some resulting object and returns it.
Not a specific Python term, but a nice one to keep in mind.

You don't clarify the structure of your list -- is it always two levels
as you present it, could it be more, are all items always going to be at
the same level or could some level have a mixture of items and sublists?
Taking the simplest hypothesis -- always two levels and all items at the
second level, the first one being made up of sublists only:

def itemseq(nested_list):
for sublist in nested_list:
for item in sublist: yield item

this kind of function is known as a 'simple generator': it produces a
sequence you may loop on with a for statement, pass to an accumulator,
etc.  This is a specific Python term, and the keyword that lets you tell
whether a function is a generator is 'yield'.

So, given these two functions -- or enhanced equivalents if your list or
its items don't meet these hypotheses -- the dict of counts is easy to
obtain:

myCounts = itemcount(itemseq(myList))

Now for the presentation issue -- you say you want to receive the
information "that a appeared three times, and b appeared twice,
and the rest appeard only once".  This appears to mean that you want to
get a sort of items by count, with items appearing only once just
summarized as a group, not listed individually.  If I correctly divined

def summarize_counts(counts_dict):
number_of_singletons = 0
count_per_item_list = []
for item, count in counts_dict.iteritems():
if count == 1: number_of_singletons += 1
else: count_per_item_list.append((count, item))
count_per_item_list.sort()
for count, item in count_per_item_list:
print 'Item %r appeared %d times' % (item, count)
print '%d items appeared only once each' % number_of_singletons

summarize_counts(itemcount(itemseq(myList)))

should do exactly what you required.  However, mindreading is always an
iffy business (that's part of why the Zen of Python says "explicit is
better than implicit"...:-).  So, if you can clarify the various
ambiguous and uncertain details of your specs, which forced me to guess
(violating another Zen injunction: "in the face of ambiguity, resist the
temptation to guess"), over and over, at what exactly you meant, I'm
confident that the various pieces presented here can be enhanced to meet