determining which value is the first to appear five times in a list?

Wolodja Wentland wentland at
Sat Feb 6 20:09:01 CET 2010

On Sat, Feb 06, 2010 at 13:24 -0500, Chris Colbert wrote:
> [(match_val_291, identifier_b), (match_val_23, identifier_b), (match_val_22,
> identifier_k) .... ]
> Now, what I would like to do is step through this list and find the identifier
> which appears first a K number of times.

I think you can use itertools.groupby(L, lambda el: el[1]) to group the
elements of your *sorted* list L by the value el[1] (i.e. the
identifier). Note that groupby only groups *adjacent* elements that
share a key, so each group is one run of the same identifier. You can
then iterate through these groups until you find the first one that
contains the desired number of instances.

Let me exemplify this:

>>> from itertools import groupby
>>> instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'), (7, 'd')]
>>> k = 3
>>> grouped_by_identifier = groupby(instances, lambda el: el[1])
>>> grouped_by_identifier = ((identifier, list(group)) for identifier, group in grouped_by_identifier)
>>> k_instances = (group for identifier, group in grouped_by_identifier if len(group) >= k)
>>> next(k_instances)
[(4, 'c'), (5, 'c'), (6, 'c')]
>>> next(k_instances)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
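
The session above could be wrapped into a small helper. This is only a
sketch under a few assumptions: the name first_run is made up for
illustration, a run of *at least* k adjacent instances is treated as a
match, and None is returned via the default argument of next() rather
than letting StopIteration propagate.

```python
from itertools import groupby
from operator import itemgetter


def first_run(pairs, k):
    """Return the first run of at least k adjacent pairs sharing an identifier.

    Returns None when no such run exists.
    """
    # groupby only merges neighbouring items, so each group is one run.
    runs = (list(group) for _, group in groupby(pairs, key=itemgetter(1)))
    # The second argument to next() is returned if the generator is empty.
    return next((run for run in runs if len(run) >= k), None)


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'), (7, 'd')]
print(first_run(instances, 3))   # [(4, 'c'), (5, 'c'), (6, 'c')]
print(first_run(instances, 4))   # None
```

Because only adjacent items are grouped, an identifier whose instances
are scattered through the list will not be found this way.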

There are certainly millions of ways to do this and most of them will be
better than my proposal here, but you might like this approach. Another
approach would use itertools.takewhile() or itertools.ifilter() (the
built-in filter() in Python 3) ... Just have a look :-)
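
One of those other ways, if the instances of an identifier do not have
to be adjacent, is a plain counting loop. Note that this is a different
technique from groupby, takewhile or ifilter, and only a rough sketch:
the name first_to_reach and the sample data are invented for
illustration.

```python
from collections import defaultdict


def first_to_reach(pairs, k):
    """Return the first identifier whose total count reaches k, else None."""
    counts = defaultdict(int)
    for _value, identifier in pairs:
        counts[identifier] += 1
        # Stop as soon as any identifier has been seen k times.
        if counts[identifier] == k:
            return identifier
    return None


pairs = [(291, 'b'), (23, 'k'), (22, 'b'), (17, 'k'), (9, 'k')]
print(first_to_reach(pairs, 3))   # k
print(first_to_reach(pairs, 4))   # None
```

This walks the list once and stops early, so it does not depend on the
list being ordered by identifier.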

yours sincerely
  .''`.     Wolodja Wentland    <wentland at> 
 : :'  :    
 `. `'`     4096R/CAF14EFC 
   `-       081C B7CD FF04 2BA9 94EA  36B2 8B7F 7D30 CAF1 4EFC