determining which value is the first to appear five times in a list?

Wolodja Wentland wentland at cl.uni-heidelberg.de
Sat Feb 6 15:25:40 EST 2010


On Sat, Feb 06, 2010 at 14:42 -0500, Terry Reedy wrote:
> On 2/6/2010 2:09 PM, Wolodja Wentland wrote:

> >I think you can use the itertools.groupby(L, lambda el: el[1]) to group
> >elements in your *sorted* list L by the value el[1] (i.e. the
> >identifier) and then iterate through these groups until you find the
> >desired number of instances grouped by the same identifier.

> This will generally not return the same result. It depends on
> whether OP wants *any* item appearing at least 5 times or whether
> the order is significant and the OP literally wants the first.

itertools.groupby preserves the order of the input. Have a look:

>>> from itertools import groupby
>>> instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]
>>> grouped_by_identifier = groupby(instances, lambda el: el[1])
>>> grouped_by_identifier = ((identifier, list(group)) for identifier, group in grouped_by_identifier)
>>> k_instances = (group for identifier, group in grouped_by_identifier if len(group) == 2)
>>> for group in k_instances:
...     print(group)
... 
[(1, 'b'), (2, 'b')]
[(7, 'b'), (8, 'b')]

So the first element yielded by the k_instances generator will be the
first group of elements from the original list whose identifier appears
exactly k times in a row (k = 2 in the example above).
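
The same idea can be wrapped in a small helper. This is only a sketch:
the name first_run_of_length and its parameters are made up for
illustration and are not part of itertools.

```python
from itertools import groupby

def first_run_of_length(instances, k, key=lambda el: el[1]):
    """Return the first run of exactly k consecutive items sharing a key, or None."""
    for identifier, group in groupby(instances, key):
        run = list(group)  # materialize the run so its length can be checked
        if len(run) == k:
            return run
    return None

instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'),
             (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]
print(first_run_of_length(instances, 2))  # [(1, 'b'), (2, 'b')]
print(first_run_of_length(instances, 3))  # [(4, 'c'), (5, 'c'), (6, 'c')]
```

Returning from inside the loop stops consuming the input as soon as a
matching run is found, so the rest of the list is never grouped.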

> Sorting the entire list may also take a *lot* longer.

Than what?

Am I missing something? Is the "*sorted*" the culprit? If so, just
forget it, as it is not relevant.
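
For comparison, here is a rough sketch of how sorting can change the
answer, next to a single pass that counts occurrences without sorting.
The helper name first_to_reach is hypothetical, and the threshold 3 is
chosen only because the sample list is short.

```python
from collections import defaultdict
from itertools import groupby

def first_to_reach(instances, k, key=lambda el: el[1]):
    """Return the identifier whose k-th occurrence comes earliest, or None."""
    counts = defaultdict(int)
    for item in instances:
        identifier = key(item)
        counts[identifier] += 1
        if counts[identifier] == k:
            return identifier
    return None

instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'),
             (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]

# Single pass over the unsorted list: 'c' is the first to occur 3 times.
unsorted_answer = first_to_reach(instances, 3)

# Sort by identifier, then group: groups now arrive in identifier order,
# so the first group with at least 3 members is 'b'.
by_identifier = sorted(instances, key=lambda el: el[1])
sorted_answer = next(
    identifier
    for identifier, group in groupby(by_identifier, key=lambda el: el[1])
    if len(list(group)) >= 3
)

print(unsorted_answer, sorted_answer)  # c b
```

The single pass needs no sort and stops at the k-th occurrence, whereas
the sorted variant reports whichever qualifying identifier sorts first.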
-- 
  .''`.     Wolodja Wentland    <wentland at cl.uni-heidelberg.de> 
 : :'  :    
 `. `'`     4096R/CAF14EFC 
   `-       081C B7CD FF04 2BA9 94EA  36B2 8B7F 7D30 CAF1 4EFC