determining which value is the first to appear five times in a list?
Wolodja Wentland
wentland at cl.uni-heidelberg.de
Sat Feb 6 15:25:40 EST 2010
On Sat, Feb 06, 2010 at 14:42 -0500, Terry Reedy wrote:
> On 2/6/2010 2:09 PM, Wolodja Wentland wrote:
> >I think you can use the itertools.groupby(L, lambda el: el[1]) to group
> >elements in your *sorted* list L by the value el[1] (i.e. the
> >identifier) and then iterate through these groups until you find the
> >desired number of instances grouped by the same identifier.
> This will generally not return the same result. It depends on
> whether OP wants *any* item appearing at least 5 times or whether
> the order is significant and the OP literally wants the first.
itertools.groupby preserves order, since it only groups consecutive
elements that share a key. Have a look:
>>> from itertools import groupby
>>> instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]
>>> # Group consecutive elements by their identifier, el[1]
>>> grouped_by_identifier = groupby(instances, lambda el: el[1])
>>> # Materialise each group so that len() can be applied to it
>>> grouped_by_identifier = ((identifier, list(group)) for identifier, group in grouped_by_identifier)
>>> # Keep only the groups of exactly k elements (k = 2 here)
>>> k_instances = (group for identifier, group in grouped_by_identifier if len(group) == 2)
>>> for group in k_instances:
...     print(group)
...
[(1, 'b'), (2, 'b')]
[(7, 'b'), (8, 'b')]
So the first element yielded by the k_instances generator is the first
group of elements from the original list whose identifier appears
exactly k times in a row (k = 2 in the session above).
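
The same idea can be wrapped in a small helper. This is only a sketch:
first_run and its parameter k are made-up names (nothing in itertools is
called that), and it assumes every element is an (index, identifier) pair
as in the list above.

```python
from itertools import groupby
from operator import itemgetter


def first_run(instances, k):
    """Return the first run of exactly k consecutive elements that
    share an identifier, or None if there is no such run.

    Hypothetical helper; assumes (index, identifier) pairs.
    """
    for identifier, group in groupby(instances, key=itemgetter(1)):
        run = list(group)  # groupby yields lazy iterators, so materialise
        if len(run) == k:
            return run
    return None


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'),
             (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]
print(first_run(instances, 2))
print(first_run(instances, 3))
```

Because the loop returns on the first match, the rest of the list is never
grouped. Note that this matches runs of *exactly* k elements; use
len(run) >= k if longer runs should count as well.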
> Sorting the entire list may also take a *lot* longer.
Than what?
Am I missing something? Is the "*sorted*" the culprit? If so, just
forget it, as it is not relevant.
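
If the order of the original list matters and the goal is the first
identifier to reach k occurrences in total (not necessarily in a row), the
list need not be sorted at all. A rough single-pass sketch, under those
assumptions: first_to_reach is a made-up name, and the (index, identifier)
layout is taken from the example list.

```python
from collections import defaultdict


def first_to_reach(instances, k):
    """Return the first identifier whose running count reaches k,
    scanning the list once in its original order, or None.

    Hypothetical helper; assumes (index, identifier) pairs.
    """
    counts = defaultdict(int)
    for index, identifier in instances:
        counts[identifier] += 1
        if counts[identifier] == k:
            return identifier
    return None


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'),
             (5, 'c'), (6, 'c'), (7, 'b'), (8, 'b')]
print(first_to_reach(instances, 3))
```

This stops as soon as one identifier has been seen k times, so it reads at
most the whole list once and never reorders it.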
--
.''`. Wolodja Wentland <wentland at cl.uni-heidelberg.de>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC