finding items that occur more than once in a list

John Machin sjmachin at lexicon.net
Wed Mar 19 19:06:46 EDT 2008


On Mar 20, 9:57 am, Justin Bozonier <darkxant... at gmail.com> wrote:
> On Mar 19, 2:48 pm, John Machin <sjmac... at lexicon.net> wrote:
>
>
>
> > On Mar 19, 10:08 am, sturlamolden <sturlamol... at yahoo.no> wrote:
>
> > > On 18 Mar, 23:45, Arnaud Delobelle <arno... at googlemail.com> wrote:
>
> > > > > def nonunique(lst):
> > > > >    slst = sorted(lst)
> > > > >    dups = [s[0] for s in
> > > > >         filter(lambda t : t[0] == t[1], zip(slst[:-1],slst[1:]))]
> > > > >    return [dups[0]] + [s[1] for s in
> > > > >         filter(lambda t : t[0] != t[1], zip(dups[:-1],dups[1:]))]
>
> > > > Argh!  What's wrong with something like:
>
> > > > def duplicates(l):
> > > >     i = j = object()
> > > >     for k in sorted(l):
> > > >         if i != j == k: yield k
> > > >         i, j = j, k
>
> > > Nice, and more readable. But I'd use Paul Robin's solution. It is O(N)
> > > as opposed to ours which are O(N log N).
>
> > I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's,
> > and is IMHO more readable than Paul's.
>
> It's not as much O(N)... Paul Robin's uses a sort first which is
> definitely not O(N). Paul's could be prettied up a bit but the general
> principle is sound.

"""
from collections import defaultdict
def nonunique(lst):
   d = defaultdict(int)
   for x in lst: d[x] += 1
   return [x for x,n in d.iterkeys() if n > 1]
"""

I see no sort here.



More information about the Python-list mailing list