unique-ifying a list
Jonathan Gardner
jgardner at jonathangardner.net
Fri Aug 7 17:41:07 EDT 2009
On Aug 7, 1:53 pm, kj <no.em... at please.post> wrote:
> Suppose that x is some list. To produce a version of the list with
> duplicate elements removed one could, I suppose, do this:
>
> x = list(set(x))
>
> but I expect that this will not preserve the original order of
> elements.
>
> I suppose that I could write something like
>
> def uniquify(items):
>     seen = set()
>     ret = []
>     for i in items:
>         if not i in seen:
>             ret.append(i)
>             seen.add(i)
>     return ret
>
> But this seems to me like such a commonly needed operation that I
> find it hard to believe one would need to resort to such self-rolled
> solutions. Isn't there some more standard (and hopefully more
> efficient, as in "C-coded"/built-in) approach?
>
Honestly, uniquifying a list is pretty rare at the application
level. Unless you're writing some kind of database, I don't see why
you'd need to do it. (My recommendation is not to write databases.)

If you want unique elements, use a set. If you want them in sorted
order, sort a list of the items in the set.
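A minimal sketch of the set-then-sort approach, assuming the items are
hashable and can be compared with each other. The sample list is made up
for illustration. Note that the result is in sorted order, not in the
order of the original list.

```python
items = [3, 1, 3, 2, 1]  # hypothetical sample data

# A set keeps one copy of each element but has no defined order.
unique_items = set(items)

# sorted() accepts any iterable and returns a new, sorted list.
unique_sorted = sorted(unique_items)

print(unique_sorted)  # [1, 2, 3]
```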
If you want to preserve the original order, then using a dict may be
even better:

orig_order = dict(reversed([reversed(i) for i in enumerate(items)]))
unique_ordered = sorted(orig_order.keys(), key=lambda k: orig_order[k])

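Hints to understanding:
* enumerate generates (index, item) pairs.
* We reverse each pair so that we get an item -> index mapping.
* We reverse the list of pairs so that the first ones appear last.
  Later pairs override earlier ones in dict(), so each item ends up
  mapped to the index of its first occurrence.
* Sorting the keys by that index puts the unique items back in their
  original order.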
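
The hints above can be sketched as a small function with each step
spelled out. This is only an illustration, assuming the items are
hashable; the name unique_in_order and the sample list are hypothetical
and not part of the original post.

```python
def unique_in_order(items):
    """Return the distinct items, ordered by first appearance."""
    # (index, item) pairs, walked from the last pair to the first.
    pairs = reversed(list(enumerate(items)))
    # Flip each pair to (item, index). The first occurrence of an item
    # is processed last, so its index is the one the dict keeps.
    first_index = dict((item, index) for index, item in pairs)
    # Order the distinct items by the index of their first occurrence.
    return sorted(first_index, key=first_index.get)


print(unique_in_order(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```

On Python 3.7 and later, where dicts are guaranteed to keep insertion
order, list(dict.fromkeys(items)) should give the same result without
the sort.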