[Tutor] Filtering out unique list elements
Steven D'Aprano
steve at pearwood.info
Wed May 4 12:12:44 CEST 2011
Spyros Charonis wrote:
> Dear All,
>
> I have built a list with multiple occurrences of a string after some text
> processing that goes something like this:
>
> [cat, dog, cat, cat, cat, dog, dog, tree, tree, tree, bird, bird, woods,
> woods]
>
> I am wondering how to truncate this list so that I only print out the unique
> elements, i.e. the same list but with one occurrence per element:
>
> [cat, dog, tree, bird, woods]
Others have already mentioned set(), but unless I missed something,
nobody pointed out that sets are unordered, and so will lose whatever
order was in the list:
>>> # words = [cat, dog, cat, cat, cat etc...]
>>> set(words)
set(['bird', 'woods', 'tree', 'dog', 'cat'])
They also didn't mention that sets require the items to be hashable:
>>> set(['bird', {}, 'cow'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: dict objects are unhashable
If neither of those limitations matters to you, then sets will be the
fastest and easiest solution.
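If order does matter, the usual trick (a sketch, not from the original
post) is to keep a set of elements seen so far, but build the output as
a list, so membership tests stay fast while the original order
survives. This still requires hashable elements:

```python
def unique_ordered(items):
    """Return the unique elements of items, in first-seen order.

    Uses a set for O(1) membership tests, but appends to a list
    so the original ordering is preserved.  Elements must be hashable.
    """
    seen = set()
    result = []
    for element in items:
        if element not in seen:
            seen.add(element)
            result.append(element)
    return result

words = ['cat', 'dog', 'cat', 'cat', 'cat', 'dog', 'dog',
         'tree', 'tree', 'tree', 'bird', 'bird', 'woods', 'woods']
print(unique_ordered(words))  # ['cat', 'dog', 'tree', 'bird', 'woods']
```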
Alternatively, if you only have a few elements:
unique = []
for element in items:
    if element not in unique:
        unique.append(element)
However this will be SLOW if you have many items, because each
`not in` test scans the entire unique list, making the whole loop
quadratic in the number of items.
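As a present-day aside (not part of the original post): in Python 3.7
and later, dicts preserve insertion order, so dict.fromkeys gives a
one-liner that deduplicates while keeping first-seen order, again for
hashable elements only:

```python
# dict.fromkeys builds a dict whose keys are the unique elements,
# in first-seen order (guaranteed in Python 3.7+); listing the keys
# gives the deduplicated sequence.
words = ['cat', 'dog', 'cat', 'cat', 'cat', 'dog', 'dog',
         'tree', 'tree', 'tree', 'bird', 'bird', 'woods', 'woods']
unique = list(dict.fromkeys(words))
print(unique)  # ['cat', 'dog', 'tree', 'bird', 'woods']
```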
Here are some more recipes:
http://code.activestate.com/recipes/52560-remove-duplicates-from-a-sequence/
--
Steven