[Tutor] List Dup-Elim Method?
Mon, 05 Feb 2001 13:15:23 -0600
Thanks! This helps a lot. I've used the dictionary method before
(with great success), but the when you have lists within lists
(sometimes within lists) it takes more time to set it up the dictionary
stuff than it would to do something like this. Thanks again!
>>> "Tim Peters" <email@example.com> 02/03/2001 3:43:14 AM >>>
If the elements of a list are hashable, the method using a dict will be
far the fastest if the list is large.
If the elements are not hashable (for example, a list of lists), but
sorted, still much quicker than brute force (yet still much slower
using a temporary dict!) is this:
"""Remove all duplicates from list x, in place.
The elements of x must enjoy a total ordering.
n = len(x)
if n > 1:
last = x
i = avail = 1
while i < n:
if x[i] != last:
x[avail] = last = x[i]
avail += 1
i += 1
That requires Python 2.0, because of the "+=" thingies.
>>> x = [[1, 2], [3, 4]]
>>> x *= 10
[[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4],
[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4],
[1, 2], [3, 4], [1, 2], [3, 4]]
[[1, 2], [3, 4]]
How does it work? Sorting the list brings all the equal elements next
each other. (Note: Many sorting routines don't work very quickly
there are lots of equal elements, but Python's does. I know that
wrote Python's list.sort() <wink>.)
After all the equal elements are adjacent, it just marches across the
once, keeping track of when the list element at index i *changes*.
does, it moves that not-seen-before element toward the front of the
and moves on skipping over the chunk of elements equal to *it*.
gets rid of the list positions no longer needed (the "del" stmt).
Of course there's nothing to do if the list doesn't have at least two
elements to start, so it checks for that first. It *has* to avoid
into the main body if the list is empty, because then "last = x"
blow up. Since it has to check for that anyway, it doesn't cost any
make sure there are at least two things in the list.
This isn't an easy algorithm -- the steps are delicate. If you can use
dict method instead, do so!
what-you-can-assume-constrains-what-you-can-do-ly y'rs - tim
Tutor maillist - Tutor@python.org
Confidentiality Notice: This e-mail transmission
may contain confidential or legally privileged
information that is intended only for the individual
or entity named in the e-mail address. If you are not
the intended recipient, you are hereby notified that
any disclosure, copying, distribution, or reliance
upon the contents of this e-mail is strictly prohibited.
If you have received this e-mail transmission in error,
please reply to the sender, so that we can arrange
for proper delivery, and then please delete the message
from your inbox. Thank you.