# Algorithm help per favore

Steven Taschuk staschuk at telusplanet.net
Wed Jun 18 18:41:23 CEST 2003

```Quoth Larry:
> I need to take a list (probably 20k to 40k elements) of numbers and
> remove consecutive duplicates. Non consecutive duplicates are ok.
>
> Example: [6,3,3,3,5,7,6,3,4,4,3] => [6,3,5,7,6,3,4,3]
>
> The 3 and 6 can appear more than once in the result set because
> they're separated by another value. Obviously this is trivial to
> accomplish by walking thru the list and building a new one (or yanking
> elements out of the existing one) but I'm curious if anyone knows of a
> more clever way, with speed being a goal more than memory usage.

If speed is the goal, removing elements from the existing list is
not indicated; that would take O(n**2) time, since each removal
has to shift all subsequent elements down one spot.

Building a new list seems not only like the obvious approach, but
probably the fastest.  I think it quite unlikely that there's any
faster clever algorithm.

def unique(iterable):
result = []
append = result.append       # avoid attribute lookup in loop
next = iter(iterable).next   # ditto
try:
previous = next()
append(previous)
while True:
value = next()
if value != previous:
append(value)
previous = value
except StopIteration:
pass
return result

(If latency is important, this could be changed into a generator
easily; that would also decrease memory usage.  I'm not sure what
it would do to throughput -- try it and see.)

I don't see any way to use built-ins or whatnot to move the loop
into C.  Of course, you could recode the whole thing in C.

--
Steven Taschuk             "[W]e must be very careful when we give advice
staschuk at telusplanet.net    to younger people: sometimes they follow it!"
-- "The Humble Programmer", Edsger Dijkstra

```