# removing consecutive duplicates from list

Wed Apr 21 05:56:24 EDT 2021

On 21/04/2021 19.39, Peter Otten wrote:
On 21/04/2021 01:10, dn via Tutor wrote:
On 21/04/2021 05.27, Peter Otten wrote:
On 20/04/2021 18:48, Manprit Singh wrote:
Consider the list given below:
lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
I need to remove consecutive duplicates from the list lst.
The answer must be:

[2, 3, 4, 5, 3, 7, 9, 4]

The code that I have written to solve it is below:
lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
ls = lst[1:] + [object()]
[x for x, y in zip(lst, ls) if x != y]

The list comprehension gives the desired result. I just need to know if
this program can be done in a more readable and less complex way.

The itertools module has many tools that you can use to deal with
problems like the above:

...

>>> from itertools import groupby
>>> wanted == [k for k, g in groupby(lst)]
True
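
For readers unfamiliar with groupby: it yields one (key, group) pair per run of equal adjacent items, so keeping only the keys collapses each run to a single element. A minimal sketch, assuming that `wanted` in the session above holds the expected result from the original question (its definition was elided from the quote):

```python
from itertools import groupby

lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
# Assumption: the expected result given in the original question.
wanted = [2, 3, 4, 5, 3, 7, 9, 4]

# Each run of equal neighbours becomes one (key, group) pair;
# here the group is reduced to its length to make the runs visible.
runs = [(key, len(list(group))) for key, group in groupby(lst)]
print(runs)

# Keeping only the keys drops the consecutive duplicates.
deduped = [key for key, _ in groupby(lst)]
print(deduped == wanted)
```

Because groupby only compares neighbours, the second 3 and the final 4 survive: they are repeats, but not consecutive ones.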

itertools library; am imagining yellow, if not red, flags.

Taking the term "less complex", let's ask: how many programmers
(including yourself in six months' time) will be able to scan any of
these and conclude/deduce that their objective is the removal of
consecutive duplicates?

At the very least, please enclose your 'brilliance' within a function,
which will give it a (better than a comment) label/name, eg:

def remove_consecutive_duplicates( source ):
    return [k for k, g in groupby(source)]

Ultimately, I favor @Alan's "explicit form" approach (shhhh! Don't tell
him, else he'll fall off his chair...).

Here's me dusting-off an old solution to this problem, which we used to
give ComSc students - updated to (I think) v3.9+:

from typing import Iterable

lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
result = [2, 3, 4, 5, 3, 7, 9, 4]

def remove_consecutive_duplicates( source:Iterable )->list:
    """Clean input iterable (containing any data-types),
    removing consecutive duplicates.
    """
    cleaned = list()
    current_value = None
    for this_element in source:
        if this_element != current_value:
            cleaned.append( this_element )
            current_value = this_element
    return cleaned

print( "Correct?", result == remove_consecutive_duplicates( lst ) )

Plus:

remove_consecutive_duplicates( "abba" ) == ['a', 'b', 'a']

I am going to suggest you change that into a generator, like

def remove_consecutive_duplicates(items):
    """Remove consecutive duplicates.

    >>> list(remove_consecutive_duplicates(""))
    []
    >>> list(remove_consecutive_duplicates("abba"))
    ['a', 'b', 'a']
    >>> list(remove_consecutive_duplicates(
    ...     [1, 1.0, "x", "x", "x", "y", 2.0, 2])
    ... )
    [1, 'x', 'y', 2.0]
    >>> list(remove_consecutive_duplicates(iter("aabbbc")))
    ['a', 'b', 'c']
    >>> list(remove_consecutive_duplicates([None]*3))
    [None]
    """
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item

Like it!
(not sure why we're beating-up Chris though)

...except that the OP seemed to want a list as the result.

That said, I've adopted a policy of converting older utility-functions
to generators.
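
One way to reconcile the two, sketched here as an assumption rather than anything posted in the thread: keep the generator as the core and add a thin wrapper for callers who want a list. The names iter_without_consecutive_duplicates and remove_consecutive_duplicates_as_list are hypothetical. Taking the first element via next(), as the generator above does, also avoids relying on None as a starting sentinel, so a leading None is kept.

```python
def iter_without_consecutive_duplicates(items):
    """Yield items, skipping any that equal their predecessor."""
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item


def remove_consecutive_duplicates_as_list(items):
    """Hypothetical wrapper: same logic, but returns a list."""
    return list(iter_without_consecutive_duplicates(items))


print(remove_consecutive_duplicates_as_list([2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]))
print(remove_consecutive_duplicates_as_list([None, None, 1]))
```

Callers that only loop over the result can use the generator directly and never build the intermediate list.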

Thus, new extension to the question:
Received-wisdom says that the generator will require less storage
(although with the external/surrounding list() I'm wondering if that is
true in-fact in this (rather artificial) case); but which approach
executes faster?
a/ appended list,
b/ generator - with the surrounding list(),
c/ generator - without the surrounding list().
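
The speed question is probably best settled by measuring on one's own machine. Below is a minimal timeit sketch of the three cases. The function names, the test data (20,000 small random integers) and the repeat count are all assumptions chosen for illustration, and the absolute numbers will vary with Python version and hardware, so none are quoted here.

```python
import random
import timeit
from collections import deque


def dedupe_to_list(source):
    """a/ build the result by appending to a list."""
    cleaned = []
    current = object()  # unique sentinel: never equal to a real element
    for element in source:
        if element != current:
            cleaned.append(element)
            current = element
    return cleaned


def dedupe_generator(source):
    """b/ and c/ yield the surviving elements lazily."""
    it = iter(source)
    try:
        prev = next(it)
    except StopIteration:
        return
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item


def consume(iterable):
    """Exhaust an iterable without storing its items."""
    deque(iterable, maxlen=0)


# Assumed test data: values from 0 to 3, so runs of duplicates are common.
random.seed(0)
data = [random.randint(0, 3) for _ in range(20_000)]

timings = {
    "a/ appended list": timeit.timeit(
        lambda: dedupe_to_list(data), number=10),
    "b/ generator with list()": timeit.timeit(
        lambda: list(dedupe_generator(data)), number=10),
    "c/ generator, consumed only": timeit.timeit(
        lambda: consume(dedupe_generator(data)), number=10),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```

Case c/ still has to be consumed by something, otherwise the generator does no work at all; the sketch uses collections.deque with maxlen=0, which discards items as they arrive.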