[Tutor] removing consecutive duplicates from list

Wed Apr 21 05:56:24 EDT 2021

On 21/04/2021 19.39, Peter Otten wrote:
> On 21/04/2021 01:10, dn via Tutor wrote:
>> On 21/04/2021 05.27, Peter Otten wrote:
>>> On 20/04/2021 18:48, Manprit Singh wrote:
>>>
>>>> Consider a list given below:
>>>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>>>> i need to remove consecutive duplicates from list lst:
>>>> the answer must be :
>>>>
>>>> [2, 3, 4, 5, 3, 7, 9, 4]
>>>>
>>>> The code that i have written to solve it, is written below:
>>>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>>>> ls = lst[1:]+[object()]
>>>> [x for x, y in zip(lst, ls) if x != y]
>>>>
>>>> The list comprehension gives the desired result. just need to know if this
>>>> program can be done in a more readable and less complex way.
>>
>>
>>> The itertools module has many tools that you can use to deal with
>>> problems like the above:
>> ...
>>
>>>>>> from itertools import groupby
>>>>>> wanted == [k for k, g in groupby(lst)]
>>> True
>>>
>>> Pick your favourite ;)
>>
>>
>> Whilst, admiring your (@Manprit) one-liner and @Peter's grasp of the
>> itertools library; am imagining yellow, if not red, flags.
>>
>> Taking the term "less complex", let's ask: how many programmers
>> (including yourself in six months' time) will be able to scan any of
>> these and conclude/deduce that their objective is the removal of
>> consecutive duplicates?
>>
>> At the very least, please enclose your 'brilliance' within a function,
>> which will give it a (better than a comment) label/name, eg:
>>
>> def remove_consecutive_duplicates( source ):
>>      return [k for k, g in groupby(source)]
>>
>>
>> Ultimately, I favor @Alan's "explicit form" approach (shhhh! Don't tell
>> him, else he'll fall off his chair...).
>>
>> Here's me dusting-off an old solution to this problem, which we used to
>> give ComSc students - updated to (I think) v3.9+:
>>
>>
>>
>> from typing import Iterable
>>
>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>> result = [2, 3, 4, 5, 3, 7, 9, 4]
>>
>> def remove_consecutive_duplicates( source:Iterable )->list:
>>      """Clean input iterable (containing any data-types),
>>         removing consecutive duplicates.
>>      """
>>      cleaned = list()
>>      current_value = object()  # sentinel; None would swallow a leading None
>>      for this_element in source:
>>          if this_element != current_value:
>>              cleaned.append( this_element )
>>              current_value = this_element
>>      return cleaned
>>
>> print( "Correct?", result == remove_consecutive_duplicates( lst ) )
>>
>>
>> Plus:
>>
>> remove_consecutive_duplicates( "abba" ) == ['a', 'b', 'a']
> 
> I am going to suggest you change that into a generator, like
> 
> def remove_consecutive_duplicates(items):
>    """Remove consecutive duplicates.
> 
>    >>> list(remove_consecutive_duplicates(""))
>    []
>    >>> list(remove_consecutive_duplicates("abba"))
>    ['a', 'b', 'a']
>    >>> list(remove_consecutive_duplicates(
>    ... [1, 1.0, "x", "x", "x", "y", 2.0, 2])
>    ... )
>    [1, 'x', 'y', 2.0]
>    >>> list(remove_consecutive_duplicates(iter("aabbbc")))
>    ['a', 'b', 'c']
>    >>> list(remove_consecutive_duplicates([None]*3))
>    [None]
>    """
>    it = iter(items)
>    try:
>        prev = next(it)
>    except StopIteration:  # I hate you, Chris A. ;)
>        return
>    yield prev
>    for item in it:
>        if prev != item:
>            yield item
>            prev = item

Like it!
(not sure why we're beating-up Chris though)

...except that the OP seemed to want a list as the result.

That said, I've adopted a policy of converting older utility-functions
to generators.
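
One thing the generator form buys you, as a rough sketch: it can be fed
input that never ends, where a list-building version would never return.
The function is Peter's from above; `endless` and `first_five` are just
names made up for the illustration.

```python
from itertools import cycle, islice


def remove_consecutive_duplicates(items):
    """Yield items, skipping any that equal the one just before them."""
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item


# cycle() repeats "aabbbc" forever, so a list-returning version would hang.
endless = cycle("aabbbc")

# islice() stops pulling from the generator after five results.
first_five = list(islice(remove_consecutive_duplicates(endless), 5))
print(first_five)  # ['a', 'b', 'c', 'a', 'b']
```

If the caller really wants a list (as the OP did), list() around the
call gets it back, provided the input is finite.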

Thus, new extension to the question:
Received-wisdom says that the generator will require less storage
(although with the external/surrounding list() I'm wondering if that is
true in-fact in this (rather artificial) case); but which approach
executes faster?
a/ appended list,
b/ generator - with  the surrounding list(),
c/ generator - without the surrounding list().
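
A minimal sketch of how the three cases might be timed with timeit. The
names `dedupe_list`, `dedupe_gen`, `data` and `timings` are invented
here, and the test data (random small integers) is an assumption, so
real workloads may well behave differently. Note that for c/ merely
creating the generator does no work at all, so this sketch drains it
into a zero-length deque to run the loop without storing the results.

```python
import random
import timeit
from collections import deque


def dedupe_list(source):
    """a/ build and return a list; the sentinel keeps a leading None."""
    cleaned = []
    previous = object()  # compares unequal to every element
    for element in source:
        if element != previous:
            cleaned.append(element)
            previous = element
    return cleaned


def dedupe_gen(items):
    """b/ and c/ the generator version."""
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item


random.seed(0)  # fixed seed so repeated runs see the same input
data = [random.randrange(5) for _ in range(20_000)]

timings = {
    "a/ appended list": timeit.timeit(
        lambda: dedupe_list(data), number=10),
    "b/ generator + list()": timeit.timeit(
        lambda: list(dedupe_gen(data)), number=10),
    "c/ generator, drained": timeit.timeit(
        lambda: deque(dedupe_gen(data), maxlen=0), number=10),
}

for label, seconds in timings.items():
    print(f"{label:24} {seconds:.4f}s")
```

The figures depend on the machine, the Python version, and how many
duplicates the data holds, so treat any single run as indicative only.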
-- 
Regards,
=dn
