[Tutor] removing consecutive duplicates from list
dn
PyTutor at DancesWithMice.info
Wed Apr 21 05:56:24 EDT 2021
On 21/04/2021 19.39, Peter Otten wrote:
> On 21/04/2021 01:10, dn via Tutor wrote:
>> On 21/04/2021 05.27, Peter Otten wrote:
>>> On 20/04/2021 18:48, Manprit Singh wrote:
>>>
>>>> Consider a list given below:
>>>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>>>> i need to remove consecutive duplicates from list lst:
>>>> the answer must be :
>>>>
>>>> [2, 3, 4, 5, 3, 7, 9, 4]
>>>>
>>>> The code that i have written to solve it, is written below:
>>>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>>>> ls = lst[1:]+[object()]
>>>> [x for x, y in zip(lst, ls) if x != y]
>>>>
>>>> The list comprehension gives the desired result. just need to know if
>>>> this program can be done in a more readable and less complex way.
>>
>>
>>> The itertools module has many tools that you can use to deal with
>>> problems like the above:
>> ...
>>
>>>>>> from itertools import groupby
>>>>>> wanted == [k for k, g in groupby(lst)]
>>> True
>>>
>>> Pick your favourite ;)
>>
>>
>> Whilst, admiring your (@Manprit) one-liner and @Peter's grasp of the
>> itertools library; am imagining yellow, if not red, flags.
>>
>> Taking the term "less complex", let's ask: how many programmers
>> (including yourself in six months' time) will be able to scan any of
>> these and conclude/deduce that their objective is the removal of
>> consecutive duplicates?
>>
>> At the very least, please enclose your 'brilliance' within a function,
>> which will give it a (better than a comment) label/name, eg:
>>
>> def remove_consecutive_duplicates( source ):
>>     return [k for k, g in groupby(source)]
>>
>>
>> Ultimately, I favor @Alan's "explicit form" approach (shhhh! Don't tell
>> him, else he'll fall off his chair...).
>>
>> Here's me dusting-off an old solution to this problem, which we used to
>> give ComSc students - updated to (I think) v3.9+:
>>
>>
>>
>> from typing import Iterable
>>
>> lst = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4]
>> result = [2, 3, 4, 5, 3, 7, 9, 4]
>>
>> def remove_consecutive_duplicates( source:Iterable )->list:
>>     """Clean input iterable (containing any data-types),
>>     removing consecutive duplicates.
>>     """
>>     cleaned = list()
>>     current_value = None
>>     for this_element in source:
>>         if this_element != current_value:
>>             cleaned.append( this_element )
>>             current_value = this_element
>>     return cleaned
>>
>> print( "Correct?", result == remove_consecutive_duplicates( lst ) )
>>
>>
>> Plus:
>>
>> remove_consecutive_duplicates( "abba" ) == ['a', 'b', 'a']
>
> I am going to suggest you change that into a generator, like
>
> def remove_consecutive_duplicates(items):
>     """Remove consecutive duplicates.
>
>     >>> list(remove_consecutive_duplicates(""))
>     []
>     >>> list(remove_consecutive_duplicates("abba"))
>     ['a', 'b', 'a']
>     >>> list(remove_consecutive_duplicates(
>     ...     [1, 1.0, "x", "x", "x", "y", 2.0, 2])
>     ... )
>     [1, 'x', 'y', 2.0]
>     >>> list(remove_consecutive_duplicates(iter("aabbbc")))
>     ['a', 'b', 'c']
>     >>> list(remove_consecutive_duplicates([None]*3))
>     [None]
>     """
>     it = iter(items)
>     try:
>         prev = next(it)
>     except StopIteration:  # I hate you, Chris A. ;)
>         return
>     yield prev
>     for item in it:
>         if prev != item:
>             yield item
>             prev = item

Like it!
(not sure why we're beating-up Chris though)

...except that the OP seemed to want a list as the result.

That said, I've adopted a policy of converting older utility-functions
to generators.

Thus, new extension to the question:

Received-wisdom says that the generator will require less storage
(although with the external/surrounding list() I'm wondering if that is
true in-fact in this (rather artificial) case); but which approach
executes faster?

a/ appended list,
b/ generator - with the surrounding list(),
c/ generator - without the surrounding list().
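
One way to look for an answer is to time the three cases with timeit.
What follows is only a rough sketch, not a definitive benchmark. The
names dedupe_list, dedupe_gen and consume are made up for this
illustration, as are the test data size and the repeat count. The list
version uses a private sentinel instead of None (my change, so that a
leading None is not lost), and case c/ is drained with
collections.deque(maxlen=0), so that the generator actually runs but
nothing is stored. Absolute numbers will vary by machine and Python
version, so none are quoted here.

```python
import timeit
from collections import deque


def dedupe_list(source):
    """a/ Build the result by appending to a list."""
    cleaned = []
    previous = object()  # assumption: private sentinel, equal to no element
    for element in source:
        if element != previous:
            cleaned.append(element)
            previous = element
    return cleaned


def dedupe_gen(items):
    """b/ and c/ Yield elements, skipping consecutive duplicates."""
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return
    yield prev
    for item in it:
        if prev != item:
            yield item
            prev = item


def consume(iterable):
    """Exhaust an iterable without keeping any of its elements."""
    deque(iterable, maxlen=0)


# Hypothetical test data: the OP's list repeated to make timings measurable.
data = [2, 3, 3, 4, 5, 5, 3, 7, 9, 9, 4] * 1000
runs = 20

timings = {
    "a/ appended list": timeit.timeit(lambda: dedupe_list(data), number=runs),
    "b/ generator + list()": timeit.timeit(
        lambda: list(dedupe_gen(data)), number=runs
    ),
    "c/ generator, no list()": timeit.timeit(
        lambda: consume(dedupe_gen(data)), number=runs
    ),
}

for label, seconds in timings.items():
    print(f"{label:25} {seconds:.4f}s for {runs} runs")
```

Note that case c/ only saves storage if the caller really does process
the items one at a time; as soon as the results are collected into a
list, b/ holds the same data as a/.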
--
Regards,
=dn