[Tutor] Ways of removing consecutive duplicates from a list
Peter Otten
__peter__ at web.de
Mon Jul 18 03:15:14 EDT 2022
On 17/07/2022 18:59, avi.e.gross at gmail.com wrote:
> You could make the case, Peter, that you can use anything as a start that
> will not likely match in your domain. You are correct if an empty string may
> be in the data.
>
> Now an object returned by object is pretty esoteric and ought to be rare and
> indeed each new object seems to be individual.
>
> val=object()
>
> [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
> -->> [1, 2, <object object at 0x00000176F33150D0>, 3]
>
> So the only way to trip this up is to use the same object or another
> reference to it where it is silently ignored.
When you want a general solution for removing consecutive duplicates,
you can put the line
    val = object()
into the deduplication function, which makes it *very* unlikely that val
will also be passed as an argument to that function.
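A minimal sketch of that idea; the function name dedupe_consecutive is only an illustration and does not come from the thread. Because the sentinel is created on every call, the caller cannot already hold a reference to it:

```python
def dedupe_consecutive(items):
    """Return a list with consecutive duplicates removed."""
    # Fresh sentinel per call; no item in `items` can be this object.
    val = object()
    # The walrus target `val` binds in the function scope, so each kept
    # item becomes the value the next item is compared against.
    return [(val := item) for item in items if item != val]


print(dedupe_consecutive([1, 1, 2, 3, 3, 3, 1]))  # [1, 2, 3, 1]
print(dedupe_consecutive(["", "", "a"]))          # ['', 'a']
```

The second call shows that an empty string in the data is no longer a problem, since the start value is not a string at all.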
To quote myself:
> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]
What did I mean by "pathological"?
One problematic case would be an object that compares equal to everything,
    class A:
        def __eq__(self, other): return True
        def __ne__(self, other): return False
but that is likely to break the algorithm anyway.
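To illustrate how such an object breaks the algorithm, here is a small sketch. The names AlwaysEqual and with_sentinel are made up for this example; the comprehension is the one quoted above:

```python
class AlwaysEqual:
    """Hypothetical class that claims equality with every other object."""
    def __eq__(self, other): return True
    def __ne__(self, other): return False
    def __repr__(self): return "AlwaysEqual()"


def with_sentinel(items):
    val = object()
    return [(val := item) for item in items if item != val]


# The object never compares unequal to anything, not even the sentinel,
# so it is dropped although it is not a duplicate of its neighbour.
print(with_sentinel([AlwaysEqual(), 1, 2]))  # [1, 2]
# It also never becomes `val`, so the second 1 is compared with the first.
print(with_sentinel([1, AlwaysEqual(), 1]))  # [1]
```

In both runs the AlwaysEqual instance silently disappears from the result, so no choice of start value would rescue this case.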
Another problematic case: objects that only implement comparison with
other objects of the same type. For these, deduplication will work if you
avoid the out-of-band value:
>>> class A:
        def __init__(self, name):
            self.name = name
        def __eq__(self, other): return self.name == other.name
        def __ne__(self, other): return self.name != other.name
        def __repr__(self): return f"A(name={self.name})"
>>> prev = object()
>>>
>>> [(prev:=item) for item in map(A, "abc") if item != prev]
Traceback (most recent call last):
  File "<pyshell#57>", line 1, in <module>
    [(prev:=item) for item in map(A, "abc") if item != prev]
  File "<pyshell#57>", line 1, in <listcomp>
    [(prev:=item) for item in map(A, "abc") if item != prev]
  File "<pyshell#54>", line 5, in __ne__
    def __ne__(self, other): return self.name != other.name
AttributeError: 'object' object has no attribute 'name'
>>> def rm_duplicates(iterable):
        it = iter(iterable)
        try:
            last = next(it)
        except StopIteration:
            return
        yield last
        for item in it:
            if item != last:
                yield item
                last = item
>>> list(rm_duplicates(map(A, "aabccc")))
[A(name=a), A(name=b), A(name=c)]
>>>
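A possible variation, not proposed in the thread: keep the sentinel but test for it by identity before comparing, so the items' own __ne__ is never called with it. The name rm_duplicates_sentinel is hypothetical, and the class A is repeated only to make the sketch self-contained:

```python
class A:
    def __init__(self, name):
        self.name = name
    def __eq__(self, other): return self.name == other.name
    def __ne__(self, other): return self.name != other.name
    def __repr__(self): return f"A(name={self.name})"


def rm_duplicates_sentinel(iterable):
    # Hypothetical variant: the sentinel only marks "no item seen yet".
    missing = object()
    last = missing
    for item in iterable:
        # Identity check first: `item != last` is evaluated only once
        # `last` holds a real item, so A.__ne__ never sees the sentinel.
        if last is missing or item != last:
            yield item
            last = item


print(list(rm_duplicates_sentinel(map(A, "aabccc"))))
# [A(name=a), A(name=b), A(name=c)]
```

This trades the try/except around next() for one extra identity test per item; which reads better is a matter of taste.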