[Python-ideas] Uniquify attribute for lists

Andrew Barnert abarnert at yahoo.com
Fri Nov 16 21:14:17 CET 2012


From: Andrew Barnert <abarnert at yahoo.com>
Sent: Fri, November 16, 2012 11:15:18 AM

>All that being said, if getting this right is difficult enough that a bunch of 
>people working together on a blog over 6 years didn't come up with a good 
>version that supports non-hashable elements, maybe a good implementation does 
>belong in the standard library itertools.

Actually, it looks like it's already there. The existing unique_everseen 
function in http://docs.python.org/3/library/itertools.html#itertools-recipes 
(also available from the more-itertools PyPI module at 
http://packages.python.org/more-itertools/api.html#more_itertools.unique_everseen)
is an improvement on this idea.

So, unless someone has done performance tests showing that the suggested 
implementation is significantly faster than unique_everseen (I suppose the 
"__contains__" vs. "in" might make a difference?), and this is a critical 
bottleneck for your app, I think the right way to write this function is:

    import more_itertools
    uniquify = more_itertools.unique_everseen
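
For reference, here is a sketch of what the unique_everseen recipe looks like, written for Python 3 and closely following the itertools documentation recipe (treat it as an approximation, not the exact text of the docs). It shows where the bound seen.__contains__ method is used instead of an "in" expression, and why the elements (or their keys) have to be hashable: they are stored in a set.

```python
from itertools import filterfalse

def unique_everseen(iterable, key=None):
    """Yield unique elements in first-seen order.

    Elements (or key(element), when a key is given) must be hashable,
    because they are remembered in a set.
    """
    seen = set()
    seen_add = seen.add
    if key is None:
        # filterfalse calls the bound method seen.__contains__ directly,
        # rather than evaluating an "element in seen" expression per item.
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
```

For example, list(unique_everseen('AAAABBBCCDAABBB')) gives ['A', 'B', 'C', 'D'], and passing str.lower as the key treats 'c' and 'C' as the same element.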

Unfortunately, it's still not going to work on non-hashable elements. Maybe 
itertools (either the module or the documentation recipe list) needs a version 
that does?
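
One possible shape for such a version, as a minimal sketch only: keep the fast set for hashable elements and fall back to a list with a linear search for anything that raises TypeError when hashed. The name unique_everseen_any is hypothetical; nothing like it exists in itertools or is claimed here to exist in more-itertools. The fallback costs O(n) per lookup for unhashable elements, so it only makes sense when those are rare or the input is small.

```python
def unique_everseen_any(iterable, key=None):
    """Yield unique elements in first-seen order, hashable or not.

    Hypothetical helper: hashable keys go into a set, unhashable keys
    into a list that is searched linearly (relying on == only).
    """
    seen_hashable = set()
    seen_unhashable = []
    for element in iterable:
        k = element if key is None else key(element)
        try:
            if k in seen_hashable:
                continue
            seen_hashable.add(k)
        except TypeError:
            # k is unhashable (e.g. a list or dict): fall back to a list.
            if k in seen_unhashable:
                continue
            seen_unhashable.append(k)
        yield element
```

With this sketch, list(unique_everseen_any([[1], [2], [1], 3, 3, [2]])) gives [[1], [2], 3].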


