[Tutor] Best method for filtering lists in lists...
John Fouhy
john at fouhy.net
Wed Dec 20 02:59:16 CET 2006
On 17/12/06, Alan Gauld <alan.gauld at btinternet.com> wrote:
> As for Luke's point about the speed of conversion, I'm not
> sure how that would stack up. I have a gut feel that it might
> be faster than trying to compare the lists element by
> element, that sounds like an algebraic expansion with list
> size to me, a hashed set sounds like it should be faster.
> But as with all such things, if speed matters: measure
> and profile it.
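Agreed on measuring. For what it's worth, a minimal timeit sketch looks like this (the function and test data here are placeholders of my own, not anyone's benchmark):

```python
import timeit

def dedupe_via_set(lst):
    # the tuple/set conversion approach under discussion:
    # inner lists become hashable tuples, the set drops duplicates
    return [list(y) for y in set(tuple(x) for x in lst)]

data = [[i % 10, i % 7] for i in range(1000)]
# seconds for 100 calls on the same data; swap in another
# implementation with the same signature to compare
elapsed = timeit.timeit(lambda: dedupe_via_set(data), number=100)
```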
Here's my attempt at a "smarter" way of removing duplicates:
(note that this code assumes 'None' does not occur in your lists)
def nodupes(lst):
    trie = {}
    for l in lst:
        addToTrie(l, trie)
    return unTrie([], trie)
def addToTrie(lst, trie):
    curr = trie
    for i, elem in enumerate(lst):
        try:
            curr = curr[elem]
        except KeyError:
            # diverged from the stored entries: add the rest of the list
            for e in lst[i:]:
                curr[e] = {}
                curr = curr[e]
            break
    # mark the end of the list; doing this after the loop also handles
    # lists that are prefixes of (or equal to) previously-added lists
    curr[None] = None
def unTrie(prefix, trie):
    lst = []
    for elem in trie:
        if elem is None:
            lst.append(prefix)
        else:
            lst.extend(unTrie(prefix + [elem], trie[elem]))
    return lst
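To illustrate the representation (my own example, not output from the code above): after adding [1, 2] and then [1, 3], the trie is a nested dict in which a None key marks where a stored list ends:

```python
# shape of the trie after adding [1, 2] and then [1, 3];
# the shared prefix [1] is stored only once
trie = {1: {2: {None: None}, 3: {None: None}}}
```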
According to timeit, this is over five times slower than the one-line solution:
def nodupes2(lst):
    return [list(y) for y in set(tuple(x) for x in lst)]
Finally, neither of these is order-preserving...
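An order-preserving version is straightforward if you track which tuples you have already seen (a sketch; nodupes_ordered is my own name for it):

```python
def nodupes_ordered(lst):
    # keep the first occurrence of each inner list, in original order
    seen = set()
    result = []
    for x in lst:
        key = tuple(x)  # lists aren't hashable, tuples are
        if key not in seen:
            seen.add(key)
            result.append(x)
    return result
```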
--
John.