R: unique items in lists
Remco Gerlich
scarblac at pino.selwerd.nl
Tue Mar 27 08:09:12 EST 2001
<micampe at f2s.com> wrote in comp.lang.python:
> Here is what I'm doing now:
>
> for w in words:
>     while words.count(w) > 1:
>         words.remove(w)
>
> comments?
First, you modify a list that you are looping over. That can give
unpredictable results, but in this case it seems to work.
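A small illustration of why that's risky (a made-up example, not your code): when you remove an item, everything after it shifts left one place, and the loop's internal index skips over the next element.

>>> l = [1, 1, 2]
>>> for x in l:
...     print(x)
...     l.remove(x)
...
1
2

The second 1 is never visited at all. Your version seems to get away with it because the while loop cleans up all the duplicates before moving on.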
Second, it will take a long time. 'words.remove(w)' takes time proportional to the length of the list, and so does 'words.count(w)'; you repeat that pair for every duplicate of a word, and the whole thing is repeated for every word in the list. In total it takes time roughly proportional to N^3 (for a list of length N), I think.
The dictionary method does one dictionary operation per word, and then a single dict.keys() call, which takes time proportional to the number of unique words. In total it takes time proportional to N.
For a longer list, N^3 will become much much bigger than N, so your solution
is slower, unfortunately.
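For reference, the dictionary version from my earlier post looked roughly like this (a sketch from memory, the function name here is made up):

def unique(words):
    d = {}
    for word in words:
        d[word] = 1        # duplicate words just overwrite the same key
    return d.keys()        # one entry per unique word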
My function returns a new list and doesn't change the old one; if you want to change the list in place instead, you can do that with:
def make_unique_list(l):
    dict = {}
    for word in l:
        dict[word] = 1
    l[:] = dict.keys()   # Slice assignment changes the old list
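For example (the exact order of the result isn't defined, since dictionaries don't keep their keys in any particular order):

>>> words = ['spam', 'egg', 'spam', 'ham', 'egg']
>>> make_unique_list(words)
>>> words
['egg', 'ham', 'spam']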
Try them both out, I'd say :). For smaller lists they're both quick, but
when you get to lengths over a few thousand, the difference is obvious.
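Something like this makes for a quick comparison (the test data and sizes are just made up for illustration):

import time

def unique_by_removal(words):
    words = words[:]                  # work on a copy, same algorithm as above
    for w in words:
        while words.count(w) > 1:
            words.remove(w)
    return words

def unique_by_dict(words):
    d = {}
    for w in words:
        d[w] = 1
    return d.keys()

# a few thousand words with lots of duplicates
data = ['word%d' % (i % 500) for i in range(5000)]

t0 = time.time()
unique_by_removal(data)
t1 = time.time()
unique_by_dict(data)
t2 = time.time()

print('count/remove method: %.2f seconds' % (t1 - t0))
print('dictionary method:   %.4f seconds' % (t2 - t1))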
--
Remco Gerlich