delete duplicates in list
Alex Martelli
aleax at aleax.it
Thu Oct 30 05:20:37 EST 2003
Hannu Kankaanp?? wrote:
...
>> So - he'll have a list anyway.
>
> So I can't agree with this. You don't know if his Python virtual machine
> is a group of monkeys. Python is supposed to be a high level language.
So, in that case those monkeys would surely be perched up on tall trees.
>> Which is of course reasonable, as the check for existence in the set
>> might be performed in O(ln n) instead of O(n).
>
> Actually the Set in sets module has an average lookup of O(1), worst
> case O(n) (not 100% sure of worst case, but 99% sure). It's been
Hmmm -- could you give an example of that worstcase...? a _full_
hashtable would give such behavior, but Python's dicts always ensure
the underlying hashtables aren't too full...
> implemented with dictionaries, which in turn are hash tables.
Yep - check it out...:
[alex at lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(100))'
'7 in x'
100000 loops, best of 3: 2.3 usec per loop
[alex at lancelot bo]$ timeit.py -c -s'import sets'
-s'x=sets.Set(xrange(1000))' '7 in x'
100000 loops, best of 3: 2.3 usec per loop
[alex at lancelot bo]$ timeit.py -c -s'import sets'
-s'x=sets.Set(xrange(10000))' '7 in x'
100000 loops, best of 3: 2.2 usec per loop
[alex at lancelot bo]$ timeit.py -c -s'import sets'
-s'x=sets.Set(xrange(100000))' '7 in x'
100000 loops, best of 3: 2.2 usec per loop
[alex at lancelot bo]$ timeit.py -c -s'import sets'
-s'x=sets.Set(xrange(1000000))' '7 in x'
100000 loops, best of 3: 2.3 usec per loop
see the pattern...?-)
Same when something is NOT there, BTW:
[alex at lancelot bo]$ timeit.py -c -s'import sets'
-s'x=sets.Set(xrange(1000000))' 'None in x'
100000 loops, best of 3: 2.3 usec per loop
etc, etc.
More information about the Python-list
mailing list