finding items that occur more than once in a list
fakeaddress at nowhere.org
Fri Mar 21 10:17:10 CET 2008
Arnaud Delobelle wrote:
> Bryan Olson wrote:
>> Arnaud Delobelle offered a good Wikipedia link, and for more background
>> look up "amortized analysis.
> Hrvoje Niksic provided the link :).
Oops, careless thread-following. Hrvoje Niksic it was.
> I still think two unrelated
> things are being compared in this thread when people say that method A
> (using dictionaries / sets) is O(n) and method B (sorting the list)
> Isn't it the case that:
> | Worst case | Average case
> Method A | O(n^2) | O(n)
> Method B | O(nlogn) | O(nlogn)
> Which method is the best would then be determined by the distribution
> of the hash values of your data, and whether you want to have a
> guarantee the method will have a less-than-quadratic behaviour.
If we exclude the case where an adversary is choosing the keys, the
chance of a seriously degenerate case in the hashing method is so
remote that we do should not worry about it. Those who insist on
ranking by worst-case behavior have probably abandoned Python long
ago, as it uses those hashed 'dict' things everywhere. Of course
they've also abandoned OS's with demand-paged virtual memory and
More information about the Python-list