item access time: sets v. lists
duncan.booth at invalid.invalid
Wed Oct 4 22:14:33 CEST 2006
"sjdevnull at yahoo.com" <sjdevnull at yahoo.com> wrote:
> A set, on the other hand, uses a hash table so finding an element
> takes constant time (it's one hash lookup, independent of the size of
> the set)--and determining an item isn't there is likewise constant
One hash calculation, but there could be collisions. Worst case for an n
element dict/set could be that it takes n attempts to find a value,
although unless you try to do it deliberately by ensuring that the keys
have identical hash values this won't happen in practice.
Paul McGuire wrote:
> Thanks, I stand corrected. How do they know how big a hash table to
> use? I think this is an interesting problem.
If I read the code correctly:
Minimum size is 8. Whenever more than 2/3 of the slots are in use
(including slots where the element has been deleted) the dictionary is
resized to the smallest power of 2 (greater than 8) which is greater than 4
times the number of currently occupied slots (or 2 times the number of
occupied slots when more than 50000 slots are occupied). This can reduce
the size if a lot of elements have been deleted.
More information about the Python-list