[Python-ideas] random.choice on non-sequence
Tim Peters
tim.peters at gmail.com
Wed Apr 13 14:04:46 EDT 2016
[Tim]
>> There's an upper limit on how dense a CPython dict or set can become
>> (the load factor doesn't exceed 2/3), but no lower limit. For
>> example, it's easy to end up with a dict holding a single entry hiding
>> among millions of empty slots (dicts are never resized on key
>> deletion, only on key insertion).
[Chris Barker - NOAA Federal <chris.barker at noaa.gov>]
> Easy, yes. Common? I wonder.
Depends on the app. I doubt it's common across all apps.
> If it were common then wouldn't there be good reason to resize the
> hash table when that occurred? Aside from being able to select random
> items, of course...
Shrinking the table would have no primary effect on speed of
subsequent access or deletion - it's O(1) expected regardless.
Shrinking the table is expensive when it's done (the entire object has
to be traversed, and the items reinserted one by one, into a smaller
dict/set).
Regardless, I'd be loathe to add a conditional branch on every
deletion to check. Nothing is free, and the check would be a waste of
cycles every time for apps that don't care about O(1) uniform random
dict/set selection. Note that I don't care about the "wasted" memory,
though. But then I don't care about O(1) uniform random dict/set
selection either ;-)
More information about the Python-ideas
mailing list