[Python-ideas] random.choice on non-sequence

Tim Peters tim.peters at gmail.com
Wed Apr 13 14:04:46 EDT 2016


[Tim]
>> There's an upper limit on how dense a CPython dict or set can become
>> (the load factor doesn't exceed 2/3), but no lower limit.  For
>> example, it's easy to end up with a dict holding a single entry hiding
>> among millions of empty slots (dicts are never resized on key
>> deletion, only on key insertion).

[Chris Barker - NOAA Federal <chris.barker at noaa.gov>]
> Easy, yes. Common? I wonder.

Depends on the app.  I doubt it's common across all apps.


> If it were common then wouldn't there be good reason to resize the
> hash table when that occurred? Aside from being able to select random
> items, of course...

Shrinking the table would have no primary effect on speed of
subsequent access or deletion - it's O(1) expected regardless.
Shrinking the table is expensive when it's done (the entire object has
to be traversed, and the items reinserted one by one, into a smaller
dict/set).

Regardless, I'd be loathe to add a conditional branch on every
deletion to check.  Nothing is free, and the check would be a waste of
cycles every time for apps that don't care about O(1) uniform random
dict/set selection.  Note that I don't care about the "wasted" memory,
though.  But then I don't care about O(1) uniform random dict/set
selection either ;-)


More information about the Python-ideas mailing list