Re: [Python-ideas] random.choice on non-sequence

13 Apr 2016


      [Chris Barker - NOAA Federal <chris.barker@noaa.gov>]
...
...
Anyway, the hash table has gaps, but wouldn't it be sufficiently
random to pick a random index, and if it's a gap, pick another one?
That would be a correct implementation, but ...
...
I suppose in theory, this could be in infinite process, but in practice,
it would be O(1) with a constant of two or three... Better than
iterating through on average half of the keys.
There's an upper limit on how dense a CPython dict or set can become
(the load factor doesn't exceed 2/3), but no lower limit.  For
example, it's easy to end up with a dict holding a single entry hiding
among millions of empty slots (dicts are never resized on key
deletion, only on key insertion).
...
...
BTW, isn't it impossible to randomly select from an infinite iterable anyway?
Of course, but it is possible to do uniform random selection, in one
pass using constant space and in linear time, from an iterable whose
length isn't known in advance (simple case of "reservoir sampling").

Re: [Python-ideas] random.choice on non-sequence

Tim Peters