[Python-ideas] random.choice on non-sequence

Chris Barker - NOAA Federal chris.barker at noaa.gov
Wed Apr 13 11:31:15 EDT 2016


> There's an upper limit on how dense a CPython dict or set can become
> (the load factor doesn't exceed 2/3), but no lower limit.  For
> example, it's easy to end up with a dict holding a single entry hiding
> among millions of empty slots (dicts are never resized on key
> deletion, only on key insertion).

Easy, yes. Common? I wonder. If it were common then wouldn't there be
good reason to resize the hash table when that occurred? Aside from
being able to select random items, of course...

-CHB


>
>> ...
>> BTW, isn't it impossible to randomly select from an infinite iterable anyway?
>
> Of course, but it is possible to do uniform random selection, in one
> pass using constant space and in linear time, from an iterable whose
> length isn't known in advance (simple case of "reservoir sampling").


More information about the Python-ideas mailing list