I think you all should get together and come up with a good implementation, and then petition Raymond Hettinger. Or maybe there is an existing open source third-party project that has code you can copy? I don’t recall whether the random module has a C accelerator, but if it does, you should come up with C code as well.

—Guido

On Mon, Jul 13, 2020 at 05:40 David Mertz <mertz@gnosis.cx> wrote:

If we get this function (which I would like), the version with k items (default 1) is much better. Some iterators cannot be repeated at all, so not only is it slower to call multiple times if you need k>1, it's impossible.
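As a rough illustration of the k-item variant, here is a minimal sketch of plain reservoir sampling that consumes the iterable exactly once, so it also works on one-shot iterators. The name sample_from_iterable and the rng parameter are hypothetical, not an existing or proposed API of the random module.

```python
import random
from itertools import islice


def sample_from_iterable(iterable, k=1, rng=random):
    """Return k items chosen uniformly without replacement (hypothetical helper).

    The iterable is consumed exactly once, so one-shot iterators work.
    """
    iterator = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(iterator, k))
    if len(reservoir) < k:
        raise ValueError("iterable has fewer than k items")
    # Item number `count` (1-based) replaces a random slot with probability k/count.
    for count, item in enumerate(iterator, start=k + 1):
        j = rng.randrange(count)
        if j < k:
            reservoir[j] = item
    return reservoir


# Usage: a generator can only be iterated once, yet a single call yields 3 items.
squares = (n * n for n in range(100))
print(sample_from_iterable(squares, k=3))
```

Note that the returned list is not shuffled: while the input has exactly k items, they come back in their original order.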

On Mon, Jul 13, 2020, 8:37 AM David Mertz <mertz@gnosis.cx> wrote:
This is an inefficient form of reservoir sampling. The optimized version does not need to make a random inclusion decision for every element; instead it draws random, geometrically distributed skip lengths and jumps over that many elements at a time, which avoids most of the inclusion decisions.

Obviously, all items must be iterated over no matter what, but if randrange() is significant compared to the cost of next(), the skip-then-decide version is much preferred, especially as size grows.
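For concreteness, a minimal sketch of one skip-based variant follows. It is modelled on the method usually called "Algorithm L", which is an assumption on my part about which optimized version is meant. The name sample_with_skips and the rng parameter are hypothetical, and floating-point edge cases are handled only minimally.

```python
import math
import random
from itertools import islice


def sample_with_skips(iterable, k=1, rng=random):
    """Reservoir-sample k items using random skip lengths (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    iterator = iter(iterable)
    reservoir = list(islice(iterator, k))
    if len(reservoir) < k:
        raise ValueError("iterable has fewer than k items")
    exhausted = object()  # sentinel marking the end of the input
    # 1.0 - rng.random() lies in (0.0, 1.0], so log() never receives zero.
    w = math.exp(math.log(1.0 - rng.random()) / k)
    while True:
        if w < 1.0:
            # Number of items to pass over before the next replacement.
            skip = int(math.log(1.0 - rng.random()) / math.log1p(-w))
        else:
            skip = 0
        # Every skipped item is still consumed with next(), but no random
        # numbers are drawn for it.
        item = next(islice(iterator, skip, skip + 1), exhausted)
        if item is exhausted:
            return reservoir
        reservoir[rng.randrange(k)] = item
        w *= math.exp(math.log(1.0 - rng.random()) / k)


# Usage: draw 5 items from a one-shot generator.
print(sample_with_skips((n for n in range(1_000_000)), k=5))
```

Each replacement costs three random draws, but replacements become rare as the input grows, so far fewer random numbers are needed than with one draw per element.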

On Mon, Jul 13, 2020, 7:53 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I posted this in the thread about indexing dict items but it seems to
have been buried there so I'm starting a new thread.

Maybe there could be a function in the random module for making a
random choice from an arbitrary (finite) iterable. This implementation
can choose a random element from an iterable by fully iterating over
it, so it is O(N) in terms of CPU operations but O(1) for memory usage:

[...]
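The quoted code is elided above. Purely as an illustration of the idea described (one pass, O(N) time, O(1) memory), a sketch might look like the following. This is not the original poster's code, and the name choice_from_iterable and the rng parameter are hypothetical.

```python
import random


def choice_from_iterable(iterable, rng=random):
    """Return one uniformly random element of a finite iterable (hypothetical helper)."""
    iterator = iter(iterable)
    try:
        chosen = next(iterator)
    except StopIteration:
        raise ValueError("cannot choose from an empty iterable") from None
    # The item seen at position `count` (1-based) replaces the current pick
    # with probability 1/count, which leaves every item equally likely.
    for count, item in enumerate(iterator, start=2):
        if rng.randrange(count) == 0:
            chosen = item
    return chosen


# Usage: pick a random key from a dict without building a list of its keys.
ages = {"alice": 31, "bob": 27, "carol": 45}
print(choice_from_iterable(ages))
```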