The problem with your version is that copying the input is slow if it is large.
Raymond was referencing the top answer here:
http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-rand....
It's also slow though (draws N random values if the input is of length
N).
On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe
It surprised me a bit the first time I realised that random.choice did not work on a set. (One expects "everything" to "just work" in Python! :-) ) There is a simple workaround (convert the set to a tuple or list before passing it) but ...
On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module' on Python-Ideas, one of which was to allow random.choice to work on an arbitrary iterable. Raymond Hettinger commented "It could be generalized to arbitrary iterables (Bentley provides an example of how to do this) but it is fragile (i.e. falls apart badly with weak random number generators) and doesn't correspond well with real use cases." Sven replied "Again, there definitely are real world use cases ..."
I would like to revive this suggestion. I don't understand Raymond's comment. It seems to me that a reasonable implementation is better and more friendly than none, particularly for newbies. This is the source code for random.choice in Python 3.2 (as far as I know it hasn't changed since):
def choice(self, seq): """Choose a random element from a non-empty sequence.""" try: i = self._randbelow(len(seq)) except ValueError: raise IndexError('Cannot choose from an empty sequence') return seq[i]
And here is my proposed (tested) version:
def choice(self, it): """Choose a random element from a non-empty iterable.""" try: it[0] except IndexError: raise IndexError('Cannot choose from an empty sequence') except TypeError: it = tuple(it) try: i = self._randbelow(len(it)) except ValueError: raise IndexError('Cannot choose from an empty iterable') return it[i]
This works on (e.g.) a list/tuple/string, a set, a dictionary view or a generator. Obviously the generator has to be 'fresh' (i.e. any previously consumed values will be excluded from the choice) and 'throw-away' (random.choice will exhaust it). But it means you can write code like random.choice(x for x in xrange(10) if x%3) # this feels quite "Pythonic" to me!
(I experimented with versions that, when passed an object that supported len() but not indexing, viz. a set, iterated through the object as far as necessary instead of converting it to a list. But I found that this was slower in practice as well as more complicated. There might be a memory issue with huge iterables, but we are no worse off using existing workarounds for those.)
Downsides of proposed version: (1) Backward compatibility. Could mask an error if a "wrong" object, but one that happens to be an iterable, is passed to random.choice. There is also slightly different behaviour if a dictionary (D) is passed (an unusual thing to do): The current version will get a random integer in the range [0, len(D)), try to use that as a key of D, and raise KeyError if that fails. The proposed version behaves similarly except that it will always raise "KeyError: 0" if 0 is not a key of D. One could argue that this is an improvement, if it matters at all (error reproducibility). And code that deliberately exploits the existing behaviour, provided that it uses a dictionary whose keys are consecutive integers from 0 upwards, e.g. D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : 'four' } d = random.choice(D) # equivalent, in this example, to "d = random.choice(list(D.values()))" if d == 'zero': ... will continue to work (although one might argue that it deserves not to). (2) Speed - the proposed version is almost certainly slightly slower when called with a sequence. For what it's worth, I tried to measure the difference on my several-years-old Windows PC, but is was hard to measure accurately, even over millions of iterations. All I can say is that it appeared to be a small fraction of a millisecond, possibly in the region of 50 nanoseconds, per call.
Best wishes Rob Cliffe _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido van Rossum (python.org/~guido)