[Python-ideas] random.choice on non-sequence

Guido van Rossum guido at python.org
Tue Apr 12 22:11:38 EDT 2016


This is no good. Who wants a choice() that is O(1) on sequences but
degrades to O(N) if the argument is a set?

I see the current behavior as a feature: it works when the argument is
indexable, and when it isn't, it fails cleanly, prompting the user to
use their brain and decide what's right. Maybe you're programming a
toy example. Then you can just call random.choice(list(x)) and move
on. Or maybe you're trying to do something serious -- then it behooves
you to copy the set into a list variable once and then repeatedly
choose from that list, or maybe you should have put the data in a list
in the first place.

But putting the toy solution in the stdlib is a bad idea, and so is
putting a bad (O(N)) algorithm there.

So the status quo wins for a reason!

--Guido

On Tue, Apr 12, 2016 at 7:02 PM, Chris Angelico <rosuav at gmail.com> wrote:
> On Wed, Apr 13, 2016 at 7:40 AM, Rob Cliffe <rob.cliffe at btinternet.com> wrote:
>> And here is my proposed (tested) version:
>>
>>     def choice(self, it):
>>         """Choose a random element from a non-empty iterable."""
>>         try:
>>             it[0]
>>         except IndexError:
>>             raise IndexError('Cannot choose from an empty sequence')
>>         except TypeError:
>>             it = tuple(it)
>>         try:
>>             i = self._randbelow(len(it))
>>         except ValueError:
>>             raise IndexError('Cannot choose from an empty iterable')
>>         return it[i]
>>
>>
>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
>> generator.
>> Obviously the generator has to be 'fresh' (i.e. any previously consumed
>> values will be excluded from the choice) and 'throw-away' (random.choice
>> will exhaust it).
>> But it means you can write code like
>>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
>> "Pythonic" to me!
>
> Small point of order: Pretty much everything discussed here on
> python-ideas is about Python 3. It's best to make sure your code works
> with the latest CPython (currently 3.5), as a change like this would
> be landing in 3.6 at the earliest. So what I'd be looking at is this:
>
> random.choice(x for x in range(10) if x%3)
>
> AFAIK this doesn't change your point at all, but it is worth being
> careful of; Python 3's range object isn't quite the same as Python 2's
> xrange, and it's possible something might "just work". (For the
> inverse case, "if x%3 == 0", you can simply use
> random.choice(random(0, 10, 3)) to do what you want.)
>
> I don't like the proposed acceptance of arbitrary iterables. In the
> extremely rare case where you actually do want that, you can always
> manually wrap it in list() or tuple(). But your original use-case does
> have merit:
>
>> It surprised me a bit the first time I realised that random.choice did not work on a set.
>
> A set has a length, but it can't be indexed. It should be perfectly
> reasonable to ask for a random element out of a set! So here's my
> counter-proposal: Make random.choice accept any iterable with a
> __len__.
>
>     def choice(self, coll):
>         """Choose a random element from a non-empty collection."""
>         try:
>             i = self._randbelow(len(coll))
>         except ValueError:
>             raise IndexError('Cannot choose from an empty collection')
>         try:
>             return coll[i]
>         except TypeError:
>             for _, value in zip(range(i+1), coll):
>                 pass
>             return value
>
> Feel free to bikeshed the method of iterating part way into a
> collection (enumerate() and a comparison? call iter, then call next
> that many times, then return next(it)?), but the basic concept is that
> you have to have a length and then you iterate into it that far.
>
> It's still not arbitrary iterables, but it'll handle sets and dict
> views. Handling dicts directly may be a little tricky; since they
> support subscripting, they'll either raise KeyError or silently return
> a value, where the iteration-based return value would be a key.
> Breaking this out into its own function would be reliable there (take
> out the try/except and just go straight into iteration).
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/



-- 
--Guido van Rossum (python.org/~guido)


More information about the Python-ideas mailing list