fastest method to choose a random element

Paddy paddy3118 at googlemail.com
Sat Jan 5 01:37:16 EST 2008


On Jan 4, 7:55 pm, c... at mailinator.com wrote:
>   Hello,
>   This is a question for the best method (in terms of performance
> only) to choose a random element from a list among those that satisfy
> a certain property.
>
>   This is the setting: I need to pick from a list a random element
> that satisfies a given property. All or none of the elements may have
> the property. Most of the time, many of the elements will satisfy the
> property, and the property is a bit expensive to evaluate. Chance of
> having the property are uniform among elements.
>
>   A simple approach is:
>
> import random
> def random_pick(a_list,property):
>     '''Returns a random element from a list that has the property
>
>     Returns None if no element  has the property
>     '''
>     random.shuffle(a_list)
>     for i in a_list:
>         if property(i): return i
>
> but that requires to shuffle the list every time.
>
>  A second approach, that works if we know that at least one element of
> the list has the property, is:
>
> import random
> def random_pick(a_list,property):
>     '''Returns a random element from a list that has the property
>
>     Loops forever if no element has the property
>     '''
>     while 1:
>         i=random.choice(a_list)
>         if property(i): return i
>
> which is more efficient (on average) if many elements of the list have
> the property and less efficient if only few elements of the list has
> the property (and goes crazy if no element has the property)
>
> Yet another one:
>
> import random
> def random_pick(a_list,property):
>     '''Returns a random element from a list that has the property
>     '''
>     b_list=[x for  x in a_list if property(x)]
>     try:
>         return random.choice(b_list)
>     finally: return None
>
> but this one checks the property on all the elements, which is no
> good.
>
> I don't need strong random numbers, so a simple solution like:
> import random
> globalRNG=random.Random()
>
> def random_pick(a_list,property):
>     '''Returns a random element from a list that has the property
>
>     Works only if len(a_list)+1 is prime: uses Fermat's little theorem
>     '''
>     a=globalRNG(1,len(a_list))
>     ind=a
>     for i in xrange(len(a_list)):
>         x=a_list[a-1]
>         if property(x):return x
>         ind*=a
>
> but this works only if len(a_list)+1 is prime!!! Now this one could be
> saved if there were an efficient method to find a prime number given
> than a number n but not very much greater...
>
> Any other ideas? Thanks everybody

Caching might help.

If random_pick is called several times with the same list(s) then
cache the result of
 [property(i) for i in a_list] against a_list.

If random_pick is called several times with list(s) whith multiple
instances of list items then cache
 property(i) against i for i in a_list .

You could do both.

You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.

- Paddy.



More information about the Python-list mailing list