fastest method to choose a random element
Paddy
paddy3118 at googlemail.com
Sat Jan 5 01:37:16 EST 2008
On Jan 4, 7:55 pm, c... at mailinator.com wrote:
> Hello,
> This is a question for the best method (in terms of performance
> only) to choose a random element from a list among those that satisfy
> a certain property.
>
> This is the setting: I need to pick from a list a random element
> that satisfies a given property. All or none of the elements may have
> the property. Most of the time, many of the elements will satisfy the
> property, and the property is a bit expensive to evaluate. Chance of
> having the property are uniform among elements.
>
> A simple approach is:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
>
> Returns None if no element has the property
> '''
> random.shuffle(a_list)
> for i in a_list:
> if property(i): return i
>
> but that requires to shuffle the list every time.
>
> A second approach, that works if we know that at least one element of
> the list has the property, is:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
>
> Loops forever if no element has the property
> '''
> while 1:
> i=random.choice(a_list)
> if property(i): return i
>
> which is more efficient (on average) if many elements of the list have
> the property and less efficient if only few elements of the list has
> the property (and goes crazy if no element has the property)
>
> Yet another one:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
> '''
> b_list=[x for x in a_list if property(x)]
> try:
> return random.choice(b_list)
> finally: return None
>
> but this one checks the property on all the elements, which is no
> good.
>
> I don't need strong random numbers, so a simple solution like:
> import random
> globalRNG=random.Random()
>
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
>
> Works only if len(a_list)+1 is prime: uses Fermat's little theorem
> '''
> a=globalRNG(1,len(a_list))
> ind=a
> for i in xrange(len(a_list)):
> x=a_list[a-1]
> if property(x):return x
> ind*=a
>
> but this works only if len(a_list)+1 is prime!!! Now this one could be
> saved if there were an efficient method to find a prime number given
> than a number n but not very much greater...
>
> Any other ideas? Thanks everybody
Caching might help.
If random_pick is called several times with the same list(s) then
cache the result of
[property(i) for i in a_list] against a_list.
If random_pick is called several times with list(s) whith multiple
instances of list items then cache
property(i) against i for i in a_list .
You could do both.
You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.
- Paddy.
More information about the Python-list
mailing list