For loop searching takes too long!

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Jan 28 20:00:52 EST 2010


On Thu, 28 Jan 2010 15:52:14 -0800, elsa wrote:

> Now, what I need to do is randomly choose one myList[i], however the
> distribution of my random choice needs to be proportional to the values
> of myList[i][0]. So, for this list above, I'd have a much higher chance
> of choosing myList[0] than myList[1].
>
> Here is how I'm doing it at the moment:
> 
> def chooseI(myList):
> 	mySum=0
> 	choice = random.choice(range(1,sum([i[0] for i in myList])+1))
>       for i in range(len(myList)):
> 		mySum+=myList[i][0]
> 		if mySum>=choice:
> 			return i
> 			break

This isn't related to your problem, but you don't need the break after 
the return -- the return will leave the loop and the function, and the 
break will never be reached.

You could probably speed the code up a little by changing all the calls 
to range into xrange. range actually generates a list of integers, which 
is time consuming, while xrange is a lazy generator which just produces 
each integer one at a time.  (You can ignore this if you are using Python 
3.0 or 3.1.)

Another small optimization you could use is to use a generator expression 
instead of a list comprehension inside the call to sum. That should 
generate the sum lazily, instead of calculating a giant list first and 
then adding it up.

But I'd try something like this:

# Untested.
def chooseI(myList):
    total = sum(i[0] for i in myList)
    mySum = 0
    choice = random.randrange(total)
    for i in myList:
        mySum += i[0]
        if mySum >= choice:
            return i



-- 
Steven



More information about the Python-list mailing list