[Tutor] Efficiency?

Kent Johnson kent_johnson at skillsoft.com
Thu Jul 29 21:24:50 CEST 2004


OK I'll bite.

First I should say I am not an expert at Python optimization; I tried this 
out for my own learning as much as anything. My understanding is that the 
most important factor in optimizing Python is to move the work into C code 
by calling builtins as much as possible.
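
To see what I mean, here is a tiny standalone sketch (not part of the timing 
runs below, and the exact numbers will of course vary on your machine) 
comparing a hand-rolled minimum, which runs at Python speed, to the builtin 
min(), which does the same walk in C:

import timeit

def py_min(r):
    # pure-Python loop: every comparison happens in interpreted code
    smallest = r[0]
    for val in r[1:]:
        if val < smallest:
            smallest = val
    return smallest

if __name__ == '__main__':
    setup = "from __main__ import py_min; r = range(10000)"
    print 'py_min():', timeit.Timer('py_min(r)', setup).timeit(100)
    print 'min()   :', timeit.Timer('min(r)', setup).timeit(100)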

Performance of these functions is going to vary with the length of the 
input list and the relative frequency with which the minimum value occurs. 
I wrote a test program that uses the integer values of the characters in a 
text file as the list. The text file is just the index file for my web 
page; if you want to use the same data you can get it at 
http://www.kentsjohnson.com/index.html. The resulting list has 1759 numbers 
in it.
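
If you would rather not download my file, any reasonably large text file 
will do. The list is built the way the test program at the end builds it:

data = open('index.html').read()    # substitute your own file name here
data = [ ord(c) for c in data ]     # one small integer per character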

I ran each of these functions 10 times and report the total number of 
seconds the 10 runs took.
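
The timing is done with the timeit module; timeOne() in the complete 
program at the end does the work, and stripped of its correctness check it 
is essentially just this:

import timeit

def timeOne(fn, reps=10):
    # fn is one of the findSmallestNumber* functions; data is the test list.
    # Both are assumed to be defined at module level in __main__.
    setup = "from __main__ import data," + fn.__name__
    t = timeit.Timer(fn.__name__ + '(data)', setup)
    print '%s: %f secs' % (fn.__name__, t.timeit(reps))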

First, the original poster's function:
def findSmallestNumber1(r):
     return [i for i in range(len(r)) if r[i] == reduce( min, r ) ]

As Blake noted, this is O(n^2) and thus very slow for a long list. On my 
machine it took 9.491670 seconds to run 10 times.

The second attempt takes the reduce() call out of the loop. This is MUCH 
faster; it took 0.009332 seconds:
def findSmallestNumber2(r):
     s = reduce( min, r )
     return [i for i in range(len(r)) if r[i] == s ]

There is actually no need for reduce() at all; min() can take a list 
argument directly. This version takes 0.005174 secs:
def findSmallestNumber3(r):
     s = min(r)
     return [i for i in range(len(r)) if r[i] == s ]
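
A quick interactive check that min() and reduce(min, ...) really are 
interchangeable here:

>>> r = [3, 1, 2, 1]
>>> reduce(min, r)
1
>>> min(r)
1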


OK, now we have Blake's version that makes just one pass through the list. 
It takes 0.016333 secs - longer than the list comprehension versions 2 and 3.
BTW this and the previous two functions are all O(n); big-O notation 
disregards constant factors, so one O(n) function can still be several 
times slower than another.

def findSmallestNumber4( r ):
     indices = []
     for index in range(len(r)):
         if len(indices) == 0 or r[index] < r[ indices[0] ]:
             indices = [index]
         elif r[index] == r[ indices[0] ]:
             indices.append( index )
     return indices

findSmallestNumber4() has quite a few inefficiencies itself. The test for 
len(indices)==0 can be eliminated by initializing indices to [0] and 
starting the iteration at 1. The access to r[index] can be cached in a local 
variable, and the accesses to r[indices[0]] can be eliminated by caching 
the smallest value seen so far. This version is faster but still not as 
fast as version 3; it takes 0.007563 secs. There is just too much 
Python-level code here:
def findSmallestNumber5( r ):
     if not r: return []

     indices = [0]
     smallest = r[0]

     for index in range(1, len(r)):
         val = r[index]
         if val < smallest:
             indices = [index]
             smallest = val
         elif val == smallest:
             indices.append( index )
     return indices


Maybe there is a way to avoid iterating in Python at all? There is a 
list.index() method that returns the index of the first occurrence of a 
value, and it takes an optional starting position. What if we use it to 
repeatedly search for the minimum value? This works well, taking only 
0.002769 secs!
def findSmallestNumber6(r):
     smallest = min(r)
     indices = []
     i = -1

     try:
         while True:
             i = r.index(smallest, i+1)
             indices.append(i)
     except ValueError:
         pass

     return indices
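
If the optional second argument to list.index() is new to you, it tells 
index() where to start searching, and a ValueError is raised when the value 
is not found at or after that position - that is what ends the loop above:

>>> r = [3, 1, 2, 1]
>>> r.index(1)        # index of the first 1
1
>>> r.index(1, 2)     # first 1 at or after position 2
3
>>> r.index(1, 4)     # no 1 from position 4 on - raises ValueError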

I did one further optimization by caching the bound methods for r.index and 
indices.append. This does the method lookup just once instead of once each 
time through the loop and squeezes a little more time out. This version 
takes 0.002617 secs:
def findSmallestNumber7(r):
     smallest = min(r)
     indices = []
     i = -1

     indexFn = r.index
     appendFn = indices.append

     try:
         while True:
             i = indexFn(smallest, i+1)
             appendFn(i)
     except ValueError:
         pass

     return indices
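
A quick sanity check on a small list you can verify by hand (the test 
program below makes the same kind of check against version 1 on the real 
data):

>>> findSmallestNumber3([3, 1, 2, 1])
[1, 3]
>>> findSmallestNumber7([3, 1, 2, 1])
[1, 3]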


That's all for me. Here is the complete output from one run of the program, 
followed by the complete program so you can try it yourself with your own data.

Enjoy!
Kent


Testing with 1759 items in list
findSmallestNumber1: 9.491670 secs
findSmallestNumber2: 0.009332 secs
findSmallestNumber3: 0.005174 secs
findSmallestNumber4: 0.016333 secs
findSmallestNumber5: 0.007563 secs
findSmallestNumber6: 0.002769 secs
findSmallestNumber7: 0.002617 secs


#####################################################
''' Test various ways to find the indices of the minimum values of a list '''

import timeit

# Use some text as the data
data = open('index.html').read()
data = [ ord(c) for c in data ]

print 'Testing with %d items in list' % len(data)


# Original version is O(n^2) because of use of reduce in the loop
def findSmallestNumber1(r):
     return [i for i in range(len(r)) if r[i] == reduce( min, r ) ]


# This version takes the computation of min out of the loop
def findSmallestNumber2(r):
     s = reduce( min, r )
     return [i for i in range(len(r)) if r[i] == s ]


# There is no need to use reduce; min can take a list directly
def findSmallestNumber3(r):
     s = min(r)
     return [i for i in range(len(r)) if r[i] == s ]


# Alternate suggestion makes a single pass through the list
def findSmallestNumber4( r ):
     indices = []
     for index in range(len(r)):
         if len(indices) == 0 or r[index] < r[ indices[0] ]:
             indices = [index]
         elif r[index] == r[ indices[0] ]:
             indices.append( index )
     return indices


# Rework the above to remove an extra test and several redundant calculations
def findSmallestNumber5( r ):
     if not r: return []

     indices = [0]
     smallest = r[0]

     for index in range(1, len(r)):
         val = r[index]
         if val < smallest:
             indices = [index]
             smallest = val
         elif val == smallest:
             indices.append( index )
     return indices


# This version puts the search for indices into C code by using list.index
def findSmallestNumber6(r):
     smallest = min(r)
     indices = []
     i = -1

     try:
         while True:
             i = r.index(smallest, i+1)
             indices.append(i)
     except ValueError:
         pass

     return indices


# Eliminating method lookups gives a slight speedup
def findSmallestNumber7(r):
     smallest = min(r)
     indices = []
     i = -1

     indexFn = r.index
     appendFn = indices.append

     try:
         while True:
             i = indexFn(smallest, i+1)
             appendFn(i)
     except ValueError:
         pass

     return indices


correctAnswer = findSmallestNumber1(data)
reps = 10 # number of reps in timeit

def timeOne(fn):
     # First run the function and check that it gets the correct results
     actualAnswer = fn(data)
     if actualAnswer != correctAnswer:
         print fn.__name__, 'does not give the correct answer'
         return

     # Now time it
     setup = "from __main__ import data," + fn.__name__
     stmt = fn.__name__ + '(data)'

     t = timeit.Timer(stmt, setup)
     secs = t.timeit(reps)
     print '%s: %f secs' % (fn.__name__, secs)


fnsToTest = [
     findSmallestNumber1,
     findSmallestNumber2,
     findSmallestNumber3,
     findSmallestNumber4,
     findSmallestNumber5,
     findSmallestNumber6,
     findSmallestNumber7,
]

for fn in fnsToTest:
     timeOne(fn)

###########################################################
At 09:28 AM 7/29/2004 -0400, Blake Winton wrote:
>Hee-Seng Kye wrote:
>>Hi, I was reading you e-mail again the other day, and I realized that I 
>>wanted to ask you about efficiency of function.
>
>Sure.  (I've renamed the functions so that I can talk about them a little 
>easier.)  (And I think I'm going to Cc: the tutor list, since re-reading 
>my email, I'm touching on some points that other people might find 
>useful...  And hey, maybe the other tutors will time some of these 
>functions, and let us know which actually is faster, and why one might be 
>faster than another.)
>
>>You said the function below is much less efficient though shorter, and I 
>>discovered that the one below is significantly slower, as you said.  Why 
>>is that?  I thought list comprehension is usually faster than 'for' 
>>loops.  Is it because the one below uses 'reduce' function?  The function 
>>above is even appending to an empty list, which I thought slow things 
>>down, and yet much faster.
>
>Well, we can always rewrite the list comprehension as a for loop, which
>will show us where the inefficiency lies.
>
>So this:
> >> >>> def findSmallestNumber1(r):
> >> ...   return [i for i in range(len(r)) if r[i] == reduce( min, r ) ]
>
>turns into:
> >>> def findSmallestNumber2(r):
>...   retval = []
>...   for i in range( len( r ) ):
>...     if r[i] == reduce( min, r ):
>...       retval.append(i)
>...   return retval
>
>The for-loop version of this will be a little slower than the 
>list-comprehension version, but they're both doing much more work than the 
>first version I sent you.  How are they doing more work, you ask? Well, 
>the reduce function has to go through every element of the list every time 
>it's called to find the minimum.  ("reduce(min,r)" basically finds the 
>smallest element in the list "r".)
>
>So the code in findSmallestNumber1 and 2 goes through the whole list once 
>(the "for i in range( len( r ) ):" loop), and then "n" more times (the 
>"reduce( min, r )" call), where "n" is the number of elements in the 
>loop.  If we think of going through the loop once as "n", then this code 
>will go through it "n + n*n" (or "n + n^2", or "n^2 + n") times, which 
>I'll write as O(n^2 + n).  (Do you know Big-O Notation?  Maybe you call it 
>"The Order of a function".  It's a way of describing how fast your code 
>will run, given large inputs.  I'll show you another example below.  If 
>you're impatient, googling for "Big-O Notation" gives you a lot of hits, 
>all of which seem to be on-topic.)
>
>Now, calling reduce runs through the list at the speed of C, which is 
>faster than doing it at the speed of Python, but it's still going to be 
>quite slow if your loop is large.  And even worse, the minimum number 
>isn't going to change between one run and the next, so we should really 
>only calculate it once, and store it in a variable, which we can call, say "s".
>
>And that's essentially what this code does:
> >> >>> def findSmallestNumber3(r):
> >> ...   s = reduce( min, r )
> >> ...   return [i for i in range(len(r)) if r[i] == s ]
>
>or its for-loop version:
> >>> def findSmallestNumber3(r):
>...   retval = []
>...   s = reduce( min, r )
>...   for i in range(len(r)):
>...     if r[i] == s:
>...       retval.append(i)
>...   return retval
>
>This will run through the loop once (the "reduce( min, r )" call), and 
>then once more (the "for i in range(len(r)):" loop), for a total of 2 
>times, which I'm going to write as O(2n).
>
>>When you say the one below is a little more efficient than the above, do 
>>you mean that the one below is more clear to read?  Or do you mean the 
>>one below runs a little faster?  I don't understand how the one above is 
>>different from the one below, except for a matter of notation.
>
>It should run a little faster.  Perhaps not a lot faster, perhaps a lot 
>faster.  Intuitively, I think it would depend on the length of the 
>list.  If you time it, let me know.
>
>And do you understand why it should run faster?
>
>And let's go back to the first example:
> >> >>> def findSmallestNumber0( r ):
> >> ...   indices = []
> >> ...   for index in range(len(r)):
> >> ...     if len(indices) == 0 or r[index] < r[ indices[0] ]:
> >> ...       indices = [index]
> >> ...     elif r[index] == r[ indices[0] ]:
> >> ...       indices.append( index )
> >> ...   return indices
>
>So, this runs through the loop once (the "for index in range(len(r)):" 
>loop), and...  That's all.  So it will be O(n), and therefore should be 
>the fastest of all.  (It might not be, but that depends on some other 
>factors, which I could list out for you if you want.)
>
>>Thanks again.  As I'm new to programming, I thought that shorter codes 
>>run faster than longer ones (though could be less easier to read), but 
>>the two versions of your function alerted me that it's not necessarily true.
>
>I hope I didn't confuse you too much, with all this theoretical stuff ;)
>Figuring out how quickly a program will run is a very tricky business, 
>which is why you have to time stuff before and after any change if you're 
>trying to improve performance.  (I've spent 12 years writing programs 
>professionally, (and probably another 5 before that writing them in 
>school,) so I've developed a sense of what will run fast, and what 
>won't.  You will too, over time.  And in the meantime, using Big-O 
>notation can give you a hint as to what might run faster.
>
>O(n^2 + n) is slower than O(2n) is slower than O(n).
>
>Later,
>Blake.
>_______________________________________________
>Tutor maillist  -  Tutor at python.org
>http://mail.python.org/mailman/listinfo/tutor


