[Tutor] Efficiency?

Thu Jul 29 15:28:20 CEST 2004

Hee-Seng Kye wrote:
> Hi, I was reading you e-mail again the other day, and I realized that I 
> wanted to ask you about efficiency of function.

Sure.  (I've renamed the functions so that I can talk about them a 
little easier.)  (And I think I'm going to Cc: the tutor list, since 
re-reading my email, I'm touching on some points that other people might 
find useful...  And hey, maybe the other tutors will time some of these 
functions, and let us know which actually is faster, and why one might 
be faster than another.)

> You said the function below is much less efficient though shorter, and I 
> discovered that the one below is significantly slower, as you said.  Why 
> is that?  I thought list comprehension is usually faster than 'for' 
> loops.  Is it because the one below uses 'reduce' function?  The 
> function above is even appending to an empty list, which I thought slow 
> things down, and yet much faster.

Well, we can always rewrite the list comprehension as a for loop, which
will show us where the inefficiency lies.

So this:
 >> >>> def findSmallestNumber1(r):
 >> ...   return [i for i in range(len(r)) if r[i] == reduce( min, r ) ]

turns into:
 >>> def findSmallestNumber2(r):
...   retval = []
...   for i in range( len( r ) ):
...     if r[i] == reduce( min, r ):
...       retval.append(i)
...   return retval

The for-loop version of this will be a little slower than the 
list-comprehension version, but they're both doing much more work than 
the first version I sent you.  How are they doing more work, you ask? 
Well, the reduce function has to go through every element of the list 
every time it's called to find the minimum.  ("reduce(min,r)" basically 
finds the smallest element in the list "r".)

So the code in findSmallestNumber1 and 2 goes through the whole list 
once (the "for i in range( len( r ) ):" loop), and then "n" more times 
(the "reduce( min, r )" call), where "n" is the number of elements in 
the loop.  If we think of going through the loop once as "n", then this 
code will go through it "n + n*n" (or "n + n^2", or "n^2 + n") times, 
which I'll write as O(n^2 + n).  (Do you know Big-O Notation?  Maybe you 
call it "The Order of a function".  It's a way of describing how fast 
your code will run, given large inputs.  I'll show you another example 
below.  If you're impatient, googling for "Big-O Notation" gives you a 
lot of hits, all of which seem to be on-topic.)

Now, calling reduce runs through the list at the speed of C, which is 
faster than doing it at the speed of Python, but it's still going to be 
quite slow if your loop is large.  And even worse, the minimum number 
isn't going to change between one run and the next, so we should really 
only calculate it once, and store it in a variable, which we can call, 
say "s".

And that's essentially what this code does:
 >> >>> def findSmallestNumber3(r):
 >> ...   s = reduce( min, r )
 >> ...   return [i for i in range(len(r)) if r[i] == s ]

or its for-loop version:
 >>> def findSmallestNumber3(r):
...   retval = ""
...   s = reduce( min, r )
...   for i in range(len(r)):
...     if r[i] == s:
...       retval.append(i)
...   return retval

This will run through the loop once (the "reduce( min, r )" call), and 
then once more (the "for i in range(len(r)):" loop), for a total of 2 
times, which I'm going to write as O(2n).

> When you say the one below is a little more efficient than the above, do 
> you mean that the one below is more clear to read?  Or do you mean the 
> one below runs a little faster?  I don't understand how the one above is 
> different from the one below, except for a matter of notation.

It should run a little faster.  Perhaps not a lot faster, perhaps a lot 
faster.  Intuitively, I think it would depend on the length of the list. 
  If you time it, let me know.

And do you understand why it should run faster?

And let's go back to the first example:
 >> >>> def findSmallestNumber0( r ):
 >> ...   indices = []
 >> ...   for index in range(len(r)):
 >> ...     if len(indices) == 0 or r[index] < r[ indices[0] ]:
 >> ...       indices = [index]
 >> ...     elif r[index] == r[ indices[0] ]:
 >> ...       indices.append( index )
 >> ...   return indices

So, this runs through the loop once (the "for index in range(len(r)):" 
loop), and...  That's all.  So it will be O(n), and therefore should be 
the fastest of all.  (It might not be, but that depends on some other 
factors, which I could list out for you if you want.)

> Thanks again.  As I'm new to programming, I thought that shorter codes 
> run faster than longer ones (though could be less easier to read), but 
> the two versions of your function alerted me that it's not necessarily 
> true.

I hope I didn't confuse you too much, with all this theoretical stuff ;)
Figuring out how quickly a program will run is a very tricky business, 
which is why you have to time stuff before and after any change if 
you're trying to improve performance.  (I've spent 12 years writing 
programs professionally, (and probably another 5 before that writing 
them in school,) so I've developed a sense of what will run fast, and 
what won't.  You will too, over time.  And in the meantime, using Big-O 
notation can give you a hint as to what might run faster.

O(n^2 + n) is slower than O(2n) is slower than O(n).

Later,
Blake.