[Tutor] could somebody please explain...

Steven D'Aprano steve at pearwood.info
Fri Oct 3 13:41:26 CEST 2014


On Wed, Oct 01, 2014 at 09:43:29AM -0700, Clayton Kirkwood wrote:

> In an effort to learn and teach, I present a simple program which measures
> the time it takes to run the various add functions, with the results appended:

Well done for making the effort! Now I'm going to tell you all the 
things you've done wrong! Sorry.

But seriously, I am very pleased to see you making the effort to develop 
this on your own, but *accurately* timing fast-running code on modern 
computers is very tricky.

The problem is, when you run some code, it isn't the only program 
running! The operating system is running, and these days all computers 
are multi-tasking, which means that anything up to hundreds of other 
programs could be running at the same time. At any one instant, 
most of them will be idle, doing nothing, but there's no way to be 
sure.

Furthermore, there are now complexities with CPU caches. Running a bit 
of code will be much slower the first time, since it is not yet in the 
CPU cache. And if the code is too big, it won't fit in the cache at all. 

The end result is that when you time how long a piece of code takes to 
run, there will always be two components:

- the actual time taken for your code to run;

- random "noise" caused by CPU cache effects, other processes running, 
the operating system, your anti-virus suddenly starting a scan in the 
middle of the run, etc.

The noise can be quite considerable, possibly a few seconds. Now 
obviously if your code took ten minutes to run, then a few seconds 
either way is no big deal. But imagine that your timing test 
says that it took 2 seconds. That could mean:

- 0.001 seconds for your code, and 1.999 seconds worth of noise;

- 1.999 seconds for your code, and 0.001 seconds worth of noise;

- or anything in between.

That measurement is clearly quite useless.

Does this mean that timing Python code is impossible? No, not really, 
but you have to do it carefully. The best way is to use Python's 
"timeit" module, which is carefully crafted to be as accurate as 
possible. First I'll show some results with timeit, then follow up with 
a second post explaining what you can do to be (nearly) as accurate.

I'm going to compare four different ways of adding two numbers:

(1) Using the + operator

(2) Using operator.add

(3) Using operator.__add__

(4) Using a hand-written function, made with lambda
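
To make it concrete, here's roughly what those four spellings look like 
in code (just a sketch; all four compute exactly the same sum):

import operator

add = lambda a, b: a + b       # (4) a hand-written function, made with lambda

x = 1
y = 2
print(x + y)                   # (1) the + operator
print(operator.add(x, y))      # (2) operator.add
print(operator.__add__(x, y))  # (3) operator.__add__
print(add(x, y))               # (4) the lambda
# all four print 3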


Here's the plus operator: from the command shell, I tell Python to use 
the timeit module to time some code. I give it some setup code to 
initialise two variables, then I time adding them together:

[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" "x + y"
10000000 loops, best of 3: 0.0971 usec per loop
[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" "x + y"
10000000 loops, best of 3: 0.0963 usec per loop


So timeit measures how long it takes to run "x + y" ten million times. 
It does that three times, and picks the fastest of the three. The 
fastest will have the least amount of noise. I ran it twice, and the two 
results are fairly close: 0.0971 microseconds, and 0.0963 microseconds.
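
You can take roughly the same measurement from inside the interpreter 
with timeit.repeat. This is only a sketch of what the command-line tool 
does (for one thing, the tool picks the number of loops automatically, 
while here I hard-code the ten million):

import timeit

# Run "x + y" ten million times, repeat that three times, keep the best.
results = timeit.repeat("x + y", setup="x = 1; y = 2",
                        repeat=3, number=10000000)
best = min(results) / 10000000   # seconds per single "x + y"
print("%.4f usec per loop" % (best * 1e6))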


[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.add(x, y)"
1000000 loops, best of 3: 0.369 usec per loop
[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.add(x, y)"
1000000 loops, best of 3: 0.317 usec per loop


This time I use operator.add, and get a time of about 0.3 microseconds 
per loop. So operator.add is about three times slower than the + operator.


[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.__add__(x, y)"
1000000 loops, best of 3: 0.296 usec per loop
[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.__add__(x, y)"
1000000 loops, best of 3: 0.383 usec per loop

This time I use operator.__add__, and get about the same result as 
operator.add. You can see the variability in the results: 0.296 to 0.383 
microseconds, a variation of about 30%.


[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "add = lambda a,b: a+b" "add(x, y)"
1000000 loops, best of 3: 0.296 usec per loop
[steve at ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "add = lambda a,b: a+b" "add(x, y)"
1000000 loops, best of 3: 0.325 usec per loop

Finally, I try it with a hand-made function using lambda, and I get 
about the same 0.3 microseconds again, with considerable variability.

Of course, the results you get on your computer may be completely 
different.
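
If you would rather run the whole comparison from a script instead of 
the shell, something along these lines should do the job (again, just a 
sketch, and your numbers will certainly differ):

import timeit

setup = "import operator; add = lambda a, b: a + b; x = 1; y = 2"
statements = ["x + y",
              "operator.add(x, y)",
              "operator.__add__(x, y)",
              "add(x, y)"]

for stmt in statements:
    # best of three runs of a million loops each, reported per loop
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=1000000))
    per_loop = best / 1000000          # seconds per single statement
    print("%-25s %.4f usec per loop" % (stmt, per_loop * 1e6))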



More to follow...




-- 
Steven
