[Tutor] Astonishing timing result

Wed Jun 25 23:35:16 CEST 2008

On Wed, 2008-06-25 at 15:53 -0400, Kent Johnson wrote:
> On Wed, Jun 25, 2008 at 2:06 PM, Lie Ryan <lie.1296 at gmail.com> wrote:
> > On Wed, 2008-06-25 at 12:56 -0400, Kent Johnson wrote:
> >> On Wed, Jun 25, 2008 at 12:05 PM, Lie Ryan <lie.1296 at gmail.com> wrote:
> >>
> >> >    t_a = min(t_A, t_a)
> >> >    t_b = min(t_A, t_b)
> >> >    t_c = min(t_A, t_c)
> >> >    t_d = min(t_A, t_d)
> >>
> >> What is this for? It should at least be t_B, t_C, t_D.
> >
> > A common pitfall in benchmarking is averaging the benchmark result. That
> > is WRONG, FLAT WRONG.
> 
> Yes, I agree. I missed the outer loop that this is in. But your code
> is still WRONG, FLAT WRONG!
>  t_b = min( *** t_A ***, t_b) // should be t_B, etc.

Ah, yes sorry, a slip of hand when copying the code.

The corrected timing.

Outer loop: 10x
Inner Loop: 5000000x
    per Innerloop   Overall
a | 1.05028605461 | 10.6743688583
b | 2.21457099915 | 22.3394482136
c | 3.53437685966 | 35.6701157093
d | 2.5965359211  | 26.1492891312

Overall Running Time: 94.8337771893

Well, it's obvious that direct-method is the fastest, simply because it
bypasses function calling and module name look up overhead. Method C
(module) is the slowest because the name look up is done twice, the
module's name then the function's name inside the module, then function
calling. But anyway, considering that this overhead of (3.5 - 1 = 2.5
second) is accumulated over 5 000 000 iteration, it is silly to to use
method-a (not using function and method) for reason of speed. The
difference between method a and c is 2.5 second / 5 000 000 = 0.0000005
second = 0.5 microsecond. (DISCLAIMER: Timing is valid on my machine
only)

Sure, at a glance, it seems that the saving is good enough 1:3.5, that's
28.6% a saving of 71.4%, but remember that most functions are much more
complex than this 'a = 1', to put it into perspective:

a = n**2
Outer loop: 10x
Inner Loop: 5000000x
a | 2.1795668602 | 21.9916498661
b | 3.4880130291 | 35.1593179703
c | 4.97427606583 | 50.6705505848
d | 3.84812307358 | 39.1990897655

time: 43%, saving 57%

'a = math.sqrt(n ** 2 + n ** 2)'
'print 1'
Outer loop: 10x
Inner Loop: 50000x
a | 0.805603027344 | 8.24900960922
b | 0.921233177185 | 9.31604623795
c | 1.03809094429 | 10.4301710129
d | 0.956300973892 | 9.58661794662
Total Time:  37.582244873

time: 78%, saving: 22%

'print 1'
Outer loop: 10x
Inner Loop: 50000x
    per Innerloop   Overall
a | 0.573838949203 | 6.04536104202
b | 0.578473091125 | 6.05607891083
c | 0.579005002975 | 6.08867025375
d | 0.570523023605 | 5.93990397453

Negligible.

So unless your function is extremely simple like 'a = 1', there is no
benefit of avoiding function/methods. A single print statement (print is
a very slow function/statement) would immediately nullify the speed
gain. Even an intermediate complexity function would make the saving
useless, and to think about it, I think nobody would make 'a = 1' to be
a function right? 

> >> > ## OUTPUT
> >> > # 1.02956604958
> >> > # 1.02956604958
> >> > # 1.02956604958
> >> > # 1.02956604958
> >>
> >> It's *very easy* to write bogus timing tests, as this thread
> >> demonstrates. Some protections:
> >> - when comparing different implementations of a function, make sure
> >> each implementation returns the correct result by checking the return
> >> value.
> 
> > Actually the timing is all equal because of the timer's resolution. I
> > don't have a high-precision timer on hand.
> 
> Or maybe they are all equal because they are all t_A...
> 
> Kent