is there better 32 clock() timing?

Stephen Kellett snail at objmedia.demon.co.uk
Wed Jan 26 08:59:24 EST 2005


In message <41f746c3.1846865602 at news.oz.net>, Bengt Richter
<bokr at oz.net> writes
>>QueryPerformanceCounter is 47 times slower to call than clock() on my
>>1Ghz Athlon.
>That really makes me wonder. Perhaps the Athlon handles RDTSC by way of
>an illegal instruction trap and faking the pentium instruction?

No. The Athlon implements it correctly. If it didn't, you'd end up in
the debugger with an illegal instruction trap, and you don't. Also, my
stats below show that the Athlon's RDTSC is faster than the Pentium's,
which I doubt you'd get if it were being faked. Taking it further: the
test to see if a processor supports RDTSC is to wrap it in an exception
handler and execute the instruction. If it isn't supported, you end up
in the __except part of the SEH handler.

QueryPerformanceCounter and RDTSC are not the same thing.
QueryPerformanceCounter talks to hardware to get its results. I imagine
that differences in performance for QueryPerformanceCounter are down to
how the HAL talks to the hardware and can't be blamed on the processor
or manufacturer.

clock() gets its info from the OS at (I imagine) the same granularity as
the NT scheduler. Some systems schedule at 10ms/11ms, others at about
6ms or 7ms. I think this is to do with single/dual processors, but I'm
unsure as I don't have a dual processor box. If you call
GetThreadTimes() you will find the values returned approximately match
the clock() values, which is why I think they are related.
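
One way to see this kind of granularity from Python is to spin on a
clock until its reading changes and record the smallest step. This is
only a sketch, not the test program used for the numbers below. It
assumes Python 3, where time.clock() no longer exists (it was removed in
3.8), so time.process_time() and friends stand in for the C clock(). The
helper name observed_tick is made up for illustration.

```python
import time

def observed_tick(clock, samples=5):
    """Return the smallest step seen between two distinct readings of clock()."""
    smallest = float("inf")
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:          # spin until the reading changes
            now = clock()
        smallest = min(smallest, now - start)
    return smallest

if __name__ == "__main__":
    for name in ("time", "process_time", "perf_counter"):
        step = observed_tick(getattr(time, name))
        reported = time.get_clock_info(name).resolution
        print(f"{name:13s} observed step {step * 1e3:.6f} ms, "
              f"reported resolution {reported * 1e3:.6f} ms")
```

On a system where a clock is driven by the scheduler tick, the observed
step for that clock should come out in the millisecond range rather than
the microsecond range.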

I've just run some tests using the same old program. QPC is
QueryPerformanceCounter. QPF is QueryPerformanceFrequency. I've included
the QPC/QPF column to show the timings in seconds.

1GHz Athlon, Windows XP SP2, 1,000,000 iterations
                           QPC        QPC/QPF (seconds)
QueryPerformanceCounter    7156984    5.998233
GetThreadTimes              503277    0.421794
RDTSC                       103430    0.086684
clock()                     148909    0.124800

850MHz Pentium III, W2K, 1,000,000 iterations
                           QPC        QPC/QPF (seconds)
QueryPerformanceCounter    5652161    1.579017
GetThreadTimes             3608976    1.008222
RDTSC                       842950    0.235491
clock()                     699840    0.195511
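
The seconds column is just the tick count divided by the counter
frequency, so the two columns can be used to back out the QPF value on
each machine and the cost of a single call. The sketch below does that
arithmetic using only the QueryPerformanceCounter rows from the tables
above; the function names are made up for illustration.

```python
ITERATIONS = 1_000_000   # iterations per test, as stated in the tables

def implied_frequency(qpc_ticks, seconds):
    """QPF in Hz implied by a (QPC ticks, seconds) pair: seconds = ticks / QPF."""
    return qpc_ticks / seconds

def per_call_us(seconds, iterations=ITERATIONS):
    """Average cost of one call in microseconds."""
    return seconds / iterations * 1e6

if __name__ == "__main__":
    rows = {
        "Athlon QueryPerformanceCounter":      (7156984, 5.998233),
        "Pentium III QueryPerformanceCounter": (5652161, 1.579017),
    }
    for name, (ticks, secs) in rows.items():
        print(f"{name}: QPF ~ {implied_frequency(ticks, secs):.0f} Hz, "
              f"{per_call_us(secs):.3f} us per call")
```

The Athlon figure comes out very close to the 1193182 Hz timer frequency
quoted further down, and the Pentium III figure is about three times
that, which fits the point that QPC reads different hardware on
different machines.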

The results surprise me: on the Pentium III, clock() takes less time to
execute than RDTSC!

It surprises me that the 850MHz Pentium III QPC is faster than the 1GHz
Athlon QPC, but whichever way you slice it, QPC is massively slower than
RDTSC or clock(). Also surprising is that GetThreadTimes() on the W2K
Pentium III is so slow compared to GetThreadTimes() on the XP Athlon.
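
For anyone wanting to repeat this sort of comparison from Python, a
rough equivalent is to call each clock in a loop and divide the elapsed
time by the iteration count. This is a sketch under assumptions, not the
program that produced the tables: it times Python's own clock functions
(which wrap different OS calls on different platforms), and the figure
it reports includes interpreter loop overhead. The name cost_per_call is
hypothetical.

```python
import time

def cost_per_call(clock, iterations=1_000_000):
    """Average wall-clock seconds per call to clock(), loop overhead included."""
    start = time.perf_counter()
    for _ in range(iterations):
        clock()
    return (time.perf_counter() - start) / iterations

if __name__ == "__main__":
    for name in ("perf_counter", "process_time", "thread_time", "time"):
        seconds = cost_per_call(getattr(time, name))
        print(f"{name:13s} {seconds * 1e6:.3f} us per call")
```

Only the relative ordering is meaningful here, since the loop overhead
is the same for every clock being measured.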

>of the timer chip that drives the old 55ms clock that came from IBM
>using cheap TV crystal based oscillators instead of defining an
>OS-implementer-friendly time base, I think. The frequency was nominally
>1193182 hz I believe. Obviously the OS didn't get interrupted that often,
>but if you divide by 2**16, you get the traditional OS tick of ~55ms:

I thought that was the Windows 9x way of doing things. You get the
49-day wrap-around with this one, I think.
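
Both numbers are easy to check. The sketch below assumes the 49-day
figure refers to a 32-bit counter of milliseconds (the kind
GetTickCount() returns); that reading is my assumption, not something
stated above. Note that the wrap period depends only on the counter
width and unit, not on the 55ms tick.

```python
PIT_HZ = 1193182                      # nominal timer frequency quoted above

# Dividing the timer clock by 2**16 gives the traditional tick period.
tick_seconds = 2**16 / PIT_HZ

# Assumption: a 32-bit counter that counts milliseconds.
wrap_days = 2**32 / 1000 / 86400      # ms -> seconds -> days

if __name__ == "__main__":
    print(f"tick period: {tick_seconds * 1e3:.3f} ms")
    print(f"32-bit millisecond counter wraps after {wrap_days:.2f} days")
```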

>you can't expect to control ignition of a racing engine reliably with
>an ordinary windows based program ;-)

...and Schumacher is in the lead, oh look! The Ferrari has blue
screened. The new regulations to reduce speeds in F1 are working, that
has really slowed him down...

Stephen
-- 
Stephen Kellett
Object Media Limited    http://www.objmedia.demon.co.uk
RSI Information:        http://www.objmedia.demon.co.uk/rsi.html


