GetTickCount vs. clock for msec resolution

Bengt Richter bokr at accessone.com
Wed Apr 25 04:47:33 EDT 2001


On Sat, 21 Apr 2001 03:06:24 GMT, "Dan Maas" <dmaas at nospam.dcine.com>
wrote:
[...]
>On x86 Linux, time.time() reads the Pentium TSC (clock cycle counter), so
>you get "wall-clock" time with a resolution of a few microseconds. (I put
>"wall clock" in quotes because on SMP machines or systems that vary the
>clock speed, you won't get a meaningful measurement...)
>
>For profiling purposes, time.clock() is too coarse unless you run your
>code for a LONG time. Personally I just use time.time(), and make sure
>that Python is the only thing running on my system...
To get accurate minimum times using TSC-based timing, time a code
snippet in a loop and note how the results cluster: if the snippet
is quite short, it will execute most of the time without being
interrupted (unless, of course, you're waiting for an
interrupt-related event in the thing you're timing). If you list
your results, you'll get something like the following (a Python
sketch appears after the list):
1) fastest:	the shortest path through your code (path lengths vary
		if there's logic, or data-dependent timing such as
		multiply by zero vs. nonzero), with instructions in
		cache because of the previous loop iteration.

2)		the shortest path, plus the effect of having to load
		the cache.

3)		the next-shortest path, as in 1 or 2 above, etc.

4)		one of the above, plus the effect of processing an
		interrupt (usually the timer tick, but maybe the
		mouse, etc.; it varies with the driver code).

5)		one of the above, plus a relatively long time for a
		preemptive context switch to something else. Maybe
		something decides it's time to flush buffers, or
		whatnot, or maybe you didn't really shut down all the
		other applications. No matter; by separating the
		results into these clusters, you can ignore the
		extraneous times.
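
Here's a minimal sketch of the loop-and-cluster idea (the snippet
body and repeat count are placeholders; it uses time.time() as Dan
suggests, and on a modern Python you could substitute
time.perf_counter() for finer resolution):

import time

def snippet():
    # stand-in for the code under test
    x = 0
    for i in range(100):
        x += i
    return x

samples = []
for _ in range(1000):
    t0 = time.time()
    snippet()
    t1 = time.time()
    samples.append(t1 - t0)

samples.sort()                          # fast cluster first, long tail last
print("fastest:", samples[0])           # cases 1/2: your code, in cache
print("median :", samples[len(samples) // 2])
print("slowest:", samples[-1])          # cases 4/5: interrupts, switches

Sorting makes the clusters easy to pick out by eye; print the whole
sorted list if you want to see where the gaps between clusters fall.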

If you're interested in how long just your code takes, use results
from clusters 1 or 2 and throw out the rest. The other numbers can
be interesting too, though: timing a do-nothing loop shows how long
processing the various interrupts takes, and when they happen (see
the sketch below).
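
For instance, a do-nothing measurement might look like this (same
time.time() approach as above; the spike threshold is an arbitrary
choice):

import time

samples = []
for _ in range(100000):
    t0 = time.time()
    # nothing here: the interval is pure timing overhead, except
    # when an interrupt or a context switch lands between the calls
    t1 = time.time()
    samples.append(t1 - t0)

samples.sort()
overhead = samples[0]
# guard against a 0.0 minimum if the clock's resolution is coarse
spikes = [s for s in samples if s > 10 * max(overhead, 1e-7)]
print("timing overhead:", overhead)
print("spikes:", len(spikes), "worst:", samples[-1])

The spike count shows roughly how often something else stole the CPU
during the run, and the worst case shows how long it held on.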

BTW, when I was playing with the TSC, the counter ran at the full
clock rate of the processor, so even on a P90 the resolution was
1/90th of a microsecond. I'm not sure whether the current crop of
really fast processors counts clock cycles too. I got the impression
my PII-300 did, but I didn't actually compare it against real time.
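
The tick period is just the reciprocal of the clock rate, so a quick
sanity check (the frequencies here are only illustrative):

for mhz in (90, 300, 1000):
    print("%5d MHz -> %6.2f ns per tick" % (mhz, 1e3 / mhz))

At 90 MHz that works out to about 11.1 ns per tick, i.e. 1/90th of a
microsecond, matching the P90 figure above.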

Timings of short snippets that run entirely in the cache should be
very consistent and accurate.


