Martin v. Löwis wrote:
I did, and it does nothing of what I suggested. I am sure I can make the Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more precise and efficient.
Hmm. I'm skeptical that your code makes it more accurate, and I completely fail to see that it makes it more efficient (by what measurement of efficiency?)
Also, why would making it more accurate make it better? IIUC, accuracy is completely irrelevant here, though efficiency (low overhead) does matter.
This is the kind of code I was talking about, from ceval_gil.h:
r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);
I would turn on the multimedia timer (it is not on by default), and replace this call with a loop, approximately like this:

    for (;;) {
        r = WaitForMultipleObjects(2, objects, TRUE, 0);
        /* blah blah blah */
        QueryPerformanceCounter(&cnt);
        if (cnt > timeout) break;
        Sleep(0);
    }

And the timeout "milliseconds" would now be computed by querying the performance counter, instead of being measured unreliably by the Windows NT kernel.
Hmm. This creates a busy wait loop; if you add larger sleep values, then it loses accuracy.
Actually, a usleep looks like this, and the call to the wait function must go into the for loop. But no, it's not a busy sleep.

    #include <windows.h>  /* link with winmm.lib for timeBeginPeriod */

    static int inited = 0;
    static __int64 hz;
    static double dhz;
    const double sleep_granularity = 2.0E-3;  /* seconds */

    void usleep(long us)
    {
        __int64 cnt, end;
        double diff;
        if (!inited) {
            timeBeginPeriod(1);
            QueryPerformanceFrequency((LARGE_INTEGER*)&hz);
            dhz = (double)hz;
            inited = 1;
        }
        QueryPerformanceCounter((LARGE_INTEGER*)&cnt);
        end = cnt + (__int64)(1.0E-6 * (double)(us) * dhz);
        for (;;) {
            QueryPerformanceCounter((LARGE_INTEGER*)&cnt);
            if (cnt >= end) break;
            diff = (double)(end - cnt)/dhz;  /* seconds remaining */
            if (diff > sleep_granularity)
                /* Sleep() takes milliseconds; diff is in seconds */
                Sleep((DWORD)(1000.0 * (diff - sleep_granularity)));
            else
                Sleep(0);
        }
    }

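The unit conversions in that loop are easy to get wrong, so here is a rough, portable sketch of just the arithmetic, with the Windows calls left out. The helper names us_to_ticks and sleep_ms_for are made up for illustration and are not part of any patch; the sketch also uses integer math where the usleep code uses doubles, which is my substitution to keep the results exact.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a delay in microseconds to counter ticks.
   hz is the counter frequency in ticks per second (what
   QueryPerformanceFrequency would report). The split into whole seconds
   and a remainder avoids overflowing the intermediate product. */
static int64_t us_to_ticks(int64_t us, int64_t hz)
{
    assert(hz > 0 && us >= 0);
    return (us / 1000000) * hz + (us % 1000000) * hz / 1000000;
}

/* Hypothetical helper: given the current and final counter values,
   return the number of whole milliseconds to hand to Sleep().
   Returns 0 when the deadline has passed or when no more than
   granularity_us microseconds remain; that is the Sleep(0) branch,
   where the caller only yields and polls the counter again. */
static unsigned long sleep_ms_for(int64_t now, int64_t end,
                                  int64_t hz, int64_t granularity_us)
{
    int64_t ticks, remaining_us;

    assert(hz > 0 && granularity_us >= 0);
    if (now >= end)
        return 0;
    ticks = end - now;
    remaining_us = (ticks / hz) * 1000000 + (ticks % hz) * 1000000 / hz;
    if (remaining_us <= granularity_us)
        return 0;
    return (unsigned long)((remaining_us - granularity_us) / 1000);
}
```

With a 1 MHz counter and a 2 ms granularity, a deadline 50 ms away gives a 48 ms Sleep, and the last 2 ms are covered by Sleep(0) polling.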
Why not just call timeBeginPeriod, and then rely on the higher clock rate for WaitForMultipleObjects?
That is what I suggested when Antoine said 1-2 ms was enough.

Sturla