On 03.11.2012 18:22, Antoine Pitrou wrote:
With IOCP on Windows there is a thread-pool that continuously polls the i/o tasks for completion. So I think IOCPs might approach O(n) at some point.
Well, I don't know about the IOCP implementation, but "continuously polling the I/O tasks" sounds like a costly way to do it (what system call would that use?).
The polling uses the system call GetOverlappedResult; if the task is unfinished, it calls Sleep(0) to release the time-slice and then polls again.
Specifically, if the last argument to GetOverlappedResult (bWait) is FALSE and the return value is FALSE, we must call GetLastError to retrieve an error code. If GetLastError returns ERROR_IO_INCOMPLETE, we know that the task has not finished yet.
A bit more sophisticated: Put all these asynchronous i/o tasks in a fifo queue, and set up a thread-pool that pops tasks off the queue and polls them with GetOverlappedResult and GetLastError. A task that is unfinished goes back into the queue. If a task is complete, the thread that popped it off the queue executes a callback. A thread-pool that operates like this will reduce or prevent the excessive context switches in the kernel that multiple threads hammering on Sleep(0) would incur. Then invent a fancy name for this scheme, e.g. call it "I/O Completion Ports".
Then you notice that due to the queue, the latency is O(n), with n the number of pending i/o tasks in the "I/O Completion Port". To keep this from affecting the latency, you patch your program by setting up multiple "I/O Completion Ports", and reinvent the load balancer to distribute i/o tasks to multiple "ports". With a bit of work, the server will remain responsive and "rather scalable" as long as it is still i/o bound. The moment the number of i/o tasks makes the server go CPU bound, which will happen rather soon because of the way IOCPs operate, the computer overheats and goes up in smoke. And that is when the MBA manager starts to curse Windows as well, and finally agrees to use Linux or *BSD/Apple instead ;-)
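To make the queue scheme above concrete, here is a rough sketch in Python. It is only a simulation under stated assumptions: FakeTask and run_polling_queue are hypothetical names, FakeTask.poll() merely stands in for a GetOverlappedResult call with the wait flag set to FALSE, and a single loop replaces the thread-pool. No Windows API is called.

```python
from collections import deque


class FakeTask:
    """Hypothetical stand-in for one pending overlapped i/o request."""

    def __init__(self, name, polls_needed):
        self.name = name
        self.polls_left = polls_needed  # polls that will still report "incomplete"

    def poll(self):
        # Plays the role of GetOverlappedResult(..., FALSE):
        # False means "ERROR_IO_INCOMPLETE", True means the i/o has finished.
        if self.polls_left > 0:
            self.polls_left -= 1
            return False
        return True


def run_polling_queue(tasks, callback):
    """Pop tasks off a FIFO, poll each once, and requeue the unfinished ones."""
    queue = deque(tasks)
    total_polls = 0
    while queue:
        task = queue.popleft()
        total_polls += 1
        if task.poll():
            callback(task)      # finished: run the completion callback
        else:
            queue.append(task)  # unfinished: back to the end of the queue
    return total_polls


completed = []
polls = run_polling_queue(
    [FakeTask("a", 2), FakeTask("b", 0), FakeTask("c", 1)],
    lambda task: completed.append(task.name),
)
print(completed, polls)
```

In this sketch an unfinished task is only looked at again after every other queued task has had its turn, which is where the dependence on the number of pending tasks comes from.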
If the kernel cooperates, no continuous polling should be required.
My main problem with IOCPs is that they provide the "wrong" signal. They tell us when I/O is completed. But then the work is already done, and how did we know when to start?
The asynchronous i/o mechanisms select, poll, epoll, kqueue, /dev/poll, etc. do the opposite. They inform us when to start an i/o task, which makes more sense to me at least.
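As an illustration of the "ready to start" signal, here is a minimal sketch using Python's selectors module, which wraps select/poll/epoll/kqueue depending on the platform. The socketpair and the b"ping" payload are only assumptions made for the example.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
reader, writer = socket.socketpair()
reader.setblocking(False)
sel.register(reader, selectors.EVENT_READ)

# Nothing has been written yet, so the selector reports no ready sockets.
idle = sel.select(timeout=0)

writer.sendall(b"ping")

# Now the selector tells us that a read can start without blocking.
received = b""
for key, events in sel.select(timeout=1):
    received = key.fileobj.recv(4096)

sel.unregister(reader)
sel.close()
reader.close()
writer.close()
print(idle, received)
```

The read is only issued after the selector has said it can proceed, which is the opposite of being told afterwards that a read has completed.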
Typically, programs that use IOCP must invent their own means of signalling "i/o ready to start", which might kill any advantage of using IOCPs over simpler means (e.g. blocking i/o).
This, by the way, makes me wonder what Windows SUA does. It is OpenBSD-based. Does it have kqueue or /dev/poll? If so, there must be support for it in ntdll.dll, and we might use those functions instead of pesky IOCPs.