[Python-ideas] An alternate approach to async IO

Wed Nov 28 00:50:18 CET 2012

On 27 Nov 2012, at 18:49, Trent Nelson <trent at snakebite.org> wrote:

>     
>    Here's the "idea" I had, with zero working code to back it up:
>    what if we had a bunch of threads in the background whose sole
>    purpose it was to handle AIO?  On Windows/AIX, they would poll
>    GetQueuedCompletionStatus, on Solaris, get_event().
> 
>    They're literally raw pthreads and have absolutely nothing to
>    do with Python's threading.Thread() stuff.  They exist solely
>    in C and can't be interfaced to directly from Python code.
> 
>    ....which means they're free to run outside the GIL, and thus,
>    multiple cores could be leveraged concurrently.  (Only for
>    processing completed I/O, but hey, it's better than nothing.)

And herein lies the misunderstanding. 

A Python thread can do the same processing of completed I/O before it reacquires the GIL, and thus Python can run on multiple cores concurrently. There is no difference between a pthread and a threading.Thread that has released the GIL. You don't need to spawn a pthread to process data independently of the GIL. You just need to process the data before the GIL is reacquired.

In fact, I use Python threads for parallel computing all the time. They scale as well as OpenMP threads on multiple cores. Why? Because I have made sure the computational kernels (e.g. LAPACK functions) release the GIL before they execute, and the GIL is not reacquired until they are done. As long as the threads are running in C or Fortran land they don't need the GIL. I don't need to spawn pthreads or use OpenMP pragmas to create threads that can run freely on all cores. Python threads (threading.Thread) can do that too.
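
To make that concrete without pulling in LAPACK, here is a minimal sketch that uses hashlib from the standard library as a stand-in for the computational kernel. It assumes CPython, where hashlib is documented to release the GIL while hashing buffers larger than 2047 bytes. The helper name digest_chunks and the chunk sizes are made up for illustration; this is not the code I use in practice.

```python
import hashlib
import threading


def digest_chunks(chunks):
    """Hash each chunk in its own threading.Thread and return the hex digests in order.

    Hypothetical helper for illustration. hashlib stands in for any C or
    Fortran kernel that releases the GIL while it works.
    """
    results = [None] * len(chunks)

    def worker(index, data):
        # In CPython the hashing itself runs in C with the GIL released
        # (for buffers larger than 2047 bytes), so the workers can overlap.
        results[index] = hashlib.sha256(data).hexdigest()

    threads = [
        threading.Thread(target=worker, args=(i, chunk))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Four 8 MiB buffers, one per thread (sizes are arbitrary).
chunks = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]
parallel = digest_chunks(chunks)
serial = [hashlib.sha256(chunk).hexdigest() for chunk in chunks]
```

The threaded and serial results are identical; only the scheduling differs. How much wall-clock time the threaded version saves depends on the number of cores and on how long each call spends in C with the GIL released.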

Sturla