[Python-ideas] An alternate approach to async IO

Sturla Molden sturla at molden.no
Wed Nov 28 00:50:18 CET 2012


On 27 Nov 2012, at 18:49, Trent Nelson <trent at snakebite.org> wrote:

>     
>    Here's the "idea" I had, with zero working code to back it up:
>    what if we had a bunch of threads in the background whose sole
>    purpose it was to handle AIO?  On Windows/AIX, they would poll
>    GetQueuedCompletionStatus, on Solaris, get_event().
> 
>    They're literally raw pthreads and have absolutely nothing to
>    do with Python's threading.Thread() stuff.  They exist solely
>    in C and can't be interfaced to directly from Python code.
> 
>    ....which means they're free to run outside the GIL, and thus,
>    multiple cores could be leveraged concurrently.  (Only for
>    processing completed I/O, but hey, it's better than nothing.)


And herein lies the misunderstanding. 

A Python thread can do the same processing of completed I/O before it reacquires the GIL, and thus Python can run on multiple cores concurrently. There is no difference between a pthread and a threading.Thread that has released the GIL. You don't need to spawn a pthread to process data independently of the GIL. You just need to process the data before the GIL is reacquired.
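To make this concrete, here is a minimal sketch rather than a definitive implementation. It assumes CPython, where a blocking recv() releases the GIL and hashlib releases it while hashing buffers larger than 2047 bytes. The names read_and_digest and digest_over_sockets are hypothetical, and SHA-256 only stands in for whatever processing the completed I/O needs. Each ordinary threading.Thread waits for its data and then processes it, with no extra pthreads:

```python
import hashlib
import socket
import threading


def read_and_digest(sock, nbytes, results, index):
    """Receive nbytes from sock, then hash them, all in one Python thread."""
    buf = bytearray()
    while len(buf) < nbytes:
        # recv() blocks inside the OS; CPython releases the GIL meanwhile.
        chunk = sock.recv(min(65536, nbytes - len(buf)))
        if not chunk:
            break  # peer closed the connection early
        buf.extend(chunk)
    # Assumption: CPython's hashlib drops the GIL while hashing a buffer
    # this large, so several threads can hash on different cores.
    results[index] = hashlib.sha256(buf).hexdigest()


def digest_over_sockets(payloads):
    """Send each payload over its own socket pair; return the hex digests."""
    results = [None] * len(payloads)
    pairs = [socket.socketpair() for _ in payloads]
    threads = [
        threading.Thread(target=read_and_digest, args=(rx, len(p), results, i))
        for i, ((rx, _tx), p) in enumerate(zip(pairs, payloads))
    ]
    for t in threads:
        t.start()
    # The main thread plays the role of the remote peer.
    for (_rx, tx), p in zip(pairs, payloads):
        tx.sendall(p)
    for t in threads:
        t.join()
    for rx, tx in pairs:
        rx.close()
        tx.close()
    return results
```

Calling digest_over_sockets([b"a" * 2**20, b"b" * 2**20]) returns one SHA-256 hex digest per payload, in order.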

In fact, I use Python threads for parallel computing all the time. They scale as well as OpenMP threads on multiple cores. Why? Because I have made sure the computational kernels (e.g. LAPACK functions) release the GIL before they execute, and the GIL is not reacquired before they are done. As long as the threads are running in C or Fortran land they don't need the GIL. I don't need to spawn pthreads or use OpenMP pragmas to create threads that can run freely on all cores. Python threads (threading.Thread) can do that too.
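The same pattern can be sketched with nothing but the standard library. This is an illustration, not my actual LAPACK setup: zlib.compress stands in for a computational kernel, on the assumption that CPython releases the GIL around the underlying C compression call. The helper name compress_in_threads is hypothetical.

```python
import threading
import zlib


def compress_in_threads(blocks):
    """Compress each block in its own threading.Thread; return the results."""
    results = [None] * len(blocks)

    def worker(i):
        # Assumption: CPython releases the GIL inside zlib.compress, so the
        # C code for different blocks can run on different cores at once.
        results[i] = zlib.compress(blocks[i], 6)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(len(blocks))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

How well this scales depends on how much of each call is spent inside the C code; the Python-level bookkeeping around it still runs under the GIL.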



Sturla



