[Python-ideas] An alternate approach to async IO
sturla at molden.no
Wed Nov 28 00:50:18 CET 2012
On 27 Nov 2012, at 18:49, Trent Nelson <trent at snakebite.org> wrote:
> Here's the "idea" I had, with zero working code to back it up:
> what if we had a bunch of threads in the background whose sole
> purpose it was to handle AIO? On Windows/AIX, they would poll
> GetQueuedCompletionStatus, on Solaris, get_event().
> They're literally raw pthreads and have absolutely nothing to
> do with Python's threading.Thread() stuff. They exist solely
> in C and can't be interfaced to directly from Python code.
> ....which means they're free to run outside the GIL, and thus,
> multiple cores could be leveraged concurrently. (Only for
> processing completed I/O, but hey, it's better than nothing.)
And herein lies the misunderstanding.
A Python thread can do the same processing of completed I/O before it reacquires the GIL – and thus Python can run on multiple cores concurrently. There is no difference between a pthread and a threading.Thread that has released the GIL. You don't need to spawn a pthread to process data independently of the GIL. You just need to process the data before the GIL is reacquired.
In fact, I use Python threads for parallel computing all the time. They scale as well as OpenMP threads on multiple cores. Why? Because I have made sure the computational kernels (e.g. LAPACK functions) release the GIL before they execute – and the GIL is not reacquired before they are done. As long as the threads are running in C or Fortran land they don't need the GIL. I don't need to spawn pthreads or use OpenMP pragmas to create threads that can run freely on all cores. Python threads (threading.Thread) can do that too.
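
A minimal sketch of the same pattern using only the standard library, as a stand-in for the LAPACK case: in CPython, hashlib is documented to release the GIL while hashing large buffers, so ordinary threading.Thread objects can overlap that work on several cores. The helper names (digest_into, parallel_digests) and the buffer sizes are illustrative assumptions, not anything from the code discussed above.

```python
import hashlib
import threading


def digest_into(results, index, payload):
    # The hashing runs in C; CPython's hashlib releases the GIL for
    # large inputs, so other threads can run while this call is busy.
    results[index] = hashlib.sha256(payload).hexdigest()


def parallel_digests(payloads):
    # One plain threading.Thread per buffer; no raw pthreads needed.
    results = [None] * len(payloads)
    threads = [
        threading.Thread(target=digest_into, args=(results, i, p))
        for i, p in enumerate(payloads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Four distinct 4 MiB buffers, standing in for completed I/O data.
payloads = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]
digests = parallel_digests(payloads)
```

Each thread only holds the GIL briefly to enter the C call and to store its result; the bulk of the work happens without it. How much wall-clock speedup this gives depends on the machine and the Python build, so no timing is claimed here.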