[Python-ideas] Thread-safe generators
Nick Coghlan
ncoghlan at gmail.com
Sat Apr 15 05:45:16 EDT 2017
On 15 April 2017 at 02:47, Serhiy Storchaka <storchaka at gmail.com> wrote:
> When you use a generator from different threads, you can get a ValueError
> "generator already executing". Getting this exception within a single thread
> is a programming error, but in the case of different threads it could be
> possible to wait until the other thread finishes executing the generator. A
> generator can be made thread-safe by wrapping it in a class that acquires a
> lock before calling the generator's __next__ method (for example, see [1]).
> But this is not very efficient, of course.
>
> I am wondering if it is worth adding support for thread-safe generators to
> the stdlib, either by providing a standard decorator (written in C for
> efficiency) or by adding threading support directly to the generator object.
> The latter may require increasing the size of the generator object to hold a
> lock and a thread identifier (though maybe the GIL is enough), but it should
> not affect performance, since locking is used only when you encounter a
> generator running in another thread.
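
For reference, the lock-wrapping approach described in the quoted message can
be sketched roughly as follows. This is only an illustration, not the code
behind [1]: the names LockedIterator, count_up and consume are hypothetical,
and the wrapper simply serialises calls to next() with a threading.Lock.

```python
import threading


class LockedIterator:
    """Hypothetical wrapper: serialise next() calls on a shared iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator,
        # so the generator is never re-entered while it is executing.
        with self._lock:
            return next(self._it)


def count_up(n):
    """Example generator shared between threads."""
    for i in range(n):
        yield i


def consume(shared, out):
    """Pull items until the shared iterator is exhausted."""
    for item in shared:
        out.append(item)


shared = LockedIterator(count_up(1000))
results = []
threads = [threading.Thread(target=consume, args=(shared, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each item is handed to exactly one thread, but the order in which results are
appended is not deterministic.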
Allowing multiple worker threads to pull from the same work queue is a
general concurrency problem, and that's why we have queue.Queue in the
standard library: https://docs.python.org/3/library/queue.html
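
As a rough sketch of that pattern (the worker function, the run helper and the
sentinel convention are assumptions made for this example, not something
prescribed by the queue module), several threads can pull from one bounded
queue.Queue like this:

```python
import queue
import threading

SENTINEL = object()  # hypothetical marker telling a worker to stop


def worker(q, out):
    """Take items off the shared queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        out.append(item * item)


def run(items, num_workers=4):
    q = queue.Queue(maxsize=8)  # bounded, so the producer cannot run far ahead
    out = []
    workers = [threading.Thread(target=worker, args=(q, out))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in items:
        q.put(item)  # blocks while the queue is full
    for _ in workers:
        q.put(SENTINEL)  # one stop marker per worker
    for w in workers:
        w.join()
    return out


squares = run(range(20))
```

The queue does all of the locking, so the producer loop and the workers need
no explicit synchronisation of their own.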
So I'd be opposed to trying to make generator objects natively
thread-aware - as Stephen notes, the GIL is an implementation detail of
CPython, so it isn't OK to rely on it when defining changes to
language level semantics (in this case, whether or not it's OK to have
multiple threads all calling the same generator without some form of
external locking).
However, it may make sense to explore possible options for offering a
queue.AutoQueue type, where the queue always has a defined maximum
size (defaulting to 1), disallows explicit calls to put(), and
automatically populates itself based on an iterator supplied to the
constructor. Once the input iterator raises StopIteration, then the
queue will start reporting itself as being empty.
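
To make the idea concrete, here is one possible sketch. No such type exists in
the queue module; the use of a background feeder thread, the internal end
marker, and the choice to signal exhaustion by raising queue.Empty from get()
are all assumptions made for illustration only.

```python
import queue
import threading


class AutoQueue:
    """Hypothetical self-populating bounded queue fed from an iterator."""

    _DONE = object()  # internal marker: the input iterator is exhausted

    def __init__(self, iterable, maxsize=1):
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._queue = queue.Queue(maxsize)
        # Assumption: a daemon thread fills the queue in the background.
        self._feeder = threading.Thread(
            target=self._fill, args=(iter(iterable),), daemon=True)
        self._feeder.start()

    def _fill(self, iterator):
        try:
            for item in iterator:
                self._queue.put(item)  # blocks while the queue is full
        finally:
            # Always mark the end, even if the iterator raised, so that
            # consumers are not left blocked (the error is not propagated).
            self._queue.put(self._DONE)

    # Deliberately no put(): only the supplied iterator populates the queue.

    def get(self):
        item = self._queue.get()
        if item is self._DONE:
            self._queue.put(self._DONE)  # leave the marker for other consumers
            raise queue.Empty
        return item


def drain(aq, out):
    """Example consumer: collect items until the queue reports empty."""
    while True:
        try:
            out.append(aq.get())
        except queue.Empty:
            break


aq = AutoQueue(iter(range(100)))
collected = []
consumers = [threading.Thread(target=drain, args=(aq, collected))
             for _ in range(3)]
for t in consumers:
    t.start()
for t in consumers:
    t.join()
```

Only one thread ever advances the input iterator here, so the source does not
need to be thread-safe itself.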
The benefit of going down that path is that it can be used with
arbitrary iterators (not just generators), and can be more easily
generalised to other synchronisation models (such as multiprocessing).
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia