<div dir="ltr">This is a bad idea in the generator itself, as others have commented earlier in this thread.<div><br></div><div>From a cross-implementation perspective: in Jython, different threads can call next() on a non-running generator, <i>so long as they coordinate with each other externally to any use of the generator</i>, and this works fine.</div><div><br></div><div>But any reliance on gi_running, as seen here, can only be considered a best-effort aid in detecting such races; it does not come close to preventing one:<br><div><a href="https://github.com/jythontools/jython/blob/master/src/org/python/core/PyGenerator.java#L146">https://github.com/jythontools/jython/blob/master/src/org/python/core/PyGenerator.java#L146</a><br></div></div><div><br></div><div>(We don't even bother making gi_running volatile to get genuine test-and-set semantics, because it makes no sense to pretend the flag can prevent races; and why pay the performance penalty?)</div><div><br></div><div>The idea of putting generators behind a queue sounds reasonably workable; the semantics are then the right ones, although implementing this efficiently is the tricky part.<br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Apr 16, 2017 at 11:08 PM, Nick Coghlan <span dir="ltr"><<a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 17 April 2017 at 08:00, Paul Moore <<a href="mailto:p.f.moore@gmail.com">p.f.moore@gmail.com</a>> wrote:<br>
> On 15 April 2017 at 10:45, Nick Coghlan <<a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>> wrote:<br>
>> So I'd be opposed to trying to make generator objects natively thread<br>
>> aware - as Stephen notes, the GIL is an implementation detail of<br>
>> CPython, so it isn't OK to rely on it when defining changes to<br>
>> language level semantics (in this case, whether or not it's OK to have<br>
>> multiple threads all calling the same generator without some form of<br>
>> external locking).<br>
>><br>
>> However, it may make sense to explore possible options for offering a<br>
>> queue.AutoQueue type, where the queue always has a defined maximum<br>
>> size (defaulting to 1), disallows explicit calls to put(), and<br>
>> automatically populates itself based on an iterator supplied to the<br>
>> constructor. Once the input iterator raises StopIteration, the<br>
>> queue will start reporting itself as being empty.<br>
><br>
> +1 A generator that can have values pulled from it on different<br>
> threads sounds like a queue to me, so the AutoQueue class that wraps a<br>
> generator seems like a natural abstraction to work with. It also means<br>
> that the cost for thread safety is only paid by those applications<br>
> that need it.<br>
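For concreteness, the proposed queue.AutoQueue might be sketched roughly as follows. This is purely illustrative: AutoQueue does not exist in the stdlib, and rather than buffering up to maxsize ahead of time, this minimal version pulls one item from the iterator per get(), borrowing the consumer's thread as Nick describes below (maxsize is accepted only for API compatibility with the proposal):

```python
import queue
import threading

class AutoQueue:
    """Sketch of the hypothetical queue.AutoQueue: a queue that populates
    itself from an iterator, disallows explicit put(), and reports itself
    empty once the iterator is exhausted."""

    def __init__(self, iterable, maxsize=1):
        self._it = iter(iterable)
        self._lock = threading.Lock()  # serializes all access to the iterator
        self._exhausted = False

    def put(self, item):
        # The queue owns its own population; callers may only consume.
        raise TypeError("AutoQueue populates itself from its iterator")

    def get(self):
        # Borrow the consuming thread's execution time to advance the
        # iterator, instead of using a dedicated producer thread.
        with self._lock:
            if self._exhausted:
                raise queue.Empty
            try:
                return next(self._it)
            except StopIteration:
                self._exhausted = True
                raise queue.Empty from None

    def empty(self):
        with self._lock:
            return self._exhausted
```

With something like this, multiple threads could call get() against the same underlying generator without arranging any external locking of their own, which is exactly the cost-only-paid-by-those-who-need-it property Paul points out.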
<br>
</span>If someone did build something like this, it would be interesting to<br>
benchmark it against a more traditional producer thread model, where<br>
one thread is responsible for adding work items to the queue, while<br>
others are responsible for draining them.<br>
<br>
The trick is that an auto-queue would borrow execution time from the<br>
consumer threads when new values are needed, so you'd theoretically<br>
get fewer context switches between threads, but at the cost of<br>
changing the nature of the workload in a given thread, and hence<br>
messing with the working set of objects it has active.<br>
<br>
It may also pair well with the concurrent.futures.Executor model,<br>
which is already good for "go handle this predefined list of tasks",<br>
but currently less useful as a replacement for a message queue with a<br>
pool of workers.<br>
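The "predefined list of tasks" case that Executor already handles well looks like this (standard concurrent.futures API; the work function itself is just a stand-in):

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # Stand-in for a real task; any callable would do here.
    return n * n

# Executor.map is a good fit for a known, finite batch of inputs; what is
# missing is an equally convenient idiom for an open-ended message queue
# feeding the same pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(6)))
```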
<br>
Setting the latter up yourself is currently still a bit tedious, since:<br>
<br>
1. we don't have a standard threading Pool abstraction in the standard<br>
library, just the one tucked away as part of multiprocessing<br>
2. while queue.Queue has native support for worker pools, we don't<br>
provide a pre-assembled version that makes it easy to say "here is the<br>
producer, here are the consumers, wire them together for me"<br>
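The manual wiring being described looks something like the following today: a sketch of the conventional idiom, using a sentinel per consumer to signal shutdown (names like run_pipeline are illustrative, not stdlib):

```python
import queue
import threading

def run_pipeline(producer_iter, handle, num_workers=2):
    """Hand-wire 'here is the producer, here are the consumers':
    a bounded input queue, a worker pool, and an output queue."""
    in_q = queue.Queue(maxsize=10)
    out_q = queue.Queue()
    DONE = object()  # sentinel marking the end of the input stream

    def producer():
        for item in producer_iter:
            in_q.put(item)          # blocks when the input queue is full
        for _ in range(num_workers):
            in_q.put(DONE)          # one sentinel per worker so each exits

    def worker():
        while True:
            item = in_q.get()
            if item is DONE:
                break
            out_q.put(handle(item))

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [out_q.get() for _ in range(out_q.qsize())]
```

Every application that needs this pattern currently re-derives the sentinel handling, thread bookkeeping, and shutdown ordering above, which is the tedium a pre-assembled stdlib abstraction could remove.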
<br>
There are good reasons for that (mainly that it's hard to come up with<br>
an abstraction that's useful in its own right without becoming so<br>
complex that you're on the verge of reinventing a task manager like<br>
celery or a distributed computation manager like dask), but at the<br>
same time, the notion of "input queue, worker pool, output queue" is<br>
one that comes up a *lot* across different concurrency models, so<br>
there's potential value in providing a low-barrier-to-entry<br>
introduction to that idiom as part of the standard library.<br>
<span class="im HOEnZb"><br>
Cheers,<br>
Nick.<br>
<br>
--<br>
Nick Coghlan | <a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a> | Brisbane, Australia<br>
</span><div class="HOEnZb"><div class="h5">_______________________________________________<br>
Python-ideas mailing list<br>
<a href="mailto:Python-ideas@python.org">Python-ideas@python.org</a><br>
<a href="https://mail.python.org/mailman/listinfo/python-ideas" rel="noreferrer" target="_blank">https://mail.python.org/mailman/listinfo/python-ideas</a><br>
Code of Conduct: <a href="http://python.org/psf/codeofconduct/" rel="noreferrer" target="_blank">http://python.org/psf/codeofconduct/</a><br>
</div></div></blockquote></div><br></div>