<div dir="ltr">Building this into the generator itself is a bad idea, as others here have already commented.<div><br></div><div>From a cross-implementation perspective: in Jython, different threads can call next on a non-running generator, <i>so long as they coordinate with each other external to any use of this generator</i>, and this works fine.</div><div><br></div><div>But any reliance on gi_running, as seen here, can at best help detect such races; it would not even come close to preventing one:<br><div><a href="https://github.com/jythontools/jython/blob/master/src/org/python/core/PyGenerator.java#L146">https://github.com/jythontools/jython/blob/master/src/org/python/core/PyGenerator.java#L146</a><br></div></div><div><br></div><div>(We don't even bother making gi_running volatile to get actual test-and-set style semantics, because it really makes no sense to pretend otherwise; and why pay the performance penalty?)</div><div><br></div><div>The idea of putting generators behind a queue sounds reasonably workable: the semantics are then the right ones, although implementing this efficiently is the tricky part.<br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Apr 16, 2017 at 11:08 PM, Nick Coghlan <span dir="ltr">&lt;<a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 17 April 2017 at 08:00, Paul Moore &lt;<a href="mailto:p.f.moore@gmail.com">p.f.moore@gmail.com</a>&gt; wrote:<br>

> On 15 April 2017 at 10:45, Nick Coghlan &lt;<a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>&gt; wrote:<br>

>> So I'd be opposed to trying to make generator objects natively thread<br>

>> aware - as Stephen notes, the GIL is an implementation detail of<br>

>> CPython, so it isn't OK to rely on it when defining changes to<br>

>> language level semantics (in this case, whether or not it's OK to have<br>

>> multiple threads all calling the same generator without some form of<br>

>> external locking).<br>

>><br>

>> However, it may make sense to explore possible options for offering a<br>

>> queue.AutoQueue type, where the queue always has a defined maximum<br>

>> size (defaulting to 1), disallows explicit calls to put(), and<br>

>> automatically populates itself based on an iterator supplied to the<br>

>> constructor. Once the input iterator raises StopIteration, then the<br>

>> queue will start reporting itself as being empty.<br>

><br>

> +1 A generator that can have values pulled from it on different<br>

> threads sounds like a queue to me, so the AutoQueue class that wraps a<br>

> generator seems like a natural abstraction to work with. It also means<br>

> that the cost for thread safety is only paid by those applications<br>

> that need it.<br>

<br>

</span>If someone did build something like this, it would be interesting to<br>

benchmark it against a more traditional producer thread model, where<br>

one thread is responsible for adding work items to the queue, while<br>

others are responsible for draining them.<br>

<br>

The trick is that an auto-queue would borrow execution time from the<br>

consumer threads when new values are needed, so you'd theoretically<br>

get fewer context switches between threads, but at the cost of<br>

changing the nature of the workload in a given thread, and hence<br>

messing with the working set of objects it has active.<br>

<br>

It may also pair well with the concurrent.futures.Executor model,<br>

which is already good for "go handle this predefined list of tasks",<br>

but currently less useful as a replacement for a message queue with a<br>

pool of workers.<br>

<br>

Setting the latter up yourself is currently still a bit tedious, since:<br>

<br>

1. we don't have a standard threading Pool abstraction in the standard<br>

library, just the one tucked away as part of multiprocessing<br>

2. while queue.Queue has native support for worker pools, we don't<br>

provide a pre-assembled version that makes it easy to say "here is the<br>

producer, here are the consumers, wire them together for me"<br>

<br>

There are good reasons for that (mainly that it's hard to come up with<br>

an abstraction that's useful in its own right without becoming so<br>

complex that you're on the verge of reinventing a task manager like<br>

celery or a distributed computation manager like dask), but at the<br>

same time, the notion of "input queue, worker pool, output queue" is<br>

one that comes up a *lot* across different concurrency models, so<br>

there's potential value in providing a low-barrier-to-entry<br>

introduction to that idiom as part of the standard library.<br>

<span class="im HOEnZb"><br>

Cheers,<br>

Nick.<br>

<br>

--<br>

Nick Coghlan   |   <a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>   |   Brisbane, Australia<br>

</span><div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br></div>
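<div>To make the "generator behind a queue" idea concrete, here is a minimal sketch rather than a proposed implementation. The names AutoQueue, squares, drain and run_demo are all hypothetical; no queue.AutoQueue exists in the standard library. The sketch assumes the simplest possible design: no internal buffer at all, with each consumer borrowing its own execution time to advance the iterator under a lock. Once the iterator is exhausted, get() raises queue.Empty, which is one possible reading of "reporting itself as being empty".</div>

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class AutoQueue:
    """Hypothetical sketch: a get()-only queue fed by a single iterator."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._lock = threading.Lock()
        self._exhausted = False

    def get(self):
        # Only one thread at a time may advance the underlying iterator,
        # so the generator is never resumed while it is already running.
        with self._lock:
            if self._exhausted:
                raise queue.Empty
            try:
                return next(self._iterator)
            except StopIteration:
                self._exhausted = True
                raise queue.Empty from None


def squares(n):
    # An ordinary generator with no thread awareness of its own.
    for i in range(n):
        yield i * i


def drain(source):
    # Consumer loop: pull items until the source reports itself empty.
    items = []
    while True:
        try:
            items.append(source.get())
        except queue.Empty:
            return items


def run_demo(n=1000, workers=4):
    # Several worker threads share one generator through the wrapper.
    source = AutoQueue(squares(n))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(drain, source) for _ in range(workers)]
        return sorted(x for f in futures for x in f.result())
```

<div>Every item is handed to exactly one consumer, so sorting the combined results of run_demo(100) gives the squares of 0 through 99. The lock makes this the easy version; the bounded buffer from the original proposal (maximum size defaulting to 1) is deliberately left out, and the efficiency question remains open.</div>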