[Web-SIG] Iterators, generators and threads.
Alan Kennedy
py-web-sig at xhaus.com
Fri Sep 3 14:07:12 CEST 2004
Dear Sig,
With the focus on iterables in WSGI, I think we may need to put
something into the WSGI spec about generators and threading.
As I'm sure you're all aware, generators are an excellent mechanism for
generating content on demand: a perfect fit for memory-efficient WSGI
"pull" processing and for event-driven servers.
However, generator-iterators are different from other iterables, in that
they cannot be resumed/iterated simultaneously from multiple threads
(without external locking anyway).
PEP 255 is specific on the topic: "Restriction: A generator cannot be
resumed while it is actively running". This effectively means that a
generator cannot be used from multiple threads without some form of
external synchronization/locking.
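
As a rough illustration of what "external synchronization/locking" could
look like, here is a minimal sketch in present-day Python 3. The names
LockedIterator, chunks and drain are hypothetical and are not part of
WSGI or of any PEP; the only assumption is that one lock around each
next() call is enough to keep two threads from resuming the generator
at the same time.

```python
import threading

class LockedIterator:
    """Hypothetical wrapper: serializes next() calls on a shared iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may resume the underlying generator.
        with self._lock:
            return next(self._it)

def chunks(n):
    # Stand-in for an application's response generator (illustrative only).
    for i in range(n):
        yield i

shared = LockedIterator(chunks(1000))
seen = []

def drain():
    # Each thread pulls items until the shared iterator is exhausted.
    for item in shared:
        seen.append(item)

threads = [threading.Thread(target=drain) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each item is handed to exactly one thread, although the order in which
the threads receive items is not defined.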
Offhand, I can't think of scenarios where a WSGI server or application
would *need* to iterate over an iterable across multiple threads. But I
can certainly think of multiple server architectures where the request
and its related response will pass through multiple threads before
completion. Whether it would make sense for such architectures to
iterate an iterable from multiple threads, I don't know. Is it possible
that some server designer might attempt something like this?
That would probably work as long as the iterable is not a generator.
But if it is: *boom*, the generator could be resumed simultaneously from
multiple threads, resulting in a ValueError.
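
The failure can be forced deterministically rather than left to a race.
The following is only a sketch in present-day Python 3, with behaviour
as observed on CPython; the names body, worker, inside and release are
made up for the illustration. One thread is parked inside the generator
body while a second thread tries to resume the same generator.

```python
import threading

inside = threading.Event()    # set once the generator body is running
release = threading.Event()   # set to let the generator body continue

def body():
    inside.set()
    release.wait()            # stay "actively running" until released
    yield "chunk"

gen = body()
results = []

def worker():
    # First consumer: resumes the generator and blocks inside it.
    results.append(next(gen))

t = threading.Thread(target=worker)
t.start()
inside.wait()                 # the worker is now inside the generator frame

try:
    next(gen)                 # second consumer, from the main thread
    error = None
except ValueError as exc:
    error = str(exc)          # CPython: "generator already executing"

release.set()
t.join()
```

The worker thread still receives its item afterwards; only the second,
overlapping resume is rejected.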
Perhaps we need to describe this problem in the PEP? Or are Python
programmers supposed to be big and old enough to know these things?
I find myself wondering: is this a CPython-specific thing? Does resuming
a generator from multiple threads have any meaning?
Obviously, calling a standard function/method from different threads
works because each thread gets an independent stack frame, i.e. local
variables, etc. So if there is no (unsynchronized) shared state between
the threads, everything will work fine.
Since a generator is a single resumable stack frame, resuming it
multiple times simultaneously from multiple threads won't work, from an
isolation point-of-view.
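
The difference between the two cases can be seen even without threads.
This is a small sketch in present-day Python 3; render_call and
render_gen are hypothetical names chosen for the illustration. Each
function call starts with fresh locals, whereas every resume of a
generator continues in the same frame with the same locals.

```python
import inspect

def render_call():
    # Plain function: every call gets a fresh frame and fresh locals.
    count = 0
    count += 1
    return count

def render_gen():
    # Generator: a single frame that is suspended and resumed.
    count = 0
    while True:
        count += 1
        yield count

calls = [render_call(), render_call()]   # two independent frames

gen = render_gen()
first = next(gen)
second = next(gen)                       # resumes the same frame
# Snapshot of the suspended frame's local variables.
locals_now = dict(inspect.getgeneratorlocals(gen))
```

Here calls is [1, 1], while first and second are 1 and 2, because the
generator's count lives in the one frame that all consumers share.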
Or am I misunderstanding it? Is the restriction somehow related to
CPython's GIL?
Obviously, resuming general iterators from multiple threads is related.
PEP 234 makes no statements about threads (well, one unrelated reference
to modifying dictionaries while they are being iterated). So I take this
to mean that iterating iterables from multiple threads is acceptable.
Regards,
Alan.
P.S. I hope Phillip is OK. He said yesterday that he was right in
Frances's path, although obviously that path will have a significant
margin for error. But Frances is *huge*: see this stunning picture from
NASA.
http://antwrp.gsfc.nasa.gov/apod/ap040903.html