[Python-ideas] Simpler thread synchronization using "Sticky Condition"

Nathaniel Smith njs at pobox.com
Tue Mar 26 12:24:40 EDT 2019


These kinds of low-level synchronization primitives are notoriously
tricky, yeah, and I'm all in favor of having better higher-level
tools. But I'm not sure that AutoResetEvent adds enough to be worth
it.

AFAICT, you can get this behavior with an Event just fine – using your
pseudocode:

def sender():
    while alive():
        wait_for_my_data_from_hardware()
        send_data_to_receiver()
        auto_event.set()

def receiver():
    while alive():
        auto_event.wait()
        auto_event.clear()   # <-- this line added
        receive_all_data_from_sender()
        process_data()

It's true that if we use a regular Event then the .clear() doesn't
happen atomically with the wakeup, but that doesn't matter. If we call
auto_event.set() and then have new data arrive, then there are two
cases:

1) the new data arrives early enough to be seen by the current call to
receive_all_data_from_sender(): this is fine, the new data will be
processed in this call
2) the new data arrives too late to be seen by the current call to
receive_all_data_from_sender(): that means the new data arrived after
the call to receive_all_data_from_sender() started, which means it
arrived after auto_event.clear(), which means that the call to
auto_event.set() will successfully re-arm the event, so the next
auto_event.wait() returns without blocking and another call to
receive_all_data_from_sender() happens right away
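
To make the two cases above concrete, here is a minimal runnable sketch of the Event-plus-clear() pattern using the standard threading module. The shared deque, the None end marker, and the run_event_demo name are assumptions made for illustration; they stand in for the hardware and data-transfer calls in the pseudocode and are not part of the original proposal.

```python
import threading
from collections import deque


def run_event_demo(n_items=1000):
    buffer = deque()            # deque.append/popleft are thread-safe
    event = threading.Event()
    received = []

    def sender():
        for i in range(n_items):
            buffer.append(i)    # publish the data first...
            event.set()         # ...then signal the receiver
        buffer.append(None)     # hypothetical end-of-stream marker
        event.set()

    def receiver():
        done = False
        while not done:
            event.wait()
            event.clear()       # clear BEFORE draining, as in the pseudocode
            while True:         # "receive all data": drain until empty
                try:
                    item = buffer.popleft()
                except IndexError:
                    break
                if item is None:
                    done = True
                    break
                received.append(item)

    threads = [threading.Thread(target=sender),
               threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The ordering is what matters here: the sender publishes before it calls set(), and the receiver calls clear() before it drains. Swapping either pair opens a window where data can sit in the buffer with the event unset. The receiver may also wake up and find nothing to drain, which is harmless.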

That said, this is still very tricky. It requires careful analysis,
and it's not very general (for example, if we want to support multiple
receivers then we need to throw out the whole approach and do
something entirely different). In Trio we've actually discussed
removing Event.clear(), since it's so difficult to use correctly:
https://github.com/python-trio/trio/issues/637

You said your original problem is that you have multiple event
sources, and the receiver needs to listen to all of them. And based on
your approach, I guess you only have one receiver, and that it's OK to
couple all the event sources directly to this receiver (i.e., you're
OK with passing them all a Condition object to use).

Under these circumstances, wouldn't it make more sense to use a single
Queue, pass it to all the sources, and have them each do
queue.put((source_id, event))? That's simple to implement, hard to
mess up, and can easily be extended to multiple receivers.
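
A minimal sketch of that single-Queue design, assuming each source is a thread that emits a fixed number of events. The (source_id, None) end marker and the run_queue_demo name are made up for this example so the receiver knows when to stop; a real program would pick its own shutdown convention.

```python
import queue
import threading


def run_queue_demo(n_sources=3, events_per_source=5):
    q = queue.Queue()           # one queue shared by every source
    received = []

    def source(source_id):
        for n in range(events_per_source):
            q.put((source_id, n))
        q.put((source_id, None))    # hypothetical "this source is done" marker

    def receiver():
        remaining = n_sources
        while remaining:
            source_id, event = q.get()   # blocks until any source has data
            if event is None:
                remaining -= 1
            else:
                received.append((source_id, event))

    threads = [threading.Thread(target=source, args=(i,))
               for i in range(n_sources)]
    threads.append(threading.Thread(target=receiver))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Events from different sources interleave in whatever order they arrive, but each source's own events stay in order. Adding more receivers only requires starting more threads that call q.get() on the same queue, plus a shutdown convention that accounts for them.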

If you want to further decouple the sources from the receiver, then
one approach would be to have each source expose its own Queue
independently, and then define some kind of 'select' operation (like
in Golang/CSP/concurrent ML) to let the receiver read from multiple
Queues simultaneously. This is non-trivial to do, but in return you
get a very general and powerful construct. There are some more links and
discussion here: https://github.com/python-trio/trio/issues/242

> Regarding the link you sent, I don't entirely agree with the opinion expressed: if you try to use a Semaphore for this purpose you will soon find that it is "the wrong way round", it is intended to protect resources from multiple accesses, not to synchronize those multiple accesses

Semaphores are extremely generic primitives – there are a lot of
different ways to use them. I think the blog post is correct that an
AutoResetEvent is equivalent to a semaphore whose value is clamped so
that it can't exceed 1. Your 'auto_event.set()' would be implemented
as 'sem.release()', and 'auto_event.wait()' would be 'sem.acquire()'.

I guess technically the semantics might be slightly different when
there are multiple waiters: the semaphore wakes up exactly one waiter,
while I'm not sure what your AutoResetEvent would do. But I can't see
any way to use AutoResetEvent reliably with multiple waiters anyway.
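
As a rough illustration of the semaphore equivalence, here is a sketch built on threading.BoundedSemaphore, which raises ValueError when a release would push the value past its initial value. Swallowing that error gives the clamp-at-1 behavior. The class name ClampedSemaphoreEvent is invented for this example, and this is only one possible way to get the semantics, not the AutoResetEvent implementation under discussion.

```python
import threading


class ClampedSemaphoreEvent:
    """Hypothetical auto-reset event: a semaphore whose value never exceeds 1."""

    def __init__(self):
        self._sem = threading.BoundedSemaphore(1)
        self._sem.acquire()     # start in the "not set" state (value 0)

    def set(self):
        try:
            self._sem.release()     # 0 -> 1 wakes at most one waiter
        except ValueError:
            pass                    # already 1: clamp instead of counting up

    def wait(self, timeout=None):
        # Consuming the permit is the "auto reset": 1 -> 0 happens atomically
        # with the wakeup. Returns False if the timeout expires first.
        return self._sem.acquire(timeout=timeout)
```

Calling set() twice in a row leaves a single pending wakeup, so the first wait() returns immediately and the second one blocks until the next set().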


-n

--
Nathaniel J. Smith -- https://vorpus.org

