(Sorry about cutting context, I'll try not to do that again, but I also try to avoid reposting an entire email.)
It has synchronisation which is _aware_ of threads, but it never creates, requires or uses them. It simply ensures thread-safe reentrancy, which will be required for any general solution unless it is completely banned from interacting across CPU threads.
I don't see it that way. Any time you acquire a lock, you may be blocked for a long time. In a typical event loop that's an absolute no-no. Typically, to wait for another thread, you give the other thread a callback that adds a new event for *this* thread.
Agreed, but when you're waiting for another thread to stop reading its queue so you can add to it, how are you supposed to queue an event while you wait? The lock in Future is only an issue in result(), where we wait for another thread to complete the event - but that is the entire point of that function. FWIW, I don't see any scheduler ever calling result(), but there are valid situations for a user to call it (a REPL, code that is already on a worker thread, unit tests). Everywhere else the lock is required for thread safety. It could be a different lock from the one in result(), but I don't think anything is gained by that. Rewriting Future in C and using CPU CAS primitives might be possible, but probably of only limited value.
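For concreteness, here's a rough sketch of the pattern you describe - queueing a callback for the event thread rather than blocking in result(); the names are mine and purely illustrative:

    import queue

    events = queue.Queue()   # the event thread's queue; Queue's internal
                             # lock is only held for a moment per operation

    def event_loop():
        while True:
            callback = events.get()   # blocks only while idle
            callback()

    def worker(future):
        value = 42   # ... long-running work on another thread ...
        # Rather than the event thread blocking in result(), hand the
        # completion back to it as a new event.
        events.put(lambda: future.set_result(value))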
Now, it's possible that in Windows, when using IOCP, the philosophy is different -- I think I've read in http://msdn.microsoft.com/en-us/library/aa365198%28VS.85%29.aspx that there can be multiple threads reading events from a single queue. But AFAIK, in Twisted, Tornado and similar systems, and probably even in gevent and Stackless, there is a strong culture around having only a single thread handling events (or at least only one thread at a time). The assumption is that as long as you don't suspend, you can trust that the world doesn't change; that assumption becomes invalid when other threads may also be handling events from the same queue.
This is true, and my understanding is that IOCP is basically just a thread pool: the 'single queue' means that all the threads wait on all the events, and you can't guarantee which thread will get which. This is better than creating a new thread for each file, but I think that's all it is meant to be. We can easily write a single thread that waits on all I/O and schedules callbacks on the main thread, if necessary. I'm pretty sure all platforms have better ways to do this, but because they're all different, each will need its own implementation.
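Continuing the sketch above, that single I/O thread might look roughly like this - select-based and purely illustrative; real code would use the platform's native mechanism:

    import select

    def io_thread(sockets, handlers):
        # Wait on all I/O in one helper thread; hand each ready socket's
        # handler to the event thread via the events queue sketched above.
        while True:
            ready, _, _ = select.select(sockets, [], [])
            for sock in ready:
                events.put(lambda s=sock: handlers[s](s))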
It's possible to design a world where different threads have their own event queues, and this assumption would only be valid for events belonging to the same queue; however, that seems complicated. And you still never want to attempt to acquire a *threading* lock, because you end up blocking the entire event loop.
Multiple threads with independent queues should be okay, though it's definitely an advanced scenario. In some cases it would surely be preferable to having multiple processes with one thread/queue each. In any case, it's easy enough to implement with TLS - see the sketch below.
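A rough sketch of the TLS approach, assuming a hypothetical get_event_queue() helper:

    import queue, threading

    _tls = threading.local()

    def get_event_queue():
        # Each thread lazily creates its own event queue, so events
        # scheduled from within a thread stay on that thread's queue.
        try:
            return _tls.queue
        except AttributeError:
            _tls.queue = queue.Queue()
            return _tls.queue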
(I'm inclined to define [the required Future interface] as 'result()', 'done()', 'add_done_callback()', 'exception()', 'set_result()' and 'set_exception()' functions. Maybe more, but I think that's sufficient. The current '_waiters' list is an optimisation for add_done_callback(), and doesn't need to be part of the interface.)
Agreed. I don't see much use for the cancellation stuff and all the extra complexity that adds to the interface. BTW, I think concurrent.futures.Future doesn't stop you from calling set_result() or set_exception() more than once, which I think is a mistake -- I do enforce that in NDB's Futures.
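[For concreteness, a minimal sketch of the interface listed above, with set_result()/set_exception() enforced as one-shot. This is illustrative only, not concurrent.futures.Future:]

    import threading

    class Future:
        # Minimal sketch: result(), done(), add_done_callback(),
        # exception(), set_result(), set_exception().

        def __init__(self):
            self._cond = threading.Condition()
            self._done = False
            self._result = None
            self._exception = None
            self._callbacks = []

        def done(self):
            with self._cond:
                return self._done

        def result(self):
            # The only place that may block for a long time: it waits
            # for another thread to complete the future.
            with self._cond:
                while not self._done:
                    self._cond.wait()
                if self._exception is not None:
                    raise self._exception
                return self._result

        def exception(self):
            with self._cond:
                while not self._done:
                    self._cond.wait()
                return self._exception

        def add_done_callback(self, fn):
            with self._cond:
                if not self._done:
                    self._callbacks.append(fn)
                    return
            fn(self)   # already completed: run immediately

        def _finish(self, result, exception):
            with self._cond:
                if self._done:
                    raise RuntimeError('already set')   # one-shot, enforced
                self._result, self._exception = result, exception
                self._done = True
                callbacks, self._callbacks = self._callbacks, []
                self._cond.notify_all()
            for fn in callbacks:
                fn(self)

        def set_result(self, result):
            self._finish(result, None)

        def set_exception(self, exception):
            self._finish(None, exception)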
I agree, there should be no way to set the result or exception more than once. On cancellation, while there is some complexity involved, I do think we can make use of cancel() and cancelled() functions to pass a signal back into the worker:

    # Caller (assumes Future, Thread and CancelledError are in scope):
    op = do_something_async()   # not yielded
    button.on_click += lambda: op.cancel()
    try:
        result = yield op
    except CancelledError:
        return False

    def do_something_async():
        f = Future()
        def threadproc():
            total = 0
            for i in range(10000):
                if f.cancelled():
                    # Complete the future so the yield above actually
                    # sees CancelledError.
                    f.set_exception(CancelledError())
                    return
                total += i
            f.set_result(total)
        Thread(target=threadproc).start()   # start(), not run()
        return f

I certainly would not want to see the CancelledError be raised automatically - this is no thread.abort() call - but it may be convenient to have an interface for "self._cancelled = True" and "return self._cancelled" that at least saves people from coming up with their own way of passing it in. The worker may completely ignore it, or complete anyway, but for long-running operations it may be very handy. (I'll stop before I start thinking about partial results... :) )
[Here you snipped some context. You proposed having public APIs that use "yield <future>" and leaving "yield from <generator>" as something the user can use in her own program. To which I replied:]
Hm. I think it'll be confusing.
I think the basic case ("just make it work") will be simpler, and the advanced case ("minimise memory/CPU usage") will be more complicated.
Let's agree to disagree on this. I think they are both valid design choices with different trade-offs. We should explore both directions further so as to form a better opinion.
Probably we need some working implementations to code against.
And the Futures-only-in-public-APIs rule seems to encourage less efficient solutions.
Personally, I'd prefer developers to get a correct solution without having to understand how the whole thing works (the "pit of success"). I'm also sceptical of any other rule being as portable and composable - I don't think a standard library should have APIs where "you must only call this function with yield-from". ('await' in C# is not compulsory - you can take the Task returned from an async method and do whatever you like with it.)
Surely "whatever you like" is constrained by whatever the Task type defines. Maybe it looks like a Future and has a blocking method to wait for the result, like .result() on concurrent.futures.Future? If you want that functionality for generators you just have to call some function, passing it the generator as an argument. Remember, Python doesn't consider that an inferior choice of API design compared to making something a method of the object itself -- witness len(), repr() and many others.
I'm interested that you skipped my "portable and composable" claim and went straight for my aside about another language. I'd prefer to avoid introducing top-level names, especially since this is an API with plenty of predecessors... what sort of trouble would we be having if sched or asyncore had claimed 'wait()'? Even more so because it's Python, since it is so easy to overwrite the value. (And as it happens, Task handles both the asynchrony and the callbacks, so it looks a bit like Thread and Future mixed together. Personally, I prefer to keep the concepts separate.)
FWIW, if I may sound antagonistic, I actually think that we're mostly in violent agreement, and I think we're getting closer to coming up with a sensible set of requirements and possibly even an API proposal. Keep it coming!
I do my best work when someone is arguing with me :) Cheers, Steve