On 6 May 2015 at 07:46, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Another problem with the "core" idea is that you can't start with an event loop that "just does scheduling" and then add on other features such as I/O *from the outside*. There has to be some point at which everything comes together, which means choosing something like select() or poll() or I/O completion queues, and building that into the heart of your event loop. At that point it's no longer something with a simple core.
Looking at asyncio.queues, the only features it needs are:
1. asyncio.events.get_event_loop()
2. asyncio.futures.Future - creating a standalone Future
3. asyncio.locks.Event
4. @coroutine
locks.Event in turn only needs the other 3 items. And you can ignore get_event_loop() as it's only used to get the default loop, you can pass in your own.
And asyncio.futures only uses get_event_loop (and _format_callback) from asyncio.events.
Futures require the loop to support:

1. call_soon
2. call_exception_handler
3. get_debug
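As a quick sanity check of that "minimal interface", here is a hedged sketch of a bare object exposing just those three methods, driving a standard asyncio.Future. `MinimalLoop` and `run_ready` are made-up names for illustration, not part of asyncio:

```python
import asyncio

class MinimalLoop:
    """Hypothetical loop exposing only the methods Futures need."""
    def __init__(self):
        self._ready = []
    def call_soon(self, callback, *args, context=None):
        # Futures schedule their done-callbacks through this.
        self._ready.append((callback, args))
    def call_exception_handler(self, context):
        # Called e.g. when a Future's exception is never retrieved.
        print("Exception:", context)
    def get_debug(self):
        return False
    def run_ready(self):
        # Drain the ready queue (stands in for the real loop iteration).
        while self._ready:
            callback, args = self._ready.pop(0)
            callback(*args)

loop = MinimalLoop()
fut = asyncio.Future(loop=loop)
results = []
fut.add_done_callback(lambda f: results.append(f.result()))
fut.set_result(42)
loop.run_ready()
print(results)  # [42]
```

The Future never notices it isn't talking to a full asyncio event loop, which supports the point that the synchronisation layer only needs this small surface.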
So, to some extent (how far is something I'd need to code up a loop to confirm) you can build the Futures and synchronisation mechanisms with an event loop that supports only this "minimal interface".
Essentially, that's my goal - to allow people who want to write (say) a Windows GUI event loop, or a Windows event loop based on WaitForXXXObject, or a Tkinter loop, or whatever, to *not* have to write their own implementation of synchronisation or future objects.
That may mean lifting the asyncio code and putting it into a separate library, to make the separation between "asyncio-dependent" and "general async" clearer. Or if asyncio's provisional status doesn't last long enough to do that, we may end up with an asyncio implementation and a separate (possibly 3rd party) "general" implementation.
Paul.
Hello,
On Wed, 6 May 2015 09:27:16 +0100 Paul Moore p.f.moore@gmail.com wrote:
On 6 May 2015 at 07:46, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Another problem with the "core" idea is that you can't start with an event loop that "just does scheduling" and then add on other features such as I/O *from the outside*. There has to be some point at which everything comes together, which means choosing something like select() or poll() or I/O completion queues, and building that into the heart of your event loop. At that point it's no longer something with a simple core.
[]
So, to some extent (how far is something I'd need to code up a loop to confirm) you can build the Futures and synchronisation mechanisms with an event loop that supports only this "minimal interface".
Essentially, that's my goal - to allow people who want to write (say) a Windows GUI event loop, or a Windows event loop based on WaitForXXXObject, or a Tkinter loop, or whatever, to *not* have to write their own implementation of synchronisation or future objects.
That may mean lifting the asyncio code and putting it into a separate library, to make the separation between "asyncio-dependent" and "general async" clearer. Or if asyncio's provisional status doesn't last long enough to do that, we may end up with an asyncio implementation and a separate (possibly 3rd party) "general" implementation.
MicroPython has an alternative implementation of an asyncio subset. It's structured as a generic scheduler component, "uasyncio.core" https://github.com/micropython/micropython-lib/blob/master/uasyncio.core/uas... (170 total lines), and "uasyncio", which adds I/O scheduling on top of it: https://github.com/micropython/micropython-lib/blob/master/uasyncio/uasyncio...
"uasyncio.core" can be used separately, and is intended for usage as such on e.g. microcontrollers. It's built around native Python concept of coroutines (plus callbacks). It doesn't include concept of futures. They can be added as an extension built on top, but so far I didn't see need for that, while having developed a web picoframework for uasyncio (https://github.com/pfalcon/picoweb)
On Wed, May 6, 2015 at 1:27 AM, Paul Moore p.f.moore@gmail.com wrote:
On 6 May 2015 at 07:46, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Another problem with the "core" idea is that you can't start with an event loop that "just does scheduling" and then add on other features such as I/O *from the outside*. There has to be some point at which everything comes together, which means choosing something like select() or poll() or I/O completion queues, and building that into the heart of your event loop. At that point it's no longer something with a simple core.
Looking at asyncio.queues, the only features it needs are:
- asyncio.events.get_event_loop()
- asyncio.futures.Future - creating a standalone Future
- asyncio.locks.Event
- @coroutine
locks.Event in turn only needs the other 3 items. And you can ignore get_event_loop() as it's only used to get the default loop, you can pass in your own.
And asyncio.futures only uses get_event_loop (and _format_callback) from asyncio.events.
Futures require the loop to support:
- call_soon
- call_exception_handler
- get_debug
So, to some extent (how far is something I'd need to code up a loop to confirm) you can build the Futures and synchronisation mechanisms with an event loop that supports only this "minimal interface".
Essentially, that's my goal - to allow people who want to write (say) a Windows GUI event loop, or a Windows event loop based on WaitForXXXObject, or a Tkinter loop, or whatever, to *not* have to write their own implementation of synchronisation or future objects.
That may mean lifting the asyncio code and putting it into a separate library, to make the separation between "asyncio-dependent" and "general async" clearer. Or if asyncio's provisional status doesn't last long enough to do that, we may end up with an asyncio implementation and a separate (possibly 3rd party) "general" implementation.
This is actually a great idea, and I encourage you to go forward with it. The biggest piece missing from your inventory is probably Task, which is needed to wrap a Future around a coroutine.
I expect you'll also want to build cancellation into your "base async framework"; and the primitives to wait for multiple awaitables. The next step would be some mechanism to implement call_later()/call_at() (but this needs to be pluggable since for a "real" event loop it needs to be implemented by the basic I/O selector).
If you can get this working it would be great to include this in the stdlib as a separate "asynclib" library. The original asyncio library would then be a specific implementation (using a subclass of asynclib.EventLoop) that adds I/O, subprocesses, and integrates with the selectors module (or with IOCP, on Windows).
I don't see any particular hurry to get this in before 3.5; the refactoring of asyncio can be done later, in a backward compatible way. It would be a good way to test the architecture of asyncio!
On 6 May 2015 at 16:46, Guido van Rossum guido@python.org wrote:
This is actually a great idea, and I encourage you to go forward with it. The biggest piece missing from your inventory is probably Task, which is needed to wrap a Future around a coroutine.
OK, I've been doing some work on this. You're right, the asyncio framework makes Future a key component.
But I'm not 100% sure why Future (and Task) have to be so fundamental. Ignoring cancellation (see below!) I can build pretty much all of a basic event loop, plus equivalents of the asyncio locks and queues modules, without needing the concept of a Future at all. The create_task function becomes simply a function to add a coroutine to the ready queue, in this context. I can't return a Task (because I haven't implemented the Task or Future classes) but I don't actually know what significant functionality is lost as a result - is there a reasonably accessible example of where using the return value from create_task is important anywhere?
A slightly more complicated issue is with the run_until_complete function, which takes a Future, and hence is fundamentally tied to the Future API. However, it seems to me that a "minimal" implementation could work by having a run_until_complete() that just took an awaitable (i.e., anything that you can yield from). Again, is there a specific reason that you ended up going with run_until_complete taking a Future rather than just a coroutine? I think (but haven't confirmed yet by implementing it) that it should be possible to create a coroutine that acts like a Future, in the sense that you can tell it from outside (via send()) that it's completed and set its return value. But this is all theory, and if you have any practical experience that shows I'm going down a dead end, I'd be glad to know.
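For illustration, that "coroutine acting like a Future" idea can be sketched in a few lines with plain generators; `pseudo_future` is a made-up name, and this is only a sketch of the completion-via-send() mechanism, not a full Future replacement (no callbacks, no cancellation):

```python
def pseudo_future():
    """A generator that behaves a little like a Future: it parks at the
    yield until outside code completes it via send()."""
    result = yield          # suspended here until sent a value
    return result           # surfaces as StopIteration.value

f = pseudo_future()
next(f)                     # prime: run to the first yield
try:
    f.send("finished")      # complete it "from outside"
except StopIteration as stop:
    outcome = stop.value

print(outcome)  # finished
```

A run_until_complete() that accepts any awaitable could drive such an object the same way it drives any other coroutine, which is the direction the question above is probing.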
I'm not sure how useful this line of attack will be - if the API isn't compatible with asyncio.BaseEventLoop, it's not very useful in practice. On the other hand, if I can build a loop without Future or Task classes, it may indicate that those classes aren't quite as fundamental as asyncio makes them (which may allow some simplifications or generalisations).
I expect you'll also want to build cancellation into your "base async framework"; and the primitives to wait for multiple awaitables. The next step would be some mechanism to implement call_later()/call_at() (but this needs to be pluggable since for a "real" event loop it needs to be implemented by the basic I/O selector).
These are where I suspect I'll have the most trouble if I haven't got a solid understanding of the role of the Future and Task classes (or alternatively, how to avoid them :-)) So I'm holding off on worrying about them for now. But certainly they need to be covered. In particular, call_later/call_at are the only "generic" example of any form of wait that actually *waits*, rather than returning immediately. So as you say, implementing them will show how the basic mechanism can be extended with a "real" selector (whether for I/O, or GUI events, or whatever).
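The call_later()/call_at() mechanism mentioned above can be sketched as a heap of (deadline, callback) pairs; `TimerLoop` and `run_once` are hypothetical names for illustration, not asyncio API:

```python
import heapq
import time

class TimerLoop:
    """Hypothetical sketch of call_later()/call_at() over a timer heap."""
    def __init__(self):
        self._timers = []   # heap of (when, seq, callback, args)
        self._seq = 0       # tie-breaker keeps equal deadlines FIFO
    def time(self):
        return time.monotonic()
    def call_at(self, when, callback, *args):
        heapq.heappush(self._timers, (when, self._seq, callback, args))
        self._seq += 1
    def call_later(self, delay, callback, *args):
        self.call_at(self.time() + delay, callback, *args)
    def run_once(self):
        # Sleep until the earliest deadline, then run everything due.
        # A "real" loop would instead block in its I/O selector with
        # this delay as the timeout - this is the pluggable part.
        if not self._timers:
            return
        time.sleep(max(0.0, self._timers[0][0] - self.time()))
        now = self.time()
        while self._timers and self._timers[0][0] <= now:
            _, _, callback, args = heapq.heappop(self._timers)
            callback(*args)

fired = []
timer_loop = TimerLoop()
timer_loop.call_later(0.02, fired.append, "second")
timer_loop.call_later(0.01, fired.append, "first")
while timer_loop._timers:
    timer_loop.run_once()
print(fired)  # ['first', 'second']
```

The only loop-specific piece is where run_once() sleeps: swapping time.sleep() for a selector timeout is exactly the extension point described above.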
If you can get this working it would be great to include this in the stdlib as a separate "asynclib" library. The original asyncio library would then be a specific implementation (using a subclass of asynclib.EventLoop) that adds I/O, subprocesses, and integrates with the selectors module (or with IOCP, on Windows).
One thing I've not really considered in the above, is how a refactoring like this would work. Ignoring the "let's try to remove the Future class" approach above, my "basic event loop" is mostly just an alternative implementation of an event loop (or maybe an alternative policy - I'm not sure I understand the need for policies yet). So it may simply be a case of ripping coroutines.py, futures.py, locks.py, log.py, queues.py, and tasks.py out of asyncio and adding a new equivalent of events.py with my "minimal" loop in it. (So far, when I've tried to do that I get hit with some form of circular import problem - I've not worked out why yet, or how asyncio avoids the same problem).
That in itself would probably be a useful refactoring, splitting out the IO aspects of asyncio from the event loop / async aspects.
I don't see any particular hurry to get this in before 3.5; the refactoring of asyncio can be done later, in a backward compatible way. It would be a good way to test the architecture of asyncio!
Agreed. It's also not at all clear to me how the new async/await syntax would fit in with this, so that probably needs some time to settle down. For example, in Python 3.5 would run_until_complete take an awaitable rather than a Future?
Paul
On Mon, May 11, 2015 at 1:37 PM, Paul Moore p.f.moore@gmail.com wrote:
On 6 May 2015 at 16:46, Guido van Rossum guido@python.org wrote:
This is actually a great idea, and I encourage you to go forward with it. The biggest piece missing from your inventory is probably Task, which is needed to wrap a Future around a coroutine.
OK, I've been doing some work on this. You're right, the asyncio framework makes Future a key component.
But I'm not 100% sure why Future (and Task) have to be so fundamental. Ignoring cancellation (see below!) I can build pretty much all of a basic event loop, plus equivalents of the asyncio locks and queues modules, without needing the concept of a Future at all. The create_task function becomes simply a function to add a coroutine to the ready queue, in this context. I can't return a Task (because I haven't implemented the Task or Future classes) but I don't actually know what significant functionality is lost as a result - is there a reasonably accessible example of where using the return value from create_task is important anywhere?
In asyncio the Task object is used to wait for the result. Of course if all you need is to wait for the result you don't need to call create_task() -- so in your situation it's uninteresting. But Task is needed for cancellation and Future is needed so I/O completion can be implemented using callback functions.
A slightly more complicated issue is with the run_until_complete function, which takes a Future, and hence is fundamentally tied to the Future API. However, it seems to me that a "minimal" implementation could work by having a run_until_complete() that just took an awaitable (i.e., anything that you can yield from). Again, is there a specific reason that you ended up going with run_until_complete taking a Future rather than just a coroutine?
Actually it takes a Future *or* a coroutine. (The docs or the arg name may be confusing.) In asyncio, pretty much everything that takes one takes the other.
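This is easy to confirm with asyncio itself (shown here with modern async/await syntax; in the 3.4-era code under discussion the wrapping function was still spelled async() rather than ensure_future()):

```python
import asyncio

async def answer():
    return 42

loop = asyncio.new_event_loop()
try:
    # A bare coroutine is accepted; run_until_complete wraps it in a Task.
    r1 = loop.run_until_complete(answer())
    # An explicit Future is accepted too.
    fut = loop.create_future()
    loop.call_soon(fut.set_result, "hi")
    r2 = loop.run_until_complete(fut)
finally:
    loop.close()

print(r1, r2)  # 42 hi
```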
I think (but haven't confirmed yet by implementing it) that it should be possible to create a coroutine that acts like a Future, in the sense that you can tell it from outside (via send()) that it's completed and set its return value. But this is all theory, and if you have any practical experience that shows I'm going down a dead end, I'd be glad to know.
I don't know -- I never explored that.
I'm not sure how useful this line of attack will be - if the API isn't compatible with asyncio.BaseEventLoop, it's not very useful in practice. On the other hand, if I can build a loop without Future or Task classes, it may indicate that those classes aren't quite as fundamental as asyncio makes them (which may allow some simplifications or generalisations).
Have you tried to implement waiting for I/O yet?
OTOH you may look at micropython's uasyncio -- IIRC it doesn't have Futures and it definitely has I/O waiting.
I expect you'll also want to build cancellation into your "base async framework"; and the primitives to wait for multiple awaitables. The next step would be some mechanism to implement call_later()/call_at() (but this needs to be pluggable since for a "real" event loop it needs to be implemented by the basic I/O selector).
These are where I suspect I'll have the most trouble if I haven't got a solid understanding of the role of the Future and Task classes (or alternatively, how to avoid them :-)) So I'm holding off on worrying about them for now. But certainly they need to be covered. In particular, call_later/call_at are the only "generic" example of any form of wait that actually *waits*, rather than returning immediately. So as you say, implementing them will show how the basic mechanism can be extended with a "real" selector (whether for I/O, or GUI events, or whatever).
Right.
If you can get this working it would be great to include this in the stdlib as a separate "asynclib" library. The original asyncio library would then be a specific implementation (using a subclass of asynclib.EventLoop) that adds I/O, subprocesses, and integrates with the selectors module (or with IOCP, on Windows).
One thing I've not really considered in the above, is how a refactoring like this would work. Ignoring the "let's try to remove the Future class" approach above, my "basic event loop" is mostly just an alternative implementation of an event loop (or maybe an alternative policy - I'm not sure I understand the need for policies yet).
A policy is mostly a wrapper around an event loop factory plus state that records the current event loop.
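That description can be sketched in a few lines; `MinimalPolicy` and `DummyLoop` are made-up names for illustration, not asyncio's actual policy classes:

```python
class MinimalPolicy:
    """Hypothetical sketch: a policy is a loop factory plus the state
    recording which loop is 'current'."""
    def __init__(self, loop_factory):
        self._loop_factory = loop_factory
        self._current = None
    def get_event_loop(self):
        # Lazily create and remember the current loop.
        if self._current is None:
            self._current = self._loop_factory()
        return self._current
    def set_event_loop(self, loop):
        self._current = loop
    def new_event_loop(self):
        return self._loop_factory()

class DummyLoop:
    """Stand-in for a real event loop class."""

policy = MinimalPolicy(DummyLoop)
current = policy.get_event_loop()
print(policy.get_event_loop() is current)  # True: same cached loop
print(policy.new_event_loop() is current)  # False: factory makes a fresh one
```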
So it may simply be a case of ripping coroutines.py, futures.py, locks.py, log.py, queues.py, and tasks.py out of asyncio and adding a new equivalent of events.py with my "minimal" loop in it. (So far, when I've tried to do that I get hit with some form of circular import problem - I've not worked out why yet, or how asyncio avoids the same problem).
That sounds like a surface problem. Keep on debugging. :-)
That in itself would probably be a useful refactoring, splitting out the IO aspects of asyncio from the event loop / async aspects.
Well, if you can.
I don't see any particular hurry to get this in before 3.5; the refactoring of asyncio can be done later, in a backward compatible way. It would be a good way to test the architecture of asyncio!
Agreed. It's also not at all clear to me how the new async/await syntax would fit in with this, so that probably needs some time to settle down. For example, in Python 3.5 would run_until_complete take an awaitable rather than a Future?
It doesn't need to change -- it already calls async() on its argument before doing anything (though with PEP 492 that function will be renamed to ensure_future()).
On Mon, May 11, 2015 at 6:05 PM, Guido van Rossum guido@python.org wrote:
OTOH you may look at micropython's uasyncio -- IIRC it doesn't have Futures and it definitely has I/O waiting.
Here's a sketch of an *extremely* minimal main loop that can do I/O without Futures, and might be suitable as a PEP example. (Certainly, it would be hard to write a *simpler* example than this, since it doesn't even use any *classes* or require any specially named methods, works with present-day generators, and is (I think) both 2.x/3.x compatible.)
coroutines = []  # round-robin of currently "running" coroutines

def schedule(coroutine, val=None, err=None):
    coroutines.insert(0, (coroutine, val, err))

def runLoop():
    while coroutines:
        (coroutine, val, err) = coroutines.pop()
        try:
            if err is not None:
                suspend = coroutine.throw(err)
            else:
                suspend = coroutine.send(val)
        except StopIteration:
            # coroutine is finished, so don't reschedule it
            continue
        except Exception:
            # framework-specific detail (i.e., log it, send it to an
            # error handling coroutine, or just stop the program).
            # Here, we just ignore it and stop the coroutine.
            continue
        else:
            if hasattr(suspend, '__call__') and suspend(coroutine):
                continue
            else:
                # put it back on the round-robin list
                schedule(coroutine)
To use it, `schedule()` one or more coroutines, then call `runLoop()`, which will run as long as there are things to do. Each coroutine scheduled must yield *thunks*: callable objects that take a coroutine as a parameter, and return True if the coroutine should be suspended, or False if it should continue to run. If the thunk returns true, that means the thunk has taken responsibility for arranging to `schedule()` the coroutine with a value or error when it's time to send it the result of the suspension.
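As a concrete, self-contained illustration of that thunk protocol (`resume_with` and `worker` are invented here; the mini-loop is repeated so the snippet runs on its own), here is a trivial thunk that suspends a coroutine and immediately reschedules it with a value:

```python
coroutines = []  # round-robin of "running" (coroutine, value, error) triples

def schedule(coroutine, val=None, err=None):
    coroutines.insert(0, (coroutine, val, err))

def runLoop():
    while coroutines:
        coroutine, val, err = coroutines.pop()
        try:
            suspend = coroutine.throw(err) if err is not None else coroutine.send(val)
        except StopIteration:
            continue  # coroutine finished
        if callable(suspend) and suspend(coroutine):
            continue  # thunk took over; it will reschedule the coroutine
        schedule(coroutine)

log = []

def resume_with(value):
    """A thunk that suspends the coroutine, then reschedules it with value."""
    def suspend(coroutine):
        schedule(coroutine, value)
        return True  # yes, suspend me; I'll arrange the resumption
    return suspend

def worker(name):
    log.append((name, 1))
    got = yield resume_with("pong")
    log.append((name, got))

schedule(worker("a"))
schedule(worker("b"))
runLoop()
print(log)  # [('a', 1), ('b', 1), ('a', 'pong'), ('b', 'pong')]
```

Note how the two workers interleave at the yield: the thunk returning True is all it takes for the loop to set the coroutine aside and move on.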
You might be asking, "wait, but where's the I/O?" Why, in a coroutine, of course...
import time
from heapq import heappush, heappop
from select import select

readers = {}
writers = {}
timers = []

def readable(fileno):
    """yield readable(fileno) resumes when fileno is readable"""
    def suspend(coroutine):
        readers[fileno] = coroutine
        return True
    return suspend

def writable(fileno):
    """yield writable(fileno) resumes when fileno is writable"""
    def suspend(coroutine):
        writers[fileno] = coroutine
        return True
    return suspend

def sleepFor(seconds):
    """yield sleepFor(seconds) resumes after that much time"""
    return suspendUntil(time.time() + seconds)

def suspendUntil(timestamp):
    """yield suspendUntil(timestamp) resumes when that time is reached"""
    def suspend(coroutine):
        heappush(timers, (timestamp, coroutine))
        return True
    return suspend

def doIO():
    while coroutines or readers or writers or timers:

        # Resume scheduled tasks whose time has arrived
        while timers and timers[0][0] <= time.time():
            ts, coroutine = heappop(timers)
            schedule(coroutine)

        if readers or writers:
            if coroutines:
                # Other tasks are running; use minimal timeout
                timeout = 0.001
            elif timers:
                timeout = max(timers[0][0] - time.time(), 0.001)
            else:
                timeout = None  # take as long as necessary
            r, w, e = select(readers, writers, [], timeout)
            for rr in r: schedule(readers.pop(rr))
            for ww in w: schedule(writers.pop(ww))

        yield  # allow other coroutines to run

schedule(doIO())  # run the I/O loop as a coroutine
(This is painfully incomplete for a real framework, but it's a rough sketch of how one of peak.events' first drafts worked, circa early 2004.)
Basically, you just need a coroutine whose job is to resume coroutines whose scheduled time has arrived, or whose I/O is ready. And of course, some data structures to keep track of such things, and an API to update the data structures and suspend the coroutines. The I/O loop exits once there are no more running tasks and nothing waiting on I/O... which will also exit the runLoop. (A bit like a miniature version of NodeJS for Python.)
And, while you need to preferably have only *one* such I/O coroutine (to prevent busy-waiting), the I/O coroutine is completely replaceable. All that's required to implement one is that the core runloop expose the count of active coroutines. (Notice that, apart from checking the length of `coroutines`, the I/O loop shown above uses only the public `schedule()` API and the exposed thunk-suspension protocol to do its thing.)
Also, note that you *can* indeed have multiple I/O coroutines running at the same time, as long as you don't mind busy-waiting. In fact, you can refactor this to move the time-based scheduling inside the runloop, and expose the "time until next task" and "number of running non-I/O coroutines" to allow multiple I/O waiters to co-ordinate and avoid busy-waiting. (A later version of peak.events did this, though it really wasn't to allow multiple I/O waiters, so much as to simplify I/O waiters by providing a core time-scheduler, and to support simulated time for running tests.)
So, there's definitely no requirement for I/O to be part of a "core" runloop system. The overall approach is *extremely* open to extension, hardcodes next to nothing, and is super-easy to write new yieldables for, since they need only have a method (or function) that returns a suspend function.
At the time I *first* implemented this approach in '03/'04, I hadn't thought of using plain functions as suspend targets; I used objects with a `shouldSuspend()` method. But in fairness, I was working with Python 2.2 and closures were still a pretty new feature back then. ;-)
Since then, though, I've seen this approach implemented elsewhere using closures in almost exactly this way. For example, the `co` library for Javascript implements almost exactly the above sketch's approach, in not much more code. It just uses the built-in Javascript event loop facilities, and supports yielding other things besides thunks. (Its thunks also don't return a value, and take a callback rather than a coroutine. But these are superficial differences.)
This approach is super-flexible in practice, as there are a ton of add-on libraries for `co` that implement their control flow using these thunks. You can indeed fully generalize control flow in such terms, without the need for futures or similar objects. For example, if you want to provide sugar for yielding to futures or other types of objects, you just write a thunk-returning function or method, e.g.:
def await_future(future):
    """yield await_future(future) resumes when the future is done"""
    def suspend(coroutine):
        @future.add_done_callback
        def resume(future):
            err = future.exception()
            if err is not None:
                schedule(coroutine, None, err)
            else:
                schedule(coroutine, future.result())
        return True
    return suspend
So `yield await_future(someFuture)` will arrange for suspension until the future is ready. Libraries or frameworks can also be written that wrap a generator with one that provides automatic translation to thunks for a variety of types or protocols. Similarly, you can write functions that take multiple awaitables, or that provide cancellation, etc. on top of thunks.
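A self-contained version of that future-bridging idea, using a concurrent.futures.Future and the same mini-loop as above (repeated here so the snippet runs standalone; `consumer` is a made-up example coroutine):

```python
from concurrent.futures import Future

coroutines = []  # round-robin of (coroutine, value, error) triples

def schedule(coroutine, val=None, err=None):
    coroutines.insert(0, (coroutine, val, err))

def runLoop():
    while coroutines:
        coroutine, val, err = coroutines.pop()
        try:
            suspend = coroutine.throw(err) if err is not None else coroutine.send(val)
        except StopIteration:
            continue
        if callable(suspend) and suspend(coroutine):
            continue
        schedule(coroutine)

def await_future(future):
    """yield await_future(future) resumes when the future is done."""
    def suspend(coroutine):
        def resume(done):
            err = done.exception()
            if err is not None:
                schedule(coroutine, None, err)   # re-raise inside coroutine
            else:
                schedule(coroutine, done.result())
        future.add_done_callback(resume)
        return True
    return suspend

results = []

def consumer(fut):
    value = yield await_future(fut)
    results.append(value)

fut = Future()
schedule(consumer(fut))
fut.set_result("ready")   # complete the future before running the loop
runLoop()
print(results)  # ['ready']
```

Because add_done_callback fires immediately on an already-completed future, the coroutine is rescheduled with the result before runLoop even pops it again, so no busy-waiting is involved.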