An async facade? (was Re: [Python-Dev] Socket timeout and completion based sockets)
[ It's tough coming up with unique subjects for these async discussions. I've dropped python-dev and cc'd python-ideas instead as the stuff below follows on from the recent msgs. ]

TL;DR version:

Provide an async interface that is implicitly asynchronous; all calls return immediately, callbacks are used to handle success/error/timeout.

    class async:
        def accept():
        def read():
        def write():
        def getaddrinfo():
        def submit_work():

How the asynchronicity (not a word, I know) is achieved is an implementation detail, and will differ for each platform.

(Windows will be able to leverage all its async APIs to full extent, Linux et al can keep mimicking asynchronicity via the usual non-blocking + multiplexing (poll/kqueue etc), thread pools, etc.)

On Wed, Nov 28, 2012 at 11:15:07AM -0800, Glyph wrote:
On Nov 28, 2012, at 12:04 PM, Guido van Rossum <guido@python.org> wrote: I would also like to bring up <https://github.com/lvh/async-pep> again.
So, I spent yesterday working on the IOCP/async stuff. Then I saw this PEP and the sample async/abstract.py. That got me thinking: why don't we have a low-level async facade/API? Something where all calls are implicitly asynchronous.

On systems with extensive support for asynchronous 'stuff', primarily Windows and AIX/Solaris to a lesser extent, we'd be able to leverage the platform-provided async facilities to full effect.

On other platforms, we'd fake it, just like we do now, with select, poll/epoll, kqueue and non-blocking sockets.

Consider the following:

    class Callback:
        __slots__ = [
            'success',
            'failure',
            'timeout',
            'cancel',
        ]

    class AsyncEngine:
        def getaddrinfo(host, port, ..., cb): ...
        def getaddrinfo_then_connect(.., callbacks=(cb1, cb2)): ...

        def accept(sock, cb): ...
        def accept_then_write(sock, buf, (cb1, cb2)): ...
        def accept_then_expect_line(sock, line, (cb1, cb2)): ...
        def accept_then_expect_multiline_regex(sock, regex, cb): ...

        def read_until(fd_or_sock, bytes, cb): ...
        def read_all(fd_or_sock, cb):
            return self.read_until(fd_or_sock, EOF, cb)
        def read_until_lineglob(fd_or_sock, cb): ...
        def read_until_regex(fd_or_sock, cb): ...
        def read_chunk(fd_or_sock, chunk_size, cb): ...

        def write(fd_or_sock, buf, cb): ...
        def write_then_expect_line(fd_or_sock, buf, (cb1, cb2)): ...

        def connect_then_expect_line(..): ...
        def connect_then_write_line(..): ...

        def submit_work(callable, cb): ...

        def run_once(..):
            """Run the event loop once."""
        def run(..):
            """Keep running the event loop until exit."""

All methods always take at least one callback. Chained methods can take multiple callbacks (e.g. accept_then_expect_line()). You fill in the success, failure (both callables) and timeout (an int) slots. The engine will populate cb.cancel with a callable that you can call at any time to (try and) cancel the IO operation. (How quickly that works depends on the underlying implementation.)
I like this approach for two reasons: a) it allows platforms with great async support to work at their full potential, and b) it doesn't leak implementation details like non-blocking sockets, fds, multiplexing (poll/kqueue/select, IOCP, etc). Those are all details that are taken care of by the underlying implementation.

getaddrinfo is a good example here. Guido, in tulip, you have this implemented as:

    def getaddrinfo(host, port, af=0, socktype=0, proto=0):
        infos = yield from scheduling.call_in_thread(
            socket.getaddrinfo,
            host, port, af, socktype, proto
        )

That's very implementation specific. It assumes the only way to perform an async getaddrinfo is by calling it from a separate thread. On Windows, there's native support for async getaddrinfo(), which we wouldn't be able to leverage here.

The biggest benefit is that no assumption is made as to how the asynchronicity is achieved. Note that I didn't mention IOCP or kqueue or epoll once. Those are all implementation details that the writer of an asynchronous Python app doesn't need to care about.

Thoughts?

    Trent.
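To make the facade idea concrete, here is a minimal sketch of what a fallback engine might look like on a platform without native async support. Everything in it beyond the slot and method names from the proposal is an assumption made for illustration: the ThreadedEngine name, the constructor on Callback, the one-thread-per-operation strategy, and the fact that the timeout slot is stored but not enforced. A platform-specific engine could keep the same interface and swap the thread for a native async call.

```python
import queue
import socket
import threading


class Callback:
    """Holder for the four slots named in the proposal."""
    __slots__ = ['success', 'failure', 'timeout', 'cancel']

    def __init__(self, success, failure, timeout=None):
        self.success = success
        self.failure = failure
        self.timeout = timeout  # stored, but not enforced in this sketch
        self.cancel = None      # populated by the engine


class ThreadedEngine:
    """Hypothetical fallback engine that fakes asynchronicity with threads."""

    def __init__(self):
        self._finished = queue.Queue()
        self._pending = 0

    def submit_work(self, func, cb):
        cancelled = threading.Event()
        cb.cancel = cancelled.set  # best effort: only suppresses the callback
        self._pending += 1

        def worker():
            try:
                outcome = (cb.success, func())
            except Exception as exc:
                outcome = (cb.failure, exc)
            self._finished.put((cancelled, outcome))

        threading.Thread(target=worker, daemon=True).start()

    def getaddrinfo(self, host, port, cb):
        # Another engine could use a native async lookup here instead.
        self.submit_work(lambda: socket.getaddrinfo(host, port), cb)

    def run_once(self):
        """Run the event loop once: deliver one finished operation."""
        cancelled, (handler, value) = self._finished.get()
        self._pending -= 1
        if not cancelled.is_set():
            handler(value)

    def run(self):
        """Keep running the event loop until nothing is pending."""
        while self._pending:
            self.run_once()


results, errors = [], []
engine = ThreadedEngine()
engine.submit_work(lambda: 6 * 7, Callback(results.append, errors.append))
engine.submit_work(lambda: 1 / 0, Callback(results.append, errors.append))
engine.run()
```

The caller never sees the threads: it only supplies callbacks and runs the loop, which is the property the proposal is after.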
Trent Nelson wrote:
TL;DR version:
Provide an async interface that is implicitly asynchronous; all calls return immediately, callbacks are used to handle success/error/timeout.
This is the central idea of what I've been advocating - the use of Future. Rather than adding an extra parameter to the initial call, asynchronous methods return an object that can have callbacks added.
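As a rough illustration of the Future style described here, the sketch below uses the standard library's concurrent.futures rather than wattle. The shout_async helper is hypothetical; the point is only that the call returns an object immediately and the callback is attached afterwards instead of being passed as an extra parameter.

```python
from concurrent.futures import ThreadPoolExecutor


def shout_async(executor, text):
    """Hypothetical async operation: returns a Future straight away."""
    return executor.submit(text.upper)


seen = []
with ThreadPoolExecutor(max_workers=1) as executor:
    future = shout_async(executor, "hello")
    # The callback is attached to the returned object, not to the call.
    future.add_done_callback(lambda f: seen.append(f.result()))
# Leaving the with block waits for the worker, so the callback has run.
```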
The biggest benefit is that no assumption is made as to how the asynchronicity is achieved. Note that I didn't mention IOCP or kqueue or epoll once. Those are all implementation details that the writer of an asynchronous Python app doesn't need to care about.
I think this is why I've been largely ignored (except by Guido) - I don't even mention sockets, let alone the implementation details :). There are all sorts of operations that can be run asynchronously that do not involve sockets, though it seems that the driving force behind most of the effort is just to make really fast web servers.

My code contribution is at http://bitbucket.org/stevedower/wattle, though I have not updated it in a while and there are certainly aspects that I would change. You may find it interesting if you haven't seen it yet.

Cheers,
Steve
Futures or callbacks, that's the question...

Richard and I have even been considering APIs like this:

    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>

where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.

The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.

The debate is still open.

--Guido

On Fri, Nov 30, 2012 at 9:57 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Trent Nelson wrote:
TL;DR version:
Provide an async interface that is implicitly asynchronous; all calls return immediately, callbacks are used to handle success/error/timeout.
This is the central idea of what I've been advocating - the use of Future. Rather than adding an extra parameter to the initial call, asynchronous methods return an object that can have callbacks added.
The biggest benefit is that no assumption is made as to how the asynchronicity is achieved. Note that I didn't mention IOCP or kqueue or epoll once. Those are all implementation details that the writer of an asynchronous Python app doesn't need to care about.
I think this is why I've been largely ignored (except by Guido) - I don't even mention sockets, let alone the implementation details :). There are all sorts of operations that can be run asynchronously that do not involve sockets, though it seems that the driving force behind most of the effort is just to make really fast web servers.
My code contribution is at http://bitbucket.org/stevedower/wattle, though I have not updated it in a while and there are certainly aspects that I would change. You may find it interesting if you haven't seen it yet.
Cheers, Steve
-- --Guido van Rossum (python.org/~guido)
Guido van Rossum wrote:
Futures or callbacks, that's the question...
I know the C++ standards committee is looking at the same thing right now, and they're probably going to provide both: futures for those who prefer them (which is basically how the code looks) and callbacks for when every cycle is critical or if the developer prefers them. C++ has the advantage that futures can often be optimized out, so implementing a Future-based wrapper around a callback-based function is very cheap, but the two-level API will probably happen.
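The two-level idea can be sketched in Python as well: a callback-based primitive at the bottom, with a thin Future-returning wrapper on top. Both callback_style_add and as_future are hypothetical names invented for this example, and the wrapper assumes the primitive takes its success and failure callbacks as the last two arguments.

```python
from concurrent.futures import Future


def callback_style_add(a, b, on_success, on_failure):
    """Hypothetical low-level primitive that reports through callbacks."""
    try:
        total = a + b
    except TypeError as exc:
        on_failure(exc)
        return
    on_success(total)


def as_future(func, *args):
    """Wrap a callback-based call so that it returns a Future instead."""
    future = Future()
    func(*args, future.set_result, future.set_exception)
    return future


ok = as_future(callback_style_add, 2, 3)
bad = as_future(callback_style_add, 1, "x")
```

Here ok.result() gives the sum, while bad.exception() holds the TypeError raised inside the primitive.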
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>
where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.
The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.
The debate is still open.
How about:

    value, future = obj.some_call(...)
    if value is None:
        value = yield future

Or:

    future = obj.some_call(...)
    if future.done():
        value = future.result()
    else:
        value = yield future

I like the second one because it doesn't require the methods to do anything special to support always yielding vs. only yielding futures that aren't ready - the caller gets to decide how performant they want to be. (I would also like to see Future['s base class] be implemented in C and possibly even preallocated to reduce overhead. 'done()' could also be an attribute rather than a method, though that would break the existing Future class.)

Cheers,
Steve
On Fri, Nov 30, 2012 at 11:18 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Guido van Rossum wrote:
Futures or callbacks, that's the question...
I know the C++ standards committee is looking at the same thing right now, and they're probably going to provide both: futures for those who prefer them (which is basically how the code looks) and callbacks for when every cycle is critical or if the developer prefers them. C++ has the advantage that futures can often be optimized out, so implementing a Future-based wrapper around a callback-based function is very cheap, but the two-level API will probably happen.
Well, for Python 3 we will definitely have two layers already: callbacks and yield-from-based-coroutines. The question is whether there's room for Futures in between (I like layers of abstraction, but I don't like having too many layers).
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>
where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.
The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.
The debate is still open.
How about:
    value, future = obj.some_call(...)
    if value is None:
        value = yield future
Also considered; I don't really like having to allocate a tuple here (which is impossible to optimize out completely, even though its allocation may use a fast free list).
Or:
    future = obj.some_call(...)
    if future.done():
        value = future.result()
    else:
        value = yield future
That seems the most expensive option of all because of the call to done() that's always there.
I like the second one because it doesn't require the methods to do anything special to support always yielding vs. only yielding futures that aren't ready - the caller gets to decide how performant they want to be. (I would also like to see Future['s base class] be implemented in C and possibly even preallocated to reduce overhead. 'done()' could also be an attribute rather than a method, though that would break the existing Future class.)
Note that in all cases the places where this idiom is *used* should be few and far between -- it should only be needed in the "glue" between the callback-based world and the coroutine-based world. You'd only be writing new calls like this if you're writing new glue, which should only be necessary if you are writing wrappers for (probably platform-specific) new primitive operations supported by the lowest level event loop.

This is why I am looking for the pattern that executes fastest rather than the pattern that is easiest to write for end users -- the latter would be to always return a Future and let the user write

    res = yield obj.some_call(<args>)

--
--Guido van Rossum (python.org/~guido)
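For readers who want to see the first idiom run end to end, here is a small sketch. It is not tulip code: BufferedLines, read_two and the run driver are hypothetical, and concurrent.futures.Future stands in for whatever Future class the glue layer would really use. readline() returns the line itself when one is buffered and a Future only when it has to wait.

```python
from concurrent.futures import Future


class BufferedLines:
    """Hypothetical buffering stream with a fast path in readline()."""

    def __init__(self):
        self._lines = []
        self._waiter = None

    def feed(self, line):
        if self._waiter is not None:
            waiter, self._waiter = self._waiter, None
            waiter.set_result(line)
        else:
            self._lines.append(line)

    def readline(self):
        if self._lines:
            return self._lines.pop(0)  # fast path: no Future allocated
        self._waiter = Future()
        return self._waiter            # slow path: caller must yield this


def read_two(stream):
    lines = []
    for _ in range(2):
        res = stream.readline()
        if isinstance(res, Future):
            res = yield res
        lines.append(res)
    return lines


def run(coro, make_progress):
    """Tiny driver; make_progress() stands in for one event-loop pass."""
    try:
        future = next(coro)
        while True:
            if not future.done():
                make_progress()
            future = coro.send(future.result())
    except StopIteration as stop:
        return stop.value


stream = BufferedLines()
stream.feed("first")  # already buffered, so the first read never yields
lines = run(read_two(stream), lambda: stream.feed("second"))
```

Only the second read reaches the driver, which is the saving the idiom is meant to buy.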
On 30.11.12 20:29, Guido van Rossum wrote:
On Fri, Nov 30, 2012 at 11:18 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Guido van Rossum wrote:
Futures or callbacks, that's the question...
I know the C++ standards committee is looking at the same thing right now, and they're probably going to provide both: futures for those who prefer them (which is basically how the code looks) and callbacks for when every cycle is critical or if the developer prefers them. C++ has the advantage that futures can often be optimized out, so implementing a Future-based wrapper around a callback-based function is very cheap, but the two-level API will probably happen.
Well, for Python 3 we will definitely have two layers already: callbacks and yield-from-based-coroutines. The question is whether there's room for Futures in between (I like layers of abstraction, but I don't like having too many layers).
So far I agree very much.
...
The debate is still open.
How about:
    value, future = obj.some_call(...)
    if value is None:
        value = yield future

Also considered; I don't really like having to allocate a tuple here (which is impossible to optimize out completely, even though its allocation may use a fast free list).
A little remark: I do respect personal taste very much, and if a tuple can be avoided I'm in for sure. But the argument of the cost of a tuple creation is something that even I no longer consider relevant, especially in a context of other constructs like yield-from which are (currently) not even efficient ( O(n)-wise ).

The discussion should better stay design oriented and not consider little overhead by a constant factor. But I agree that returned tuples are not a nice pattern to be used all the time.

cheers - chris

--
Christian Tismer :^) <mailto:tismer@stackless.com>
On Fri, 30 Nov 2012 11:04:09 -0800 Guido van Rossum <guido@python.org> wrote:
Futures or callbacks, that's the question...
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>
where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.
The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.
This premature optimization looks really ugly to me. I'm strongly -1 on both idioms.

Regards

Antoine.
On Fri, Nov 30, 2012 at 11:27 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Fri, 30 Nov 2012 11:04:09 -0800 Guido van Rossum <guido@python.org> wrote:
Futures or callbacks, that's the question...
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>
where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.
The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.
This premature optimization looks really ugly to me. I'm strongly -1 on both idioms.
Read my explanation in my response to Steve. -- --Guido van Rossum (python.org/~guido)
On Fri, Nov 30, 2012 at 2:37 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Guido van Rossum wrote:
Futures or callbacks, that's the question...
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res
I thought you had decided against the idea of yielding futures?
As a user-facing API style, yes. But this is meant for an internal API -- the equivalent of your bare 'yield'. If you want to, I can consider another style as well

    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res.<magic_call>()
        yield

But I don't see a fundamental advantage to this.

--
--Guido van Rossum (python.org/~guido)
Guido van Rossum wrote:
Greg Ewing wrote:
Guido van Rossum wrote:
Futures or callbacks, that's the question...
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res
I thought you had decided against the idea of yielding futures?
As a user-facing API style, yes. But this is meant for an internal API -- the equivalent of your bare 'yield'. If you want to, I can consider another style as well
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res.<magic_call>()
        yield
But I don't see a fundamental advantage to this.
I do, it completely avoids ever using yield from to pass values around when used for coroutines.

If values are always yielded or never yielded then it is easy (or easier) to detect errors such as:

    def func():
        data = yield from get_data_async()
        for x in data:
            yield x

When values are sometimes yielded and sometimes not, it's much harder to reliably throw an error when a value was yielded. Always using bare yields lets the code calling __next__() (I forget whether we're calling this "scheduler"...) raise an error if the value is not None.

Cheers,
Steve
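A minimal sketch of that check, assuming a convention where coroutines only ever use bare yields. The run function and the two sample generators are hypothetical; a real scheduler would also do the actual waiting between steps.

```python
def run(coro):
    """Drive a coroutine and reject any value that is not a bare yield."""
    steps = 0
    for value in coro:
        if value is not None:
            raise TypeError("coroutine yielded a value: %r" % (value,))
        steps += 1
    return steps


def well_behaved():
    yield  # bare yield: "suspend me", nothing is passed out
    yield


def buggy():
    # Written as if it were an ordinary generator producing data.
    for x in [1, 2, 3]:
        yield x


steps = run(well_behaved())
try:
    run(buggy())
except TypeError as exc:
    error = exc
else:
    error = None
```

Because the convention is all-or-nothing, the very first stray value is enough to flag the mistake.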
On Fri, Nov 30, 2012 at 3:32 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Guido van Rossum wrote:
Greg Ewing wrote:
Guido van Rossum wrote:
Futures or callbacks, that's the question...
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res
I thought you had decided against the idea of yielding futures?
As a user-facing API style, yes. But this is meant for an internal API -- the equivalent of your bare 'yield'. If you want to, I can consider another style as well
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res.<magic_call>()
        yield
But I don't see a fundamental advantage to this.
I do, it completely avoids ever using yield from to pass values around when used for coroutines.
If values are always yielded or never yielded then it is easy (or easier) to detect errors such as:
    def func():
        data = yield from get_data_async()
        for x in data:
            yield x
When values are sometimes yielded and sometimes not, it's much harder to reliably throw an error when a value was yielded. Always using bare yields lets the code calling __next__() (I forget whether we're calling this "scheduler"...) raise an error if the value is not None.
Good point. I'll keep this in mind. -- --Guido van Rossum (python.org/~guido)
On Nov 30, 2012, at 8:04 PM, Guido van Rossum <guido@python.org> wrote:
Futures or callbacks, that's the question…
I would strongly recommend Futures, most importantly because it seems to handle Threads more elegantly, since it is easier to move between Threads.
Richard and I have even been considering APIs like this:
    res = obj.some_call(<args>)
    if isinstance(res, Future):
        res = yield res

or

    res = obj.some_call(<args>)
    if res is None:
        res = yield <magic>
where <magic> is some call on the scheduler/eventloop/proactor that pulls the future out of a hat.
The idea of the first version is simply to avoid the Future when the result happens to be immediately ready (e.g. when calling readline() on some buffering stream, most of the time the next line is already in the buffer); the point of the second version is that "res is None" is way faster than "isinstance(res, Future)" -- however the magic is a little awkward.
The debate is still open.
Great :-)

I understand that there are several layers involved (1) old style function call, 2) yield/coroutines and 3) threads) but I believe a model that handles all levels alike would be preferable.

As a third API, consider:

    res = obj.some_call(<args>)
    self.other_call()
    print(res)

some_call() is *always* async and res is *always* a Future:

1) if executed in the same thread, it can be optimised out and be a normal function call
2) if a coroutine, it's a perfect time for t.switch()
3) if threads, other_call() continues and res blocks if not ready

Or maybe the notion of all objects running in separate coroutines/threads, all methods being async and all return values being Futures is something for Python 4? :-) (or PyLang, an Erlang lookalike)

br
/rene
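The thread case (3) can be approximated today with the standard library, with the difference that blocking is explicit through result() rather than implicit on first use. This is only a sketch: some_call and the log list are stand-ins invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def some_call():
    """Stand-in for a slow operation."""
    time.sleep(0.05)
    return "result"


log = []
with ThreadPoolExecutor(max_workers=1) as pool:
    res = pool.submit(some_call)   # always returns a Future immediately
    log.append("other_call ran")   # stands in for self.other_call()
    log.append(res.result())       # blocks only if the result is not ready
```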
--Guido
On Fri, Nov 30, 2012 at 9:57 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Trent Nelson wrote:
TL;DR version:
Provide an async interface that is implicitly asynchronous; all calls return immediately, callbacks are used to handle success/error/timeout.
This is the central idea of what I've been advocating - the use of Future. Rather than adding an extra parameter to the initial call, asynchronous methods return an object that can have callbacks added.
The biggest benefit is that no assumption is made as to how the asynchronicity is achieved. Note that I didn't mention IOCP or kqueue or epoll once. Those are all implementation details that the writer of an asynchronous Python app doesn't need to care about.
I think this is why I've been largely ignored (except by Guido) - I don't even mention sockets, let alone the implementation details :). There are all sorts of operations that can be run asynchronously that do not involve sockets, though it seems that the driving force behind most of the effort is just to make really fast web servers.
My code contribution is at http://bitbucket.org/stevedower/wattle, though I have not updated it in a while and there are certainly aspects that I would change. You may find it interesting if you haven't seen it yet.
Cheers, Steve
-----Original Message----- From: Python-ideas [mailto:python-ideas-bounces+steve.dower=microsoft.com@python.org] On Behalf Of Trent Nelson Sent: Friday, November 30, 2012 0814 To: Guido van Rossum Cc: Glyph; python-ideas@python.org Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket timeout and completion based sockets)
[ It's tough coming up with unique subjects for these async discussions. I've dropped python-dev and cc'd python-ideas instead as the stuff below follows on from the recent msgs. ]
TL;DR version:
Provide an async interface that is implicitly asynchronous; all calls return immediately, callbacks are used to handle success/error/timeout.
    class async:
        def accept():
        def read():
        def write():
        def getaddrinfo():
        def submit_work():
How the asynchronicity (not a word, I know) is achieved is an implementation detail, and will differ for each platform.
(Windows will be able to leverage all its async APIs to full extent, Linux et al can keep mimicking asynchronicity via the usual non-blocking + multiplexing (poll/kqueue etc), thread pools, etc.)
On Wed, Nov 28, 2012 at 11:15:07AM -0800, Glyph wrote:
On Nov 28, 2012, at 12:04 PM, Guido van Rossum <guido@python.org> wrote: I would also like to bring up <https://github.com/lvh/async-pep> again.
So, I spent yesterday working on the IOCP/async stuff. Then saw this PEP and the sample async/abstract.py. That got me thinking: why don't we have a low-level async facade/API? Something where all calls are implicitly asynchronous.
On systems with extensive support for asynchronous 'stuff', primarily Windows and AIX/Solaris to a lesser extent, we'd be able to leverage the platform-provided async facilities to full effect.
On other platforms, we'd fake it, just like we do now, with select, poll/epoll, kqueue and non-blocking sockets.
Consider the following:
    class Callback:
        __slots__ = [
            'success',
            'failure',
            'timeout',
            'cancel',
        ]

    class AsyncEngine:
        def getaddrinfo(host, port, ..., cb):
            ...

        def getaddrinfo_then_connect(.., callbacks=(cb1, cb2))
            ...

        def accept(sock, cb):
            ...

        def accept_then_write(sock, buf, (cb1, cb2)):
            ...

        def accept_then_expect_line(sock, line, (cb1, cb2)):
            ...

        def accept_then_expect_multiline_regex(sock, regex, cb):
            ...

        def read_until(fd_or_sock, bytes, cb):
            ...

        def read_all(fd_or_sock, cb):
            return self.read_until(fd_or_sock, EOF, cb)

        def read_until_lineglob(fd_or_sock, cb):
            ...

        def read_until_regex(fd_or_sock, cb):
            ...

        def read_chunk(fd_or_sock, chunk_size, cb):
            ...

        def write(fd_or_sock, buf, cb):
            ...

        def write_then_expect_line(fd_or_sock, buf, (cb1, cb2)):
            ...

        def connect_then_expect_line(..):
            ...

        def connect_then_write_line(..):
            ...

        def submit_work(callable, cb):
            ...

        def run_once(..):
            """Run the event loop once."""

        def run(..):
            """Keep running the event loop until exit."""
All methods always take at least one callback. Chained methods can take multiple callbacks (e.g. accept_then_expect_line()). You fill in the success and failure slots (both callables) and the timeout slot (an int). The engine will populate cb.cancel with a callable that you can call at any time to (try and) cancel the IO operation. (How quickly that works depends on the underlying implementation.)
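The callback contract described above could be sketched roughly as follows. This is a toy under stated assumptions: ToyEngine is a hypothetical stand-in that only implements submit_work() and run_once(), it runs work synchronously inside run_once(), and it ignores the timeout slot. It is meant to show how the engine fills in cb.cancel, not how a real engine would be built.

```python
class Callback:
    __slots__ = ['success', 'failure', 'timeout', 'cancel']

    def __init__(self, success, failure, timeout=None):
        self.success = success
        self.failure = failure
        self.timeout = timeout  # ignored by this toy engine
        self.cancel = None      # populated by the engine


class ToyEngine:
    """Hypothetical stand-in for AsyncEngine: queues work, runs it in run_once()."""

    def __init__(self):
        self._pending = []

    def submit_work(self, func, cb):
        entry = [func, cb, False]  # [callable, callback, cancelled flag]

        def cancel():
            entry[2] = True

        cb.cancel = cancel          # the engine fills in the cancel slot
        self._pending.append(entry)  # returns immediately

    def run_once(self):
        pending, self._pending = self._pending, []
        for func, cb, cancelled in pending:
            if cancelled:
                continue
            try:
                result = func()
            except Exception as exc:
                cb.failure(exc)
            else:
                cb.success(result)


log = []
engine = ToyEngine()
ok = Callback(success=log.append, failure=log.append)
skipped = Callback(success=log.append, failure=log.append)

engine.submit_work(lambda: 6 * 7, ok)
engine.submit_work(lambda: "never runs", skipped)
skipped.cancel()   # cancel before the event loop gets to it
engine.run_once()
```

After run_once(), only the first operation has reported a result; the cancelled one never reaches its success callback.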
I like this approach for two reasons: a) it allows platforms with great async support to work at their full potential, and b) it doesn't leak implementation details like non-blocking sockets, fds, multiplexing (poll/kqueue/select, IOCP, etc). Those are all details that are taken care of by the underlying implementation.
getaddrinfo is a good example here. Guido, in tulip, you have this implemented as:
    def getaddrinfo(host, port, af=0, socktype=0, proto=0):
        infos = yield from scheduling.call_in_thread(
            socket.getaddrinfo,
            host, port, af, socktype, proto
        )
That's very implementation specific. It assumes the only way to perform an async getaddrinfo is by calling it from a separate thread. On Windows, there's native support for async getaddrinfo(), which we wouldn't be able to leverage here.
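One way to keep that choice out of the caller's sight is to hide it behind a small factory, sketched below. The names ThreadedResolver, make_resolver and native_factory are hypothetical, and no native Windows binding is assumed to exist; only the thread-based fallback is implemented. The sketch also returns a Future rather than taking a callback, purely to keep it short.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


class ThreadedResolver:
    """Fallback backend: runs the blocking socket.getaddrinfo in a worker thread."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def getaddrinfo(self, host, port):
        # Returns a Future immediately; the lookup happens in the pool.
        return self._pool.submit(socket.getaddrinfo, host, port)

    def close(self):
        self._pool.shutdown()


def make_resolver(native_factory=None):
    # A platform with a native async getaddrinfo could pass its own factory
    # here (hypothetical; none is provided in this sketch). Callers only see
    # an object with getaddrinfo(host, port) returning a Future.
    if native_factory is not None:
        return native_factory()
    return ThreadedResolver()


resolver = make_resolver()
fut = resolver.getaddrinfo("127.0.0.1", 80)  # numeric host, so no DNS lookup
infos = fut.result(timeout=5)
resolver.close()
```

Code written against make_resolver() would not change if a platform later supplied a native backend.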
The biggest benefit is that no assumption is made as to how the asynchronicity is achieved. Note that I didn't mention IOCP or kqueue or epoll once. Those are all implementation details that the writer of an asynchronous Python app doesn't need to care about.
Thoughts?
Trent.
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas
--
--Guido van Rossum (python.org/~guido)
participants (7)
- Antoine Pitrou
- Christian Tismer
- Greg Ewing
- Guido van Rossum
- Rene Nejsum
- Steve Dower
- Trent Nelson