[Python-ideas] An async facade? (was Re: [Python-Dev] Socket timeout and completion based sockets)

Trent Nelson trent at snakebite.org
Fri Nov 30 17:14:22 CET 2012


    [ It's tough coming up with unique subjects for these async
      discussions.  I've dropped python-dev and cc'd python-ideas
      instead as the stuff below follows on from the recent msgs. ]

    TL;DR version:

        Provide an async interface that is implicitly asynchronous;
        all calls return immediately, callbacks are used to handle
        success/error/timeout.

            class async:
                def accept(): ...
                def read(): ...
                def write(): ...
                def getaddrinfo(): ...
                def submit_work(): ...

        How the asynchronicity (not a word, I know) is achieved is
        an implementation detail, and will differ for each platform.

        (Windows will be able to leverage all its async APIs to their
         full extent; Linux et al can keep mimicking asynchronicity via
         the usual non-blocking + multiplexing (poll/kqueue etc),
         thread pools, etc.)


On Wed, Nov 28, 2012 at 11:15:07AM -0800, Glyph wrote:
>    On Nov 28, 2012, at 12:04 PM, Guido van Rossum <guido at python.org> wrote:
>    I would also like to bring up <https://github.com/lvh/async-pep> again.

    So, I spent yesterday working on the IOCP/async stuff.  Then I saw this
    PEP and the sample async/abstract.py.  That got me thinking: why don't
    we have a low-level async facade/API?  Something where all calls are
    implicitly asynchronous.

    On systems with extensive support for asynchronous 'stuff' (primarily
    Windows, and AIX/Solaris to a lesser extent), we'd be able to leverage
    the platform-provided async facilities to full effect.

    On other platforms, we'd fake it, just like we do now, with select,
    poll/epoll, kqueue and non-blocking sockets.
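
    As a rough illustration of the "fake it" path, here is a minimal
    sketch of a read_all() built on a non-blocking socket plus whatever
    multiplexer the stdlib selectors module picks.  SelectorEngine,
    pending() and demo() are names made up for this example, and the
    single on_success callable stands in for a full callback object:

```python
import selectors
import socket


class SelectorEngine:
    """Hypothetical fallback engine: non-blocking sockets + a multiplexer."""

    def __init__(self):
        # DefaultSelector picks epoll, kqueue, poll or select as available.
        self._selector = selectors.DefaultSelector()

    def read_all(self, sock, on_success):
        """Return immediately; call on_success(data) once EOF is reached."""
        sock.setblocking(False)
        chunks = []

        def on_readable():
            try:
                data = sock.recv(4096)
            except BlockingIOError:
                return  # spurious wakeup, wait for the next event
            if data:
                chunks.append(data)
            else:
                # An empty read means the peer closed its end: EOF.
                self._selector.unregister(sock)
                on_success(b"".join(chunks))

        self._selector.register(sock, selectors.EVENT_READ, on_readable)

    def pending(self):
        """Number of reads still registered with the selector."""
        return len(self._selector.get_map())

    def run_once(self, timeout=None):
        """Dispatch the handlers of every socket that is ready."""
        for key, _events in self._selector.select(timeout):
            key.data()


def demo():
    engine = SelectorEngine()
    left, right = socket.socketpair()
    results = []
    engine.read_all(right, results.append)  # returns immediately
    left.sendall(b"hello ")
    left.sendall(b"world")
    left.close()
    while engine.pending():
        engine.run_once(timeout=1.0)
    right.close()
    return results[0]
```

    The caller only ever sees read_all() and a callback; which
    multiplexer sits underneath is the engine's business.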

    Consider the following:

        class Callback:
            __slots__ = [
                'success',
                'failure',
                'timeout',
                'cancel',
            ]

        class AsyncEngine:
            def getaddrinfo(self, host, port, ..., cb):
                ...

            def getaddrinfo_then_connect(self, ..., callbacks=(cb1, cb2)):
                ...

            def accept(self, sock, cb):
                ...

            def accept_then_write(self, sock, buf, callbacks=(cb1, cb2)):
                ...

            def accept_then_expect_line(self, sock, line, callbacks=(cb1, cb2)):
                ...

            def accept_then_expect_multiline_regex(self, sock, regex, cb):
                ...

            def read_until(self, fd_or_sock, bytes, cb):
                ...

            def read_all(self, fd_or_sock, cb):
                return self.read_until(fd_or_sock, EOF, cb)

            def read_until_lineglob(self, fd_or_sock, cb):
                ...

            def read_until_regex(self, fd_or_sock, cb):
                ...

            def read_chunk(self, fd_or_sock, chunk_size, cb):
                ...

            def write(self, fd_or_sock, buf, cb):
                ...

            def write_then_expect_line(self, fd_or_sock, buf, callbacks=(cb1, cb2)):
                ...

            def connect_then_expect_line(self, ...):
                ...

            def connect_then_write_line(self, ...):
                ...

            def submit_work(self, callable, cb):
                ...

            def run_once(self, ...):
                """Run the event loop once."""

            def run(self, ...):
                """Keep running the event loop until exit."""

    All methods always take at least one callback.  Chained methods can
    take multiple callbacks (e.g. accept_then_expect_line()).  You fill
    in the success, failure (both callables) and timeout (an int) slots.
    The engine will populate cb.cancel with a callable that you can call
    at any time to (try and) cancel the IO operation.  (How quickly that
    works depends on the underlying implementation.)
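
    To make the callback contract concrete, here is a minimal sketch of
    how an engine might honour it for submit_work().  ToyEngine and its
    plain in-memory queue are assumptions made for illustration only; a
    real engine would hand the work to a thread pool or to the OS, and
    would enforce the timeout, which this sketch ignores:

```python
import collections


class Callback:
    __slots__ = ["success", "failure", "timeout", "cancel"]

    def __init__(self, success, failure, timeout=None):
        self.success = success
        self.failure = failure
        self.timeout = timeout  # stored, but not enforced in this sketch
        self.cancel = None      # populated by the engine on submission


class ToyEngine:
    """Hypothetical engine that runs submitted work from a plain queue."""

    def __init__(self):
        self._queue = collections.deque()

    def submit_work(self, func, cb):
        entry = [func, cb]

        def cancel():
            # Best effort: only work that has not started can be cancelled.
            try:
                self._queue.remove(entry)
            except ValueError:
                return False
            return True

        cb.cancel = cancel
        self._queue.append(entry)

    def run_once(self):
        """Run one queued item and route the outcome to its callback."""
        if not self._queue:
            return False
        func, cb = self._queue.popleft()
        try:
            result = func()
        except Exception as exc:
            cb.failure(exc)
        else:
            cb.success(result)
        return True

    def run(self):
        """Keep running until the queue is empty."""
        while self.run_once():
            pass
```

    The point is only that the caller fills in success/failure/timeout
    and gets cb.cancel back, without learning how the work is run.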

    I like this approach for two reasons: a) it allows platforms with
    great async support to work at their full potential, and b) it
    doesn't leak implementation details like non-blocking sockets, fds,
    multiplexing (poll/kqueue/select, IOCP, etc).  Those are all details
    that are taken care of by the underlying implementation.

    getaddrinfo is a good example here.  Guido, in tulip, you have this
    implemented as:

        def getaddrinfo(host, port, af=0, socktype=0, proto=0):
            infos = yield from scheduling.call_in_thread(
                socket.getaddrinfo,
                host, port, af,
                socktype, proto
            )

    That's very implementation specific.  It assumes the only way to
    perform an async getaddrinfo is by calling it from a separate
    thread.  On Windows, there's native support for async getaddrinfo(),
    which we wouldn't be able to leverage here.
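
    For comparison, here is a minimal sketch of the same operation
    behind a callback-style facade.  ThreadedResolver is a hypothetical
    fallback backend that still uses a worker thread, but the thread is
    hidden behind the method, so a backend built on a native async
    resolver could presumably offer the same signature.  Passing
    on_success/on_failure as two plain callables is a simplification of
    the callback object described earlier:

```python
import socket
from concurrent.futures import ThreadPoolExecutor


class ThreadedResolver:
    """Hypothetical fallback: blocking getaddrinfo run in a worker thread."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=4)

    def getaddrinfo(self, host, port, on_success, on_failure, **kwargs):
        """Return immediately; exactly one of the callbacks fires later.

        Extra keyword arguments (family, type, proto, flags) are passed
        straight through to socket.getaddrinfo().
        """
        future = self._pool.submit(socket.getaddrinfo, host, port, **kwargs)

        def finished(fut):
            # Called once the lookup is done, normally in the worker thread.
            exc = fut.exception()
            if exc is not None:
                on_failure(exc)
            else:
                on_success(fut.result())

        future.add_done_callback(finished)

    def close(self):
        self._pool.shutdown(wait=True)
```

    Note that the caller never touches the thread pool or the future;
    swapping the backend would not change the calling code.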

    The biggest benefit is that no assumption is made as to how the
    asynchronicity is achieved.  Note that the interface above doesn't
    mention IOCP, kqueue or epoll once.  Those are all implementation
    details that the writer of an asynchronous Python app doesn't need
    to care about.

    Thoughts?

        Trent.


