[docs] [issue30145] Create a How to or Tutorial documentation for asyncio

Fri May 26 13:04:43 EDT 2017

Yury Selivanov added the comment:

A while ago, Eric Appelt stepped forward to help with the asyncio 
documentation.  Here's my response to his email in which we discuss
the direction for the new docs.

On May 24, 2017 at 1:04:56 PM, Eric Appelt (eric.appelt at gmail.com) wrote:
> Hi Yury,
> 
> I (briefly) went through the curio and trio tutorials and wanted to
> summarize a few thoughts. I'm going to go through them again slowly and
> actually do the exercises, but a few things that I found notable:
> 
> Both Curio and Trio start with an emphasis on tasks, spawning tasks at
> various points, and then later joining them. Trio even has a "nursery"
> async context manager where you spawn tasks inside and then they are all
> joined on exit. Following various exchanges, I understand that the
> coroutine is to be the preferred abstraction layer for use by the
> application developer, i.e. one would generally be using
> asyncio.gather(...) rather than create_task as the routine way of invoking
> concurrency. If that's correct, I would expect an asyncio tutorial to be a
> bit more focused on using gather, but I expect that manipulating tasks is
> of enough interest that it deserves some discussion.

What I like about the Trio documentation is that it tries to explain
async/await first, gradually paving the way to more advanced concepts
and the library API.

I’m going to propose a concrete structure for the revamped documentation.
It’s not set in stone; let’s discuss it and restructure it if
needed:

1. Tutorial
 a/ Why async IO?
 b/ async/await in Python
 c/ Simple example in asyncio (Tasks + sleep) + explanation
 d/ An example of using aiohttp
 e/ How asyncio works and what is the event loop
 f/ Tasks, asyncio.gather, wait_for, cancellation

2. Advanced Tutorial
 a/ A short primer on network IO
 b/ Let’s implement a memcache driver using streams!
 c/ Let’s implement a memcache driver with transports!

3. API reference — there are a few things we’ll need to restructure
there.

In more detail:

1a - we need to answer the question of why you would even bother
learning asyncio and async/await.  What are the benefits?  When do
you want to use asyncio, and when should you really avoid it?

1b - we need to explain async/await in a friendly way.  In particular,
you don’t need to understand the details of the async/await protocol,
etc.  What you need to know is that there are two types of functions
in Python: regular functions and coroutines.  Coroutines can be
paused and resumed at `await` expressions and in `async with` and
`async for` statements.  Fundamentally, this is all you need to know
about async/await in Python.  You simply use `await` for coroutines,
that’s it.
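
A minimal sketch of what 1b could show, assuming Python 3.7+ for
`asyncio.run` (newer than this message).  The function names are
made up for illustration:

```python
import asyncio


def regular_add(a, b):
    # A regular function runs to completion as soon as it is called.
    return a + b


async def slow_add(a, b):
    # A coroutine function: calling it only creates a coroutine object.
    # The `await` below is a point where it can be paused and resumed.
    await asyncio.sleep(0.01)
    return a + b


async def main():
    total = regular_add(1, 2)       # plain call
    total += await slow_add(3, 4)   # coroutines are awaited
    return total


result = asyncio.run(main())
print(result)
```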

1c - A simple example similar to
https://curio.readthedocs.io/en/latest/tutorial.html#getting-started

1d - Here I propose to just use aiohttp to implement a simple
web client or hello-world-server.  I think it can be a valuable
early tutorial point to show people a real example, something they
will actually need and use asyncio for.  It’s not a problem to
use some non-standard library in the tutorial.

1e - Here we’ll need to discuss some fundamental asyncio building
blocks: event loop, callbacks, Future, and Task.  Briefly, the point
is to get the user familiar with the terminology and give them a
high-level understanding of how things work.
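
As a rough illustration of how these building blocks relate, here is a
small sketch (again assuming Python 3.7+; `on_timer` and `waiter` are
hypothetical names).  A callback scheduled on the loop resolves a
Future, and a Task drives the coroutine that awaits it:

```python
import asyncio

events = []


def on_timer(future):
    # A plain callback run by the event loop; it resolves the Future.
    events.append("callback ran")
    future.set_result("done")


async def waiter(future):
    # Awaiting a Future suspends this coroutine until a result is set.
    value = await future
    events.append("waiter got " + value)
    return value


async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # A Task wraps a coroutine so that the loop can drive it.
    task = loop.create_task(waiter(future))
    # Ask the loop to call on_timer(future) a little later.
    loop.call_later(0.01, on_timer, future)
    return await task


result = asyncio.run(main())
print(events)
```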

1f - All high-level asyncio primitives that everyone needs to know:
Tasks, gather, wait_for, etc.  No more than 5 API functions should
be covered here, culminating in a simple example.
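
The kind of culminating example I have in mind could look roughly like
this.  It is only a sketch: `fetch` is a hypothetical stand-in for real
IO, and `asyncio.run` / `asyncio.create_task` assume Python 3.7+:

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical helper standing in for a real network call.
    await asyncio.sleep(delay)
    return name


async def main():
    # gather: run coroutines concurrently; results keep argument order.
    results = await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

    # wait_for: cancel the awaited coroutine if it exceeds the timeout.
    try:
        await asyncio.wait_for(fetch("slow", 10), timeout=0.01)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True

    # Task plus explicit cancellation.
    task = asyncio.create_task(fetch("never", 10))
    await asyncio.sleep(0)  # give the task a chance to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return results, timed_out, task.cancelled()


results, timed_out, cancelled = asyncio.run(main())
print(results, timed_out, cancelled)
```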

2a - Now we are entering the “advanced” part of the tutorial.  Let’s
explain sockets and selectors and how we tie them to callbacks using
the loop.
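
To give an idea of the level of detail for 2a, here is a toy sketch of
tying a socket to a callback through a selector, using only the
standard `selectors` and `socket` modules.  It is not how asyncio is
implemented, just the same idea in miniature:

```python
import selectors
import socket

received = []

sel = selectors.DefaultSelector()
reader, writer = socket.socketpair()
reader.setblocking(False)


def on_readable(sock):
    # Callback invoked once the selector reports the socket as readable.
    received.append(sock.recv(1024))
    sel.unregister(sock)


# Associate the callback with the socket, much as an event loop would.
sel.register(reader, selectors.EVENT_READ, on_readable)
writer.sendall(b"ping")

# One iteration of a toy "event loop": wait for events, run callbacks.
for key, _mask in sel.select(timeout=1):
    key.data(key.fileobj)

sel.close()
reader.close()
writer.close()
print(received)
```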

2b - I propose to take a simple protocol like Memcache or 
Redis and simply implement it using the streams API.  We’ll only
need two methods: set and get; and in the end we’ll teach the user
how things really work and how to design async APIs.
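
A very rough sketch of where 2b could end up, assuming the memcached
text protocol and Python 3.7+.  `MemcacheClient` and `fake_server` are
hypothetical names; the fake server is an in-process stand-in that
exists only so the example runs without a real memcached, and there is
no error handling:

```python
import asyncio

STORE = {}  # backing dict for the fake server


class MemcacheClient:
    """Toy client: only `set` and `get`."""

    def __init__(self, reader, writer):
        self._reader = reader
        self._writer = writer

    async def set(self, key, value):
        header = b"set %b 0 0 %d\r\n" % (key, len(value))
        self._writer.write(header + value + b"\r\n")
        await self._writer.drain()
        return await self._reader.readline() == b"STORED\r\n"

    async def get(self, key):
        self._writer.write(b"get %b\r\n" % key)
        await self._writer.drain()
        header = await self._reader.readline()
        if header == b"END\r\n":
            return None  # key not found
        size = int(header.split()[3])
        data = await self._reader.readexactly(size + 2)  # value + \r\n
        await self._reader.readline()  # trailing END line
        return data[:-2]


async def fake_server(reader, writer):
    # Understands just enough of "set" and "get" for this demo.
    while True:
        line = await reader.readline()
        if not line:
            break
        parts = line.split()
        if parts[0] == b"set":
            data = await reader.readexactly(int(parts[4]) + 2)
            STORE[parts[1]] = data[:-2]
            writer.write(b"STORED\r\n")
        elif parts[0] == b"get":
            value = STORE.get(parts[1])
            if value is not None:
                writer.write(
                    b"VALUE %b 0 %d\r\n%b\r\n" % (parts[1], len(value), value)
                )
            writer.write(b"END\r\n")
        await writer.drain()
    writer.close()


async def main():
    server = await asyncio.start_server(fake_server, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    client = MemcacheClient(reader, writer)
    stored = await client.set(b"greeting", b"hello")
    value = await client.get(b"greeting")
    missing = await client.get(b"nope")
    writer.close()
    await writer.wait_closed()
    server.close()
    return stored, value, missing


stored, value, missing = asyncio.run(main())
print(stored, value, missing)
```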

2c - Same as 2b but with Transports and Protocols.  Explain how
to decouple the protocol parser from the IO and the benefits of that.
Show that the end result is faster than 2b, and not that much
more complex.
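
The decoupling could be sketched like this.  `LineParser` and
`ClientProtocol` are hypothetical names, and the parser only splits
lines rather than parsing a real protocol; the point is that it never
touches a socket, so it can be tested without a loop:

```python
import asyncio


class LineParser:
    """Pure parser: bytes in, complete lines out.  No IO here."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        self._buffer += data
        # Everything before the last separator is a complete line;
        # the remainder stays buffered until more data arrives.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return lines


class ClientProtocol(asyncio.Protocol):
    """IO side: forwards received bytes to the parser."""

    def __init__(self, on_line):
        self._parser = LineParser()
        self._on_line = on_line

    def data_received(self, data):
        for line in self._parser.feed(data):
            self._on_line(line)


# The parser can be exercised without a loop or a socket.
parser = LineParser()
first = parser.feed(b"STORED\r\nEN")
second = parser.feed(b"D\r\n")

# The protocol can be driven by hand as well; normally a transport
# would call data_received().
seen = []
protocol = ClientProtocol(seen.append)
protocol.data_received(b"STORED\r\nEND\r\n")
print(first, second, seen)
```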

3 - We’ll need to restructure the current API reference a bit.  Things
that we cover in the tutorial won’t need to be explained
in detail again.  The idea is to get an easy-to-digest, concise
API reference.

Some things are currently documented in several different places.  For
example, I believe we have three different pages covering event loops
and policies.  Some things don’t need to be fully covered at all,
like `AbstractEventLoop`.

Anyway, I think that ‘3’ is the easy part.  The hard part is the
tutorial because it’s completely new.

> 
> Curio has a nice discussion of tasks that consume lots of CPU time
> crunching numbers, and the cooperative nature of coroutine multitasking,
> which I think is helpful.

Yes, I agree.  Maybe this should go to 1a.
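
To make the cooperative point concrete, something along these lines
might work (a sketch with made-up coroutine names, assuming Python
3.7+).  A coroutine that never awaits runs to completion before
anything else gets a turn, and `run_in_executor` is one common way to
move blocking work off the loop:

```python
import asyncio

order = []


async def hog():
    # No await inside the loop: nothing else can run until it finishes.
    for i in range(3):
        order.append("hog %d" % i)


async def polite(name):
    for i in range(3):
        order.append("%s %d" % (name, i))
        await asyncio.sleep(0)  # yield control back to the event loop


async def main():
    await asyncio.gather(hog(), polite("a"), polite("b"))
    # Blocking work can be handed to a thread pool instead.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, sum, range(1000))


total = asyncio.run(main())
print(order)
```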

> 
> Trio starts with a nice discussion of “regular” and “async” functions, good
> explanation of how coroutine objects get created and run.

We need this too: 1b.

> 
> Both Curio and Trio have a good discussion of their respective monitoring
> tools allowing the beginner to watch what is happening without peppering
> the code with too many prints and sleeps.

We don’t have this yet.  I think I saw a package on GitHub
(in aio-libs, IIRC) that implements this.  If it works, we can use it in
the tutorial.

> 
> They both start with coroutines that sleep and print. Curio has a more
> entertaining example, and they both have the traditional socket echo
> server. On this point it seems somewhat unfortunate - IMO - from the
> perspective of an application developer print/sleep and socket examples are
> not the most compelling examples. The print/sleep examples aren't doing
> anything useful, and sockets are more low level than the typical
> application developer has to worry about. In my own professional
> experience, I have rarely had to use the python socket library - really
> just once to talk to HAProxy. What would be more interesting would be web
> client/servers, but this would rely on a third party library.
> 
> Curio has examples of a publish/subscribe system, and ends in a chat
> server. I really like this tutorial example as it highlights one of the
> most obvious situations where coroutines are useful - when you have a
> zillion connected clients doing nothing almost all of the time.

Yeah, this sounds entertaining.  Let’s have it too: 1d?

> 
> Somewhat related I gave a talk at PyTennessee on async/await/asyncio for
> novices and I submitted a tutorial to PyOhio that I might be able to draw
> from in this project. The one difference is that I'm able to rely on third
> party libraries (requests, aiohttp) and have a little more fun with real IO
> without having to get into sockets which I think is less familiar for most
> novices than simple http. Much like the curio example, I was playing with
> simple publish/subscribe examples possibly to use with the tutorial:
> https://gist.github.com/appeltel/fd3ddeeed6c330c7208502462639d2c9

This might be a bit too advanced for 1a-1f; we can have something like
this in the advanced tutorial.

> 
> I feel like now would be a good time to get started - is there already an
> outline of a new asyncio tutorial that I can help with, or is it worth
> taking some of these ideas from Curio and Trio and drafting out a prototype
> outline? Please let me know what you think.
> 

I’ll ask Brett if we can have a cpython-aiodocs fork under the CPython
organization.  This way we’ll be able to grant push privileges easily
to those who are interested in working on this, and we’ll also have
Issues/PRs for workflow.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30145>
_______________________________________