Benefits of asyncio
Frank Millman
frank at chagford.com
Tue Jun 3 07:09:29 EDT 2014
"Chris Angelico" <rosuav at gmail.com> wrote in message
news:CAPTjJmqWkEStvrsrg30qjO+4TtLqfK9Q4GaByGovEw8NsdXzPg at mail.gmail.com...
>
> This works as long as your database is reasonably fast and close
> (a common case for a lot of web servers: the DB runs on the same
> machine as the web and application servers). It's nice and simple,
> lets you use a
> single database connection (although you should probably wrap it in a
> try/finally to ensure that you roll back on any exception), and won't
> materially damage throughput as long as you don't run into problems.
> For a database driven web site, most of the I/O time will be waiting
> for clients, not waiting for your database.
>
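Just to make this concrete for myself, that simple pattern would be
something like the following (the table and parameter names are
invented for illustration):

    def handle_request(conn, amount, account_id):
        # One shared DB-API connection; roll back on any failure so
        # the next request starts from a clean transaction.
        cur = conn.cursor()
        try:
            cur.execute(
                "UPDATE accounts SET balance = balance - %s"
                " WHERE id = %s",
                (amount, account_id))
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            cur.close()
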
> Getting rid of those blocking database calls means having multiple
> concurrent transactions on the database. Whether you go async or
> threaded, this is going to happen. Unless your database lets you run
> multiple simultaneous transactions on a single connection (I don't
> think the Python DB API allows that, and I can't think of any DB
> backends that support it, off hand), that means that every single
> concurrency point needs its own database connection. With threads, you
> could have a pool of (say) a dozen or so, one per thread, with each
> one working synchronously; with asyncio, you'd have to have one for
> every single incoming client request, or else faff around with
> semaphores and resource pools and such manually. The throughput you
> gain by making those asynchronous with callbacks is quite probably
> destroyed by the throughput you lose in having too many simultaneous
> connections to the database. I can't prove that, obviously, but I do
> know that PostgreSQL requires up-front RAM allocation based on the
> max_connections setting, and trying to support 5000 connections
> started to get kinda stupid.
>
I am following this with interest. I still struggle to get my head around
the concepts, but they are slowly becoming clearer.
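To check my understanding, the threaded version Chris describes would
be something like this (the worker count and connection string are
just placeholders):

    import queue
    import threading

    import psycopg2

    NUM_WORKERS = 12            # one connection per worker thread
    tasks = queue.Queue()       # each item is a (func, args) pair

    def worker():
        # Each thread owns a single connection and uses it
        # synchronously, so the connection never sees more than one
        # transaction at a time.
        conn = psycopg2.connect("dbname=test")   # placeholder DSN
        while True:
            func, args = tasks.get()
            try:
                func(conn, *args)
            finally:
                tasks.task_done()

    for _ in range(NUM_WORKERS):
        threading.Thread(target=worker, daemon=True).start()
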
Focusing on PostgreSQL, couldn't you do the following?
PostgreSQL runs client/server (they call it front-end/back-end) over TCP/IP.
psycopg2 appears to have some support for async communication with the
back-end. I only skimmed the docs, and it looks a bit complicated, but it is
there.
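The basic pattern in the docs seems to be: open the connection with
async=1, then poll it until each operation completes. Something like
this (adapted from the polling example in the psycopg2 documentation;
I have not tried it myself):

    import select

    import psycopg2
    import psycopg2.extensions

    def wait(conn):
        # Loop until the connection reports that the current
        # operation has finished.
        while True:
            state = conn.poll()
            if state == psycopg2.extensions.POLL_OK:
                return
            elif state == psycopg2.extensions.POLL_WRITE:
                select.select([], [conn.fileno()], [])
            elif state == psycopg2.extensions.POLL_READ:
                select.select([conn.fileno()], [], [])
            else:
                raise psycopg2.OperationalError(
                    "poll() returned %s" % state)

    conn = psycopg2.connect("dbname=test", async=1)  # placeholder DSN
    wait(conn)
    cur = conn.cursor()
    cur.execute("SELECT 42")
    wait(conn)
    print(cur.fetchone())

In a real asyncio program you would presumably register conn.fileno()
with the event loop (add_reader/add_writer) instead of calling
select() yourself, so that nothing blocks.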
So why not keep a 'connection pool', and for every potentially blocking
request, grab a connection from the pool, set up a callback or a 'yield
from' to wait for the response, and carry on without blocking the event
loop?
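Something along these lines, perhaps -- a minimal sketch, where
run_query is a hypothetical coroutine that would drive psycopg2's
poll() through loop.add_reader()/add_writer() as above:

    import asyncio

    @asyncio.coroutine
    def run_query(conn, query, args):
        # Hypothetical placeholder: a real version would register
        # conn.fileno() with the event loop and wait on poll().
        raise NotImplementedError

    class ConnectionPool:
        def __init__(self, connections):
            # Pre-opened connections held in a queue; a request that
            # finds them all busy waits without blocking the loop.
            self._queue = asyncio.Queue()
            for conn in connections:
                self._queue.put_nowait(conn)

        @asyncio.coroutine
        def execute(self, query, args=()):
            conn = yield from self._queue.get()
            try:
                result = yield from run_query(conn, query, args)
            finally:
                self._queue.put_nowait(conn)  # return it to the pool
            return result

That way the pool size caps the number of open connections, and the
semaphore-style bookkeeping is just a queue.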
Provided the requests return quickly, I would have thought a hundred
database connections could support thousands of users.
Frank Millman