Question about asyncio and blocking operations
Ian Kelly
ian.g.kelly at gmail.com
Sat Jan 23 10:44:33 EST 2016
On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman <frank at chagford.com> wrote:
> Here is the difficulty. The recommended way to handle a blocking operation
> is to run it as task in a different thread, using run_in_executor(). This
> method is a coroutine. An implication of this is that any method that calls
> it must also be a coroutine, so I end up with a chain of coroutines
> stretching all the way back to the initial event that triggered it.
This seems to be a common misapprehension about asyncio programming.
While coroutines are the focus of the library, they're based on
futures, and so by working at a slightly lower level you can also
handle them as such. So while this would be the typical way to use
run_in_executor:
async def my_coroutine(stuff):
value = await get_event_loop().run_in_executor(None,
blocking_function, stuff)
result = await do_something_else_with(value)
return result
This is also a perfectly valid way to use it:
def normal_function(stuff):
loop = get_event_loop()
coro = loop.run_in_executor(None, blocking_function, stuff)
task = loop.create_task(coro)
task.add_done_callback(do_something_else)
return task
> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
> # cache of database objects for each company
> class DbObject(dict):
> def __missing__(self, company):
> db_object = self[company] = get_db_object _from_database()
> return db_object
> db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be returned
> from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
> db_object = self[company] = await get_db_object _from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
> # cache of database objects for each company
> db_objects = {}
> async def get_db_object(company):
> if company not in db_objects:
> db_object = db_objects[company] = await get_db_object
> _from_database()
> return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.
This all sounds pretty reasonable to me.
> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I do
> not know what other problems I am going to face, and how easy they will be
> to fix.
>
> So I thought I would ask here if anyone has been through a similar exercise,
> and if what I am going through sounds normal, or if I am doing something
> fundamentally wrong.
This is where it would make sense to me to use callbacks instead of
subroutines. You can structure your __init__ method like this:
def __init__(self, params):
self.params = params
self.db_object_future = get_event_loop().create_task(
get_db_object(params))
async def method_depending_on_db_object():
db_object = await self.db_object_future
result = do_something_with(db_object)
return result
The caveat with this is that while __init__ itself doesn't need to be
a coroutine, any method that depends on the DB lookup does need to be
(or at least needs to return a future).
More information about the Python-list
mailing list