[New-bugs-announce] [issue27223] _ready_ready and _write_ready should respect _conn_lost

Łukasz Langa report at bugs.python.org
Sat Jun 4 21:20:33 EDT 2016

New submission from Łukasz Langa:

Currently _read_ready and _write_ready callbacks do not respect _conn_lost, which makes them vulnerable to the following race condition:
- the client protocol connects to a server
- the client protocol sends some data
- the server sends some data back
- in the mean time, something causes the connection to be lost and that information gets handled by the client protocol first
- the client transport receives _read_ready but now the connection is already marked as lost and cleared on the transport

This causes an ugly exception to be raised:

	Exception in callback _SelectorSocketTransport._read_ready()
	handle: <Handle _SelectorSocketTransport._read_ready()>
	Traceback (most recent call last):
	File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/selector_events.py", line 664, in _read_ready
		data = self._sock.recv(self.max_size)
	AttributeError: 'NoneType' object has no attribute 'recv'

	During handling of the above exception, another exception occurred:

	Traceback (most recent call last):
	File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/events.py", line 125, in _run
	File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/selector_events.py", line 668, in _read_ready
		self._fatal_error(exc, 'Fatal read error on socket transport')
	File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/selector_events.py", line 587, in _fatal_error
	AttributeError: 'NoneType' object has no attribute 'call_exception_handler'

This can be reproduced by running srv.py and cli.py attached here. Since this is an event loop timing issue, it might take a few tries of cli.py to the issue to surface. I get it to repro more consistently with PYTHONASYNCIODEBUG=1.

In production I get this issue most often when there's a torrent of simultaneous connections being created while existing protocol connections are already being handled.

To fix this, I propose to add guards for _conn_lost on both _ready_ready and _write_ready. See attached patch. This doesn't break any existing assumptions because we can't really respond to I/O events on a connection that's reset.

The patch also changes 3 existing unit tests, which I believe are currently invalid. test_write_ready_exception_close simulates filling the buffer and calling _write_ready after the transport has been closed. This means _call_connection_lost has been called and now _sock is None and _loop is None. We'd have AttributeError raised (like in the traceback above) and not OSError like the test believes. That would cause the same except clause to be invoked but then (like in the traceback above) _fatal_error fails due to _loop being None. I renamed the test test_transport_close_remove_writer and removed the _write_ready assumptions in it.

test_write_ready_send_closing required a change so that the buffer is non-empty BEFORE transport.close() is called. In this case close() doesn't remove the writer and schedule connection_lost. Instead, this is done by _write_ready after the write has been sent out. I also removed the following assert: `self.assertFalse(self.loop.writers)` because it's true during the entire execution of the test (in other words, checks nothing).

test_write_ready_send_closing_empty_buffer checks for _call_connection_lost to be called but in fact _call_connection_lost is scheduled to be called_soon in transport.close(). I changed the test to reflect this.

assignee: lukasz.langa
components: asyncio
messages: 267360
nosy: asvetlov, gvanrossum, haypo, lukasz.langa, yselivanov
priority: normal
severity: normal
status: open
title: _ready_ready and _write_ready should respect _conn_lost
versions: Python 3.5, Python 3.6

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list