Dangerous default value for reuse_address in asyncio loop.create_datagram_endpoint()

When creating UDP servers with asyncio's create_datagram_endpoint(), reuse_address defaults to True, resulting in a dangerous (and currently incorrectly documented) situation. I have proposed changing the default value, but understandably such a change to a core library function parameter is not to be taken lightly. Thus I am putting this up for discussion on the list.

As background, when creating TCP servers on UNIX-like systems, it is almost boilerplate to set SO_REUSEADDR on all server sockets to make sure that a restarting server can immediately bind the socket again. Without the SO_REUSEADDR sockopt, the kernel will hold the addr:port in a TIME_WAIT state for a while, preventing reuse. Thus, when creating TCP servers with loop.create_server(), the parameter reuse_address has a very reasonable default value of True.

However, things are very different in UDP-land. The kernel does not hold UDP server ports in a waiting state, so the SO_REUSEADDR sockopt was repurposed in Linux (and *BSD afaik) to allow multiple processes to bind the SAME addr:port for a UDP server. The kernel will then feed incoming UDP packets to all such processes in a semi-fair round-robin manner. This is very useful in some scenarios; for example, I've used it myself in C++ projects to allow UDP servers to be scaled easily and rolling upgrades to be implemented without separate load-balancing. But for this to be the default behaviour is quite dangerous.

I discovered this default behaviour by accident: 2 separate Python programs (both doing SIP over UDP) had been mistakenly configured to use the same UDP port. The result was that my 2 processes were indeed "sharing the load" - neither of them threw an exception at startup about the port being already in use, and both started getting ~half of the incoming packets.
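The silent double bind described above can be reproduced with plain sockets. The following is a minimal sketch, not code from the original post: the helper name second_bind_succeeds is made up for illustration, and it uses the socket module directly rather than asyncio. Without SO_REUSEADDR the second bind fails with "address already in use"; with the option set on both sockets the outcome depends on the platform (the post reports that Linux allows it), so the sketch only reports that result.

```python
import socket


def second_bind_succeeds(reuse_address):
    """Try to bind two UDP sockets to the same 127.0.0.1 port.

    Hypothetical helper for illustration. Returns True if the second
    bind() is accepted, False if it raises OSError.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if reuse_address:
            # The option must be set on both sockets before bind().
            for sock in (first, second):
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind(("127.0.0.1", 0))       # let the kernel pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError:
            return False                   # typically EADDRINUSE
        return True
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("without SO_REUSEADDR:", second_bind_succeeds(False))
    # Platform dependent; on Linux this is expected to be accepted.
    print("with SO_REUSEADDR:   ", second_bind_succeeds(True))
```

The failing bind in the first case is the startup error that would have caught the misconfiguration.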
So off to the docs I went and discovered that the documentation for create_datagram_endpoint() does not mention this behaviour at all; instead it mistakenly refers to the TCP use of SO_REUSEADDR:

"reuse_address tells the kernel to reuse a local socket in TIME_WAIT state, without waiting for its natural timeout to expire. If not specified will automatically be set to True on Unix."

https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create...

What makes this default especially dangerous is:

- Most people are not aware of this special mode that Linux allows for UDP sockets
- Even if it was documented to be the default, many people would miss it unless a big warning was slapped on the docs
- The problems are unlikely to appear in test scenarios and much more likely to pop up in production months or years after rolling out the code
- If you have never used it on purpose, it is very confusing to debug, causing you to doubt your own and the kernel's sanity
- The behaviour changes again if you happen to use a multicast address...

Thus, my proposal is to change the default value for UDP to False or deprecate the function and introduce a new one as suggested by Yuri in my original bug report at: https://bugs.python.org/issue37228

On Tue., 19 Nov. 2019, 6:46 am , <vaizki@vaizki.fi> wrote:
Thus, my proposal is to change the default value for UDP to False or deprecate the function and introduce a new one as suggested by Yuri in my original bug report at: https://bugs.python.org/issue37228
Guido's suggestion on the ticket sounds like a sensible solution to me:

* in 3.9, emit DeprecationWarning for cases where the default would currently be "True"
* in 3.10, change the default to always be False

It would also make sense to update the 3.8 documentation to mention the problematic default and its upcoming deprecation.

Cheers, Nick.
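The first step of that plan could look roughly like the sketch below. It is only an illustration of the warn-on-implicit-default pattern, not asyncio's actual code: the function resolve_reuse_address, the _UNSET sentinel and the is_posix parameter are all hypothetical names, and the "Unix" check is approximated with os.name.

```python
import os
import warnings

_UNSET = object()  # hypothetical sentinel: distinguishes "not passed" from False


def resolve_reuse_address(reuse_address=_UNSET, *, is_posix=(os.name == "posix")):
    """Return the effective reuse_address value for a UDP endpoint.

    Warns only when the caller relied on the implicit default and that
    default would currently be True (assumed here to mean: on POSIX).
    """
    if reuse_address is _UNSET:
        if is_posix:
            warnings.warn(
                "the implicit reuse_address=True default for UDP endpoints "
                "is deprecated; pass reuse_address explicitly",
                DeprecationWarning,
                stacklevel=2,
            )
            return True
        return False
    # An explicit choice by the caller is honoured without a warning.
    return bool(reuse_address)
```

Callers that pass reuse_address explicitly are unaffected; only code that silently depends on the old default sees the warning.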

SO_REUSEADDR was controversial also for socket.create_server(). In the end I concluded the best solution was to not expose a reuse_address parameter. See: https://github.com/python/cpython/blob/94e165096fd65e8237e60de570fb609604ab9...

Note that asyncio's create_server() method currently also allows passing reuse_address=True, which on Windows should probably be turned into a no-op.

As for asyncio's create_datagram_endpoint(), I partly agree with Antoine's solution: https://bugs.python.org/issue37228#msg357068

My course of action, though, would be the following:

* in 3.8: turn reuse_address parameter into a no-op, update doc
* in 3.9: raise error if reuse_address=True, update doc

Note: differently from TCP / create_server(), with UDP you can set SO_REUSEADDR manually after calling create_datagram_endpoint() if you really want to.
-- Giampaolo - http://grodola.blogspot.com
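For readers who want the port-sharing behaviour as an explicit opt-in, one related approach is to configure the socket yourself and hand it to create_datagram_endpoint() through its sock parameter. This is a sketch of that variant, not of the "set it after the call" route mentioned in the note above; the names FirstDatagram, make_udp_socket and roundtrip are invented for illustration.

```python
import asyncio
import socket


class FirstDatagram(asyncio.DatagramProtocol):
    """Hypothetical protocol that resolves a future with the first datagram."""

    def __init__(self):
        self.received = asyncio.get_running_loop().create_future()

    def datagram_received(self, data, addr):
        if not self.received.done():
            self.received.set_result(data)


def make_udp_socket(host, port, share_port=False):
    """Create and bind a non-blocking UDP socket.

    SO_REUSEADDR is set only when the caller explicitly asks for it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if share_port:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setblocking(False)
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    return sock


async def roundtrip(payload=b"ping"):
    """Send one datagram to our own endpoint and return what arrives."""
    loop = asyncio.get_running_loop()
    sock = make_udp_socket("127.0.0.1", 0)  # port 0: kernel picks a free port
    transport, protocol = await loop.create_datagram_endpoint(
        FirstDatagram, sock=sock
    )
    try:
        transport.sendto(payload, sock.getsockname())
        return await asyncio.wait_for(protocol.received, timeout=5)
    finally:
        transport.close()  # also closes the underlying socket


if __name__ == "__main__":
    print(asyncio.run(roundtrip()))
```

With share_port left at False, a second process binding the same port fails at startup instead of silently splitting the traffic.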
participants (3)
- Giampaolo Rodola'
- Nick Coghlan
- vaizki@vaizki.fi