[issue9090] Error code 10035 calling socket.recv() on a socket with a timeout (WSAEWOULDBLOCK - A non-blocking socket operation could not be completed immediately)

Eric Hohenstein report at bugs.python.org
Sun Jun 27 03:38:47 CEST 2010


New submission from Eric Hohenstein <ehohenstein at imvu.com>:

This error is unfortunately difficult to reproduce. I've only seen it happen on Windows XP running on a dual core VMWare VM. I haven't been able to reproduce it on a non-VM system running Windows 7. The only way I've been able to reproduce it is to run the following unit test repeatedly on the XP VM until it fails:

import unittest
import urllib2

class DownloadUrlTest(unittest.TestCase):
    def testDownloadUrl(self):
        opener = urllib2.build_opener()
        handle = opener.open('http://localhost/', timeout=60)
        handle.info()
        data = handle.read()
        self.assertNotEqual(data, '')

if __name__ == "__main__":
    unittest.main()

This unit test obviously depends on a web server running on localhost. In the test environment where I was able to reproduce this problem the web server is Win32 Apache 2.0.54 with mod_php. The test fails roughly once every 50-100 runs, with Windows error code 10035 (WSAEWOULDBLOCK) being generated by the call to the recv() method. The following is the final entry in the stack when the error occurs:

  File "c:\slave\h05b15\build\Ext\Python26\lib\socket.py", line 353, in read (self=<socket._fileobject ...03B78170>, size=1027091)
    data = self._sock.recv(left)
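
Since the failure only shows up once in many runs, a small driver that reruns the test case until it first fails can save some manual effort. The following is only a sketch, written for Python 3 rather than the Python 2.6 used in this report; run_until_failure() and its parameters are hypothetical names, not part of unittest:

```python
import io
import unittest


def run_until_failure(test_case_class, max_runs):
    """Run every test in test_case_class repeatedly.

    Returns the 1-based number of the first run that fails or errors,
    or None if all max_runs runs pass. (Hypothetical helper.)
    """
    loader = unittest.TestLoader()
    for attempt in range(1, max_runs + 1):
        suite = loader.loadTestsFromTestCase(test_case_class)
        # Send the runner's report to a throwaway buffer to keep output quiet.
        runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
        result = runner.run(suite)
        if not result.wasSuccessful():
            return attempt
    return None
```

With the test case above, run_until_failure(DownloadUrlTest, 1000) would stop at the first run that raises the error.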

The thing to note is that the socket is being created with a timeout of 60 seconds. For socket objects with a timeout, the implementation of socket.recv() in socketmodule.c (the _socket extension module) first calls select() to wait for the socket to become readable, and then calls recv() on the socket only if select() returned without the timeout period having elapsed. The fact that Windows error code 10035 (WSAEWOULDBLOCK) is being generated in the sock_recv_guts() function in socketmodule.c indicates that select() returned without timing out, which means that Windows must have indicated that the socket is readable when in fact it wasn't.

It appears that there is a known issue with Windows sockets where this type of problem may occur with non-blocking sockets. It is described in the MSDN documentation for WSAAsyncSelect() (http://msdn.microsoft.com/en-us/library/ms741540%28VS.85%29.aspx). The code in socketmodule.c doesn't seem to handle this situation correctly. The patch I've included with this issue report retries the select() if the recv() call fails with WSAEWOULDBLOCK (only if MS_WINDOWS is defined). With the patch in place the test ran approximately 23000 times without failure on the system where it was failing without the patch.
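
To illustrate the idea of retrying the select(), here is a rough Python-level sketch. It is not the C patch itself: recv_with_retry() is a hypothetical name, the code assumes Python 3 (where a recv() that would block raises BlockingIOError, which covers WSAEWOULDBLOCK on Windows), and recomputing the remaining time on each retry is my assumption rather than something stated about the patch.

```python
import select
import socket
import time


def recv_with_retry(sock, bufsize, timeout):
    """Receive up to bufsize bytes, waiting at most `timeout` seconds.

    Hypothetical helper: if select() reports the socket readable but
    recv() would still block, go back to select() with the time left.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("timed out")
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            # select() itself timed out: the socket never became readable.
            raise socket.timeout("timed out")
        try:
            return sock.recv(bufsize)
        except BlockingIOError:
            # Spurious readiness (error 10035 on Windows): retry the select().
            continue
```

The overall deadline is kept so that repeated spurious wakeups cannot extend the wait beyond the requested timeout.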

----------
components: IO, Windows
files: sock_recv.patch
keywords: patch
messages: 108770
nosy: ehohenstein
priority: normal
severity: normal
status: open
title: Error code 10035 calling socket.recv() on a socket with a timeout (WSAEWOULDBLOCK - A non-blocking socket operation could not be completed immediately)
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file17780/sock_recv.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9090>
_______________________________________

