New submission from Julian email@example.com:
Since Python 2.6, httplib has offered a timeout parameter for fetches. As the documentation explains, if this parameter is not provided, it uses the global default.
What the document doesn't explain is httplib builds on top of the socket library. The socket library has a default timeout of None (i.e. forever). This may be an appropriate default for general sockets, but it is a poor default for httplib; typical http clients would use a timeout in the 2-10 second range.
This problem is propagated up to urllib2, which sits on httplib, and further obscures that the default might be unsuitable.
From an inspection of the manuals, Python 3.0.1 suffers from the same problem except, the names have changed. urllib.response sits on http.client.
I, for one, made a brutal mistake of assuming that the "global default" would be some reasonable default for fetching web pages; I didn't have any specific timeout in mind, and was happy for the library to take care of it. Several million successful http downloads later, my server application thread froze waiting forever when talking to a recalcitrant web-server. I imagine others have fallen for the same trap.
While an ideal solution would be for httplib and http.client to use a more generally acceptable default, I can see it might be far too late to make such a change without breaking existing applications. Failing that, I would recommend that the documentation for httplib, urllib, urllib2, http.client and urllib.request (+ any other similar libraries sitting on socket? FTP, SMTP?) be changed to highlight that the default global timeout, sans deliberate override, is to wait a surprisingly long time.
---------- assignee: docs@python components: Documentation, Library (Lib) messages: 104763 nosy: docs@python, oddthinking priority: normal severity: normal status: open title: Unexpected default timeout in http-client-related libraries type: behavior versions: Python 2.6, Python 2.7, Python 3.1