[Python-Dev] mysterious hangs in socket code
Jeremy Hylton
jeremy@alum.mit.edu
Tue, 3 Sep 2002 17:53:46 -0400
I've been running a small, multi-threaded program to retrieve web
pages today. The entire program appears to hang whenever I perform a
slow DNS operation, even though there is no application-level
coordination between the threads.
The motivation comes from http://www.python.org/sf/591349, but I ended
up writing a similar small test script, which I've attached.
When I run this program with Python 2.1, it produces a steady stream
of output -- urls and the time it took to load them. Most of the
pages take less than a second, but some take a very long time.
If I run this program with Python 2.2 or 2.3, it produces little
bursts of output, then pauses for a long time, then repeats.
I believe that the problem relates to DNS lookups, but not in a way I
fully understand. If I connect gdb to any of the threads while the
program is hung, it is always inside getaddrinfo(). My first
realization was that the socketmodule stopped wrapping DNS lookups in
Py_BEGIN/END_ALLOW_THREADS calls when the IPv6 changes were
integrated. But if I restore these calls --
see http://www.python.org/sf/604210 --
I don't see any change in behavior. The program still hangs
periodically.
One possibility is that the Linux getaddrinfo() is thread-safe, but
only by way of a lock that allows just one request to be outstanding
at a time.
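One way to probe that hypothesis is to time concurrent lookups directly: if getaddrinfo() is serialized by an internal lock, total wall time should grow with the number of threads even though the lookups are nominally concurrent. A minimal sketch (the time_lookups helper is mine, not part of the script below; it uses localhost so it runs without network access -- substitute slow external hostnames to reproduce the stall):

```python
import socket
import threading
import time


def time_lookups(hosts):
    """Resolve each host in its own thread; return per-thread durations."""
    durations = [None] * len(hosts)

    def worker(i, host):
        t0 = time.time()
        socket.getaddrinfo(host, 80)
        durations[i] = time.time() - t0

    threads = [threading.Thread(target=worker, args=(i, h))
               for i, h in enumerate(hosts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return durations


if __name__ == "__main__":
    # With a serializing lock, the slowest thread's duration approaches
    # the *sum* of the individual lookup times, not the maximum.
    print(time_lookups(["localhost"] * 8))
```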
Not sure what the other possibilities are, but the current behavior is
awful.
Jeremy
---------------------------------------------------------------------
import httplib
import Queue
import random
import sys
import threading
import time
import traceback
import urlparse
headers = {"Accept":
           "text/plain, text/html, image/jpeg, image/jpg, "
           "image/gif, image/png, */*"}
class URLThread(threading.Thread):

    def __init__(self, queue):
        threading.Thread.__init__(self)
        self._queue = queue
        self._stopevent = threading.Event()

    def stop(self):
        self._stopevent.set()

    def run(self):
        while not self._stopevent.isSet():
            self.fetch()

    def fetch(self):
        url = self._queue.get()
        t0 = time.time()
        try:
            self._fetch(url)
        except:
            etype, value, tb = sys.exc_info()
            L = ["Error occurred fetching %s\n" % url,
                 "%s: %s\n" % (etype, value),
                 ]
            L += traceback.format_tb(tb)
            sys.stderr.write("".join(L))
        t1 = time.time()
        print url, round(t1 - t0, 2)

    def _fetch(self, url):
        parts = urlparse.urlparse(url)
        host = parts[1]
        path = parts[2]
        h = httplib.HTTPConnection(host)
        h.connect()
        h.request("GET", path, headers=headers)
        r = h.getresponse()
        r.read()
        h.close()
urls = """\
http://www.andersen.com/
http://www.google.com/
http://www.google.com/images/logo.gif
http://www.microsoft.com/
http://www.microsoft.com/homepage/gif/bnr-microsoft.gif
http://www.microsoft.com/homepage/gif/1ptrans.gif
http://www.microsoft.com/library/toolbar/images/curve.gif
http://www.yahoo.com/
http://www.sourceforge.net/
http://www.slashdot.org/
http://www.kuro5hin.org/
http://www.intel.com/
http://www.aol.com/
http://www.amazon.com/
http://www.cnn.com/
http://money.cnn.com/
http://www.expedia.com/
http://www.tripod.com/
http://www.hotmail.com/
http://www.angelfire.com/
http://www.excite.com/
http://www.verisign.com/
http://www.riaa.com/
http://www.enron.com/
http://www.securityspace.com/
http://www.directv.com/
http://www.att.com/
http://www.qwest.com/
http://www.covad.com/
http://www.sprint.com/
http://www.mci.com/
http://www.worldcom.com/
"""
urls = [u for u in urls.split("\n") if u]
REPEAT = 10
THREADS = 8
class RandomQueue:

    def __init__(self, L):
        self.list = L

    def get(self):
        return random.choice(self.list)

if __name__ == "__main__":
    urlq = RandomQueue(urls)
    sys.setcheckinterval(10)
    threads = []
    for i in range(THREADS):
        t = URLThread(urlq)
        t.start()
        threads.append(t)
    while 1:
        try:
            time.sleep(30)
        except:
            break
    print "Shutting down threads..."
    for t in threads:
        t.stop()
    for t in threads:
        t.join()