Threading hang problems with requests module
John Levine
johnl at taugh.com
Mon Jan 13 20:06:25 EST 2020
I have a small script that goes down a list of domain names, does some
DNS lookups for sanity checks, then if the checks look OK fetches
http://{domain}/ with requests.get() and looks at the text, if any,
returned.
When I run the checks in parallel with concurrent.futures, the script
inevitably hangs after a while, and when I kill it, it is blocked
waiting on thread locks. A similar script that only does DNS lookups
works fine, so I don't think the concurrent.futures part is wrong.

Here's what I'm doing in parallel, leaving out stuff unrelated to the
web fetches. Is there anything wrong here? Is this a known problem
with requests or urllib3? I'm running it on macOS under Python 3.8.1
but had the same problem under 3.7.4.
def lookup1(d):
    """ lookup one domain
    """
    ans = dict( ... stuff ...)
    ... various DNS tests ...

    # try a web site
    try:
        r = requests.get(f"http://{d}/", timeout=webtimeout) # webtimeout is 10 seconds
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout,
            requests.exceptions.TooManyRedirects) as e:
        print("no web", e)
        ans['noweb'] = 1
        return ans
    except:
        print("no web, no reason")
        ans['noweb'] = 1
        return ans
    ... various text comparisons on r.text ...
    return ans
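
The driver that runs lookup1() in parallel is not shown. For readers who want to reproduce the setup, here is a minimal sketch of that kind of driver, assuming a ThreadPoolExecutor. The names fake_lookup and run_all, the worker count, and the overall timeout are hypothetical stand-ins, not the code from tldtaste.py.

```python
import concurrent.futures
import sys


def fake_lookup(d):
    """Hypothetical stand-in for lookup1(); does no network I/O."""
    if not d:
        raise ValueError("empty domain")
    return {"domain": d, "noweb": 0}


def run_all(domains, worker=fake_lookup, max_workers=8, overall_timeout=60):
    """Run worker over domains in a thread pool and collect the results.

    overall_timeout bounds the wait inside as_completed(): it raises
    concurrent.futures.TimeoutError instead of waiting forever. It does
    not stop a worker thread that is already stuck.
    """
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as ex:
        # Map each future back to its domain so failures can be reported.
        fl = {ex.submit(worker, d): d for d in domains}
        for future in concurrent.futures.as_completed(fl, timeout=overall_timeout):
            d = fl[future]
            try:
                results.append(future.result())
            except Exception as exc:
                print("thread barf", d, exc, file=sys.stderr)
    return results
```

Note that leaving the with block calls shutdown(wait=True), so even if as_completed() times out, the program still waits for any worker that never returns. A timeout here only makes the hang visible; it does not cure it.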
Here's the traceback when I kill the hung program with a couple of ^C:
no web HTTPConnectionPool(host='apo-taxi.info', port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x1052df430>, 'Connection to apo-taxi.info timed out. (connect timeout=10)'))
[ long wait here, obviously hung ]
load: 1.58 cmd: Python 16548 waiting 7.97u 1.36s
^C^CTraceback (most recent call last):
  File "tldtaste.py", line 195, in lkup
    for future in concurrent.futures.as_completed(fl):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 244, in as_completed
    waiter.event.wait(wait_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 558, in wait
    signaled = self._cond.wait(timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tldtaste.py", line 311, in <module>
    n = lkup(dl)
  File "tldtaste.py", line 200, in lkup
    print("thread barf", exc, file=sys.stderr)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 636, in __exit__
    self.shutdown(wait=True)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/thread.py", line 236, in shutdown
    t.join()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1011, in join
    self._wait_for_tstate_lock()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/thread.py", line 40, in _python_exit
    t.join()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1011, in join
    self._wait_for_tstate_lock()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
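
The tracebacks above show only the main thread waiting for the pool; they do not show where the worker threads themselves are blocked. One way to find out, sketched here rather than offered as a known fix, is the standard library's faulthandler module, which can dump the stack of every thread. The helper names install_stack_dump and dump_all_threads and the choice of SIGUSR1 are illustrative assumptions.

```python
import faulthandler
import signal
import sys
import tempfile


def install_stack_dump():
    """On Unix, make `kill -USR1 <pid>` print every thread's stack to stderr.

    Returns False where faulthandler.register or SIGUSR1 is unavailable
    (for example on Windows).
    """
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=sys.stderr, all_threads=True)
        return True
    return False


def dump_all_threads():
    """Return the current stacks of all threads as text."""
    # faulthandler writes straight to a file descriptor, so it needs a
    # real file rather than an in-memory buffer such as io.StringIO.
    with tempfile.TemporaryFile() as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read().decode("utf-8", "replace")
```

Calling install_stack_dump() at startup and sending the signal once the script has hung should show which call each worker is sitting in, for instance whether it is inside a socket read in the HTTP fetch.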
--
Regards,
John Levine, johnl at taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly