[New-bugs-announce] [issue19675] Pool dies with excessive workers, but does not cleanup
Dustin Oprea
report at bugs.python.org
Thu Nov 21 02:52:19 CET 2013
New submission from Dustin Oprea:
If you provide a number of processes to a Pool that the OS can't fulfill, Pool will raise an OSError and die, but it does not clean up any of the processes that it has already forked.
This is a session in Python where I can allocate a large, but fulfillable, number of processes (just to exhibit what's possible in my current system):
>>> from multiprocessing import Pool
>>> p = Pool(500)
>>> p.close()
>>> p.join()
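How many processes are "fulfillable" is usually governed by the per-user process limit on Unix-like systems. As a rough sketch (the helper names process_limits and describe are illustrative, and this assumes a platform where the resource module provides RLIMIT_NPROC), the limit can be inspected like this:

```python
import resource


def process_limits():
    """Return the (soft, hard) per-user process limits. Unix only."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = process_limits()
    print("soft limit: %s, hard limit: %s" % (describe(soft), describe(hard)))
```

The limit counts every process owned by the user, not just Pool workers, so the largest Pool size that works will normally be lower than the soft limit.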
Now, this is a request that will fail. However, even after this fails, I can't allocate even a single worker:
>>> p = Pool(700)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable
>>> p = Pool(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable
The only way to clean this up is to close the parent (the interpreter).
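Short of restarting the interpreter, one thing that may be worth trying is to terminate the leftover workers through multiprocessing.active_children(). This is only a sketch and not part of the submitted patch: the helper name terminate_children is hypothetical, and it assumes the leaked workers are still registered as children of the current process. Note that it stops every multiprocessing child of the process, not only the ones belonging to the failed Pool.

```python
import multiprocessing


def terminate_children(children=None):
    """Terminate and join child processes; return how many were still alive."""
    if children is None:
        # active_children() lists the live children started via multiprocessing.
        children = multiprocessing.active_children()
    stopped = 0
    for child in children:
        if child.is_alive():
            child.terminate()
            stopped += 1
        # Join every child so that no zombie process is left behind.
        child.join()
    return stopped
```

Calling terminate_children() with no arguments after the failed Pool(700) would, under the assumption above, release the workers that had been forked before the error.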
I'm submitting a patch for 2.7.6 that intercepts exceptions and cleans up the workers before re-raising. The affected method is _repopulate_pool(), which appears to be the same in 2.7.6, 3.3.3, and probably every other recent version of Python.
This is the old version:
for i in range(self._processes - len(self._pool)):
    w = self.Process(target=worker,
                     args=(self._inqueue, self._outqueue,
                           self._initializer,
                           self._initargs, self._maxtasksperchild)
                    )
    self._pool.append(w)
    w.name = w.name.replace('Process', 'PoolWorker')
    w.daemon = True
    w.start()
    debug('added worker')
This is the new version:
try:
    for i in range(self._processes - len(self._pool)):
        w = self.Process(target=worker,
                         args=(self._inqueue, self._outqueue,
                               self._initializer,
                               self._initargs, self._maxtasksperchild)
                        )
        self._pool.append(w)
        w.name = w.name.replace('Process', 'PoolWorker')
        w.daemon = True
        w.start()
        debug('added worker')
except:
    debug("Process creation error. Cleaning-up (%d) workers." % (len(self._pool)))
    for process in self._pool:
        if process.is_alive() is False:
            continue
        process.terminate()
        process.join()
    debug("Processing cleaning-up. Bubbling error.")
    raise
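The same idea can be shown outside of Pool. The following is a minimal, self-contained sketch of the "start all workers or none" pattern, not the patch itself: start_all_or_none, make_process, and _idle are hypothetical names introduced for illustration, and the factory argument exists only so that a start failure can be simulated without exhausting real system resources.

```python
import multiprocessing
import time


def _idle():
    # Placeholder worker body for this sketch: sleep until terminated.
    while True:
        time.sleep(1)


def _default_process():
    return multiprocessing.Process(target=_idle)


def start_all_or_none(count, make_process=_default_process):
    """Start `count` processes; if any start fails, stop those already started."""
    started = []
    try:
        for _ in range(count):
            proc = make_process()
            proc.daemon = True
            proc.start()  # may raise OSError if the OS refuses to fork
            started.append(proc)
    except Exception:
        # Release everything started so far, then re-raise the original error.
        for proc in started:
            if proc.is_alive():
                proc.terminate()
            proc.join()
        raise
    return started
```

Unlike the patch, this sketch records a process only after start() has succeeded, so the cleanup loop never touches the one that failed to start.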
This is what happens now: I can go from requesting a number that's too high to immediately requesting one that's also high but within limits, and there's no problem because all resources have been freed:
>>> from multiprocessing import Pool
>>> p = Pool(700)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 224, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable
>>> p = Pool(500)
>>> p.close()
>>> p.join()
----------
components: Library (Lib)
files: pool.py.patch_2.7.6_20131120-1959
messages: 203556
nosy: dsoprea
priority: normal
severity: normal
status: open
title: Pool dies with excessive workers, but does not cleanup
type: resource usage
versions: Python 2.7, Python 3.3
Added file: http://bugs.python.org/file32742/pool.py.patch_2.7.6_20131120-1959
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19675>
_______________________________________