[Python-checkins] bpo-35279: reduce default max_workers of ThreadPoolExecutor (GH-13618)

Inada Naoki webhook-mailer at python.org
Tue May 28 08:03:00 EDT 2019


https://github.com/python/cpython/commit/9a7e5b1b42abcedb895b1ce49d83fe067d01835c
commit: 9a7e5b1b42abcedb895b1ce49d83fe067d01835c
branch: master
author: Inada Naoki <songofacandy at gmail.com>
committer: GitHub <noreply at github.com>
date: 2019-05-28T21:02:52+09:00
summary:

bpo-35279: reduce default max_workers of ThreadPoolExecutor (GH-13618)

files:
A Misc/NEWS.d/next/Library/2019-05-28-19-14-29.bpo-35279.PX7yl9.rst
M Doc/library/concurrent.futures.rst
M Lib/concurrent/futures/thread.py
M Lib/test/test_concurrent_futures.py

diff --git a/Doc/library/concurrent.futures.rst b/Doc/library/concurrent.futures.rst
index ffc29d782ec0..f2491dd24571 100644
--- a/Doc/library/concurrent.futures.rst
+++ b/Doc/library/concurrent.futures.rst
@@ -159,6 +159,15 @@ And::
    .. versionchanged:: 3.7
       Added the *initializer* and *initargs* arguments.
 
+   .. versionchanged:: 3.8
+      Default value of *max_workers* is changed to ``min(32, os.cpu_count() + 4)``.
+      This default keeps at least 5 workers available for I/O-bound tasks,
+      uses at most 32 CPU cores for CPU-bound tasks that release the GIL,
+      and avoids implicitly consuming very large resources on many-core machines.
+
+      ThreadPoolExecutor now also reuses idle worker threads before starting
+      new ones, rather than always starting up to *max_workers* threads.
+
 
 .. _threadpoolexecutor-example:
 
diff --git a/Lib/concurrent/futures/thread.py b/Lib/concurrent/futures/thread.py
index ad6b4c20b566..2426e94de91f 100644
--- a/Lib/concurrent/futures/thread.py
+++ b/Lib/concurrent/futures/thread.py
@@ -129,9 +129,14 @@ def __init__(self, max_workers=None, thread_name_prefix='',
             initargs: A tuple of arguments to pass to the initializer.
         """
         if max_workers is None:
-            # Use this number because ThreadPoolExecutor is often
-            # used to overlap I/O instead of CPU work.
-            max_workers = (os.cpu_count() or 1) * 5
+            # ThreadPoolExecutor is often used for:
+            # * CPU-bound tasks which release the GIL
+            # * I/O-bound tasks (which release the GIL, of course)
+            #
+            # We use cpu_count + 4 for both types of tasks,
+            # but limit it to 32 to avoid consuming surprisingly large
+            # resources on many-core machines.
+            max_workers = min(32, (os.cpu_count() or 1) + 4)
         if max_workers <= 0:
             raise ValueError("max_workers must be greater than 0")
 
diff --git a/Lib/test/test_concurrent_futures.py b/Lib/test/test_concurrent_futures.py
index de6ad8f2aa12..b27ae7194822 100644
--- a/Lib/test/test_concurrent_futures.py
+++ b/Lib/test/test_concurrent_futures.py
@@ -755,8 +755,8 @@ def record_finished(n):
 
     def test_default_workers(self):
         executor = self.executor_type()
-        self.assertEqual(executor._max_workers,
-                         (os.cpu_count() or 1) * 5)
+        expected = min(32, (os.cpu_count() or 1) + 4)
+        self.assertEqual(executor._max_workers, expected)
 
     def test_saturation(self):
         executor = self.executor_type(4)
diff --git a/Misc/NEWS.d/next/Library/2019-05-28-19-14-29.bpo-35279.PX7yl9.rst b/Misc/NEWS.d/next/Library/2019-05-28-19-14-29.bpo-35279.PX7yl9.rst
new file mode 100644
index 000000000000..41ee5c2fe8bf
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-05-28-19-14-29.bpo-35279.PX7yl9.rst
@@ -0,0 +1,3 @@
+Change default *max_workers* of ``ThreadPoolExecutor`` from ``cpu_count() *
+5`` to ``min(32, cpu_count() + 4)``.  The previous value was unreasonably
+large on many-core machines.
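
For readers who want to see the effect of the change in numbers, here is a
minimal sketch that compares the old and new defaults. The helper names
old_default and new_default are hypothetical and exist only for this
illustration; they copy the two expressions from the thread.py hunk above and
are not part of concurrent.futures.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def old_default(cpu_count):
    # Hypothetical helper: the expression removed by this commit.
    return (cpu_count or 1) * 5


def new_default(cpu_count):
    # Hypothetical helper: the expression added by this commit.
    return min(32, (cpu_count or 1) + 4)


# os.cpu_count() may return None, which both expressions treat as 1.
for cores in (None, 1, 4, 28, 64):
    print(cores, old_default(cores), new_default(cores))

# Passing max_workers explicitly gives the same pool size on any
# Python version, regardless of which default is built in.
with ThreadPoolExecutor(max_workers=new_default(os.cpu_count())) as executor:
    squares = list(executor.map(lambda x: x * x, range(5)))
print(squares)
```

On a 64-core machine the old expression yields 320 threads while the new one
yields 32. On a single-core machine both yield 5, so small machines keep the
same number of workers for I/O-bound work.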


