[Python-checkins] bpo-30919: shared memory allocation performance regression in multiprocessing (#2708)

Antoine Pitrou webhook-mailer at python.org
Sun Jul 23 07:05:29 EDT 2017


https://github.com/python/cpython/commit/3051f0b78e53d1b771b49375dc139ca13f9fd76e
commit: 3051f0b78e53d1b771b49375dc139ca13f9fd76e
branch: master
author: Antoine Pitrou <pitrou at free.fr>
committer: GitHub <noreply at github.com>
date: 2017-07-23T13:05:26+02:00
summary:

bpo-30919: shared memory allocation performance regression in multiprocessing (#2708)

* Fix #30919: shared memory allocation performance regression in multiprocessing

* Change strategy for Arena directory choice

* Add blurb

files:
A Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
M Lib/multiprocessing/heap.py

diff --git a/Lib/multiprocessing/heap.py b/Lib/multiprocessing/heap.py
index 443321535ec..ee3ed551d0c 100644
--- a/Lib/multiprocessing/heap.py
+++ b/Lib/multiprocessing/heap.py
@@ -60,26 +60,32 @@ def __setstate__(self, state):
 else:
 
     class Arena(object):
+        if sys.platform == 'linux':
+            _dir_candidates = ['/dev/shm']
+        else:
+            _dir_candidates = []
 
         def __init__(self, size, fd=-1):
             self.size = size
             self.fd = fd
             if fd == -1:
                 self.fd, name = tempfile.mkstemp(
-                     prefix='pym-%d-'%os.getpid(), dir=util.get_temp_dir())
+                     prefix='pym-%d-'%os.getpid(),
+                     dir=self._choose_dir(size))
                 os.unlink(name)
                 util.Finalize(self, os.close, (self.fd,))
-                with open(self.fd, 'wb', closefd=False) as f:
-                    bs = 1024 * 1024
-                    if size >= bs:
-                        zeros = b'\0' * bs
-                        for _ in range(size // bs):
-                            f.write(zeros)
-                        del zeros
-                    f.write(b'\0' * (size % bs))
-                    assert f.tell() == size
+                os.ftruncate(self.fd, size)
             self.buffer = mmap.mmap(self.fd, self.size)
 
+        def _choose_dir(self, size):
+            # Choose a non-storage backed directory if possible,
+            # to improve performance
+            for d in self._dir_candidates:
+                st = os.statvfs(d)
+                if st.f_bavail * st.f_frsize >= size:  # enough free space?
+                    return d
+            return util.get_temp_dir()
+
     def reduce_arena(a):
         if a.fd == -1:
             raise ValueError('Arena is unpicklable because '
diff --git a/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst b/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
new file mode 100644
index 00000000000..44c3a22bc85
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
@@ -0,0 +1,4 @@
+Fix shared memory performance regression in multiprocessing in 3.x.
+
+Shared memory used anonymous memory mappings in 2.x, while 3.x mmaps actual
+files. Try to be careful to do as little disk I/O as possible.
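
For readers who want to try the approach outside of multiprocessing, the two ideas in the patch can be sketched in isolation: pick a memory-backed directory when it has enough free space, and size the backing file with os.ftruncate() instead of writing zeros. This is a minimal standalone sketch, not the code in heap.py. The names choose_dir and make_shared_buffer, the candidates and directory parameters, and the OSError guard are all hypothetical, and tempfile.gettempdir() stands in for util.get_temp_dir().

```python
import mmap
import os
import sys
import tempfile


def choose_dir(size, candidates=None):
    """Return a directory with at least `size` bytes free, preferring tmpfs."""
    if candidates is None:
        # Same candidate as the patch; only consulted on Linux.
        candidates = ['/dev/shm'] if sys.platform == 'linux' else []
    for d in candidates:
        try:
            st = os.statvfs(d)
        except OSError:
            # Assumption: skip unusable candidates (this guard is not in the patch).
            continue
        # f_bavail is counted in units of f_frsize bytes.
        if st.f_bavail * st.f_frsize >= size:
            return d
    # Stand-in for util.get_temp_dir() from the patch.
    return tempfile.gettempdir()


def make_shared_buffer(size, directory=None):
    """Create an unlinked temporary file of `size` bytes and mmap it."""
    if directory is None:
        directory = choose_dir(size)
    fd, name = tempfile.mkstemp(prefix='pym-%d-' % os.getpid(), dir=directory)
    os.unlink(name)  # the file lives on only through the open descriptor
    try:
        # Set the file length without writing `size` bytes of zeros.
        os.ftruncate(fd, size)
        buf = mmap.mmap(fd, size)
    finally:
        # On Unix, mmap duplicates the descriptor, so ours can be closed.
        os.close(fd)
    return buf
```

The extended region of a truncated file reads back as zero bytes, which is why the explicit zero-filling loop removed by the patch is not needed for correctness.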


