[Python-checkins] bpo-30919: shared memory allocation performance regression in multiprocessing (#2708)
Antoine Pitrou
webhook-mailer at python.org
Sun Jul 23 07:05:29 EDT 2017
https://github.com/python/cpython/commit/3051f0b78e53d1b771b49375dc139ca13f9fd76e
commit: 3051f0b78e53d1b771b49375dc139ca13f9fd76e
branch: master
author: Antoine Pitrou <pitrou at free.fr>
committer: GitHub <noreply at github.com>
date: 2017-07-23T13:05:26+02:00
summary:
bpo-30919: shared memory allocation performance regression in multiprocessing (#2708)
* Fix #30919: shared memory allocation performance regression in multiprocessing
* Change strategy for Arena directory choice
* Add blurb
files:
A Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
M Lib/multiprocessing/heap.py
diff --git a/Lib/multiprocessing/heap.py b/Lib/multiprocessing/heap.py
index 443321535ec..ee3ed551d0c 100644
--- a/Lib/multiprocessing/heap.py
+++ b/Lib/multiprocessing/heap.py
@@ -60,26 +60,32 @@ def __setstate__(self, state):
 else:
 
     class Arena(object):
+        if sys.platform == 'linux':
+            _dir_candidates = ['/dev/shm']
+        else:
+            _dir_candidates = []
 
         def __init__(self, size, fd=-1):
             self.size = size
             self.fd = fd
             if fd == -1:
                 self.fd, name = tempfile.mkstemp(
-                     prefix='pym-%d-'%os.getpid(), dir=util.get_temp_dir())
+                     prefix='pym-%d-'%os.getpid(),
+                     dir=self._choose_dir(size))
                 os.unlink(name)
                 util.Finalize(self, os.close, (self.fd,))
-                with open(self.fd, 'wb', closefd=False) as f:
-                    bs = 1024 * 1024
-                    if size >= bs:
-                        zeros = b'\0' * bs
-                        for _ in range(size // bs):
-                            f.write(zeros)
-                        del zeros
-                    f.write(b'\0' * (size % bs))
-                    assert f.tell() == size
+                os.ftruncate(self.fd, size)
             self.buffer = mmap.mmap(self.fd, self.size)
 
+        def _choose_dir(self, size):
+            # Choose a non-storage backed directory if possible,
+            # to improve performance
+            for d in self._dir_candidates:
+                st = os.statvfs(d)
+                if st.f_bavail * st.f_frsize >= size:  # enough free space?
+                    return d
+            return util.get_temp_dir()
+
     def reduce_arena(a):
         if a.fd == -1:
             raise ValueError('Arena is unpicklable because '
diff --git a/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst b/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
new file mode 100644
index 00000000000..44c3a22bc85
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2017-07-23-11-33-10.bpo-30919.5dYRru.rst
@@ -0,0 +1,4 @@
+Fix shared memory performance regression in multiprocessing in 3.x.
+
+Shared memory used anonymous memory mappings in 2.x, while 3.x mmaps actual
+files. Try to be careful to do as little disk I/O as possible.
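
For readers who want to try the approach outside of multiprocessing, here is a
minimal standalone sketch of the two ideas in the patch: prefer a directory
that is not backed by storage when it has enough free space, and size the
file with os.ftruncate() instead of writing zeros. The names choose_dir,
make_arena and DIR_CANDIDATES are hypothetical and are not part of
Lib/multiprocessing/heap.py. Unlike the patch, the sketch skips candidates
that cannot be stat'ed, and it uses tempfile.gettempdir() as a stand-in for
util.get_temp_dir(). It assumes a POSIX system.

```python
import mmap
import os
import sys
import tempfile

# Hypothetical module-level equivalent of Arena._dir_candidates.
DIR_CANDIDATES = ['/dev/shm'] if sys.platform == 'linux' else []


def choose_dir(size, candidates=DIR_CANDIDATES):
    """Return the first candidate directory with at least `size` free bytes."""
    for d in candidates:
        try:
            st = os.statvfs(d)
        except (OSError, AttributeError):
            # Assumption beyond the patch: skip directories that are missing
            # (or platforms without statvfs) instead of raising.
            continue
        if st.f_bavail * st.f_frsize >= size:  # enough free space?
            return d
    return tempfile.gettempdir()


def make_arena(size, directory=None):
    """Create an unlinked temporary file of `size` bytes and mmap it."""
    if directory is None:
        directory = choose_dir(size)
    fd, name = tempfile.mkstemp(prefix='pym-%d-' % os.getpid(), dir=directory)
    os.unlink(name)         # the file stays alive until fd is closed
    os.ftruncate(fd, size)  # extend the file without writing any data
    return fd, mmap.mmap(fd, size)
```

The caller owns both the descriptor and the mapping, so a typical use is
`fd, buf = make_arena(1 << 20)` followed by `buf.close()` and `os.close(fd)`
when done. The real Arena class registers os.close through util.Finalize
instead.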