[New-bugs-announce] [issue30323] concurrent.futures.Executor.map() consumes all memory when big generators are used
Klamann
report at bugs.python.org
Tue May 9 17:37:02 EDT 2017
New submission from Klamann:
Executor.map() accepts a function and an iterable that supplies the arguments for each call to that function. This iterable could be a generator, and as such it could reference data that won't fit into memory.
The behaviour I would expect is that the Executor requests the next element from the iterable only when a worker (a thread or a process) is ready to make the next function call.
What actually happens is that the entire iterable is consumed to build a list as soon as map() is called, so any underlying generator loads all referenced data into memory. Here's where the list gets built from the iterable:
https://github.com/python/cpython/blob/3.6/Lib/concurrent/futures/_base.py#L548
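The eager consumption can be checked without relying on timing. The following is a minimal sketch, not part of the library: the helper name count_items_pulled_by_map and its bookkeeping list are made up for illustration. It records how many items have been pulled from a generator by the time map() returns, before any result is requested.

```python
from concurrent.futures import ThreadPoolExecutor


def count_items_pulled_by_map(n_items=5):
    """Return how many items map() pulled before any result was requested.

    Hypothetical helper for illustration only.
    """
    pulled = []

    def source():
        for i in range(n_items):
            pulled.append(i)  # record every item the executor asks for
            yield i

    with ThreadPoolExecutor(max_workers=1) as executor:
        results = executor.map(str, source())
        # map() has returned, but no result has been requested yet.
        pulled_before_first_result = len(pulled)
        list(results)  # drain the results so the pool shuts down cleanly
    return pulled_before_first_result
```

With the default arguments of map(), the returned count equals n_items, meaning the generator was exhausted before the first result was read.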
The way I see it, there's no reason to convert the iterable to a list in the map function (or any other place in the Executor). Just replacing the list comprehension with a generator expression would probably fix that.
Here's an example that illustrates the issue:
from concurrent.futures import ThreadPoolExecutor
import time

def generate():
    for i in range(10):
        print("generating input", i)
        yield i

def work(i):
    print("working on input", i)
    time.sleep(1)

with ThreadPoolExecutor(max_workers=2) as executor:
    generator = generate()
    executor.map(work, generator)
The output is:
generating input 0
working on input 0
generating input 1
working on input 1
generating input 2
generating input 3
generating input 4
generating input 5
generating input 6
generating input 7
generating input 8
generating input 9
working on input 2
working on input 3
working on input 4
working on input 5
working on input 6
working on input 7
working on input 8
working on input 9
Ideally, the lines should alternate, but currently all input is generated immediately.
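For comparison, here is a rough sketch of the lazy behaviour described above, written as a standalone helper rather than as a patch to the Executor. The name lazy_map and its max_pending parameter are assumptions made for this illustration and are not part of concurrent.futures. It keeps at most max_pending calls submitted at any time and pulls the next input only after an earlier result has been collected.

```python
import collections
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


def lazy_map(executor, fn, iterable, max_pending):
    """Yield fn(item) for each item, in input order.

    Hypothetical helper: at most max_pending calls are submitted at once,
    so the iterable is consumed gradually instead of all up front.
    """
    it = iter(iterable)
    pending = collections.deque()

    # Prime the queue with the first max_pending items only.
    for item in itertools.islice(it, max_pending):
        pending.append(executor.submit(fn, item))

    while pending:
        # Wait for the oldest call so results keep the input order.
        result = pending.popleft().result()
        # A slot is free now: pull exactly one more input, if there is one.
        for item in itertools.islice(it, 1):
            pending.append(executor.submit(fn, item))
        yield result


if __name__ == "__main__":
    def generate():
        for i in range(10):
            print("generating input", i)
            yield i

    def work(i):
        print("working on input", i)
        time.sleep(0.01)
        return i

    with ThreadPoolExecutor(max_workers=2) as executor:
        for _ in lazy_map(executor, work, generate(), max_pending=2):
            pass
```

Unlike map(), this helper does nothing until its result is iterated, and it waits on the oldest call first, so one slow call can delay refilling the queue. It is only meant to show that input can be requested on demand.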
----------
messages: 293353
nosy: Klamann
priority: normal
severity: normal
status: open
title: concurrent.futures.Executor.map() consumes all memory when big generators are used
type: resource usage
versions: Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30323>
_______________________________________