
Matěj Týč <matej.tyc@gmail.com> wrote:
Does it mean that if you pass the numpy array to the child process using Queue, no significant amount of data will flow through it?
This is what my shared memory arrays do.
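To illustrate the difference (a sketch using the stdlib multiprocessing.shared_memory module from Python 3.8+, not the shared memory arrays discussed in this thread): pickling an ordinary array serializes all of its data, while pickling a shared-memory handle serializes only a few bytes of metadata.

```python
# Sketch: a Queue pickles whatever you put on it. Pickling a NumPy
# array copies all of its data; pickling a shared-memory handle copies
# only metadata (essentially the block's name).
import pickle
import numpy as np
from multiprocessing import shared_memory

a = np.zeros(1_000_000, dtype=np.float64)    # ~8 MB of array data
full = len(pickle.dumps(a))                  # roughly 8 MB serialized

shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
meta = len(pickle.dumps(shm))                # metadata only
shm.close()
shm.unlink()

print(full, meta)
```

So an array whose data lives in shared memory can go through a Queue cheaply: only the handle travels, not the buffer.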
Or I shouldn't pass it using Queue at all and just rely on inheritance?
This is what David Baddeley's shared memory arrays do.
Finally, I assume that passing it as an argument to the Process class is the worst option, because it will be pickled and unpickled.
My shared memory arrays only pickle the metadata, so they can be used in this way.
Or maybe you refer to modules such as joblib that use this functionality and expose only a nice interface?
Joblib creates "shared memory" by memory-mapping a temporary file, which is backed by RAM on Linux (tmpfs). It is backed by a physical file on disk on Mac and Windows. In this respect, joblib is much better on Linux than on Mac or Windows.
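The mechanism can be sketched with numpy.memmap (this is an illustration of the idea, not joblib's actual code): every process that maps the same file sees the same pages, and on Linux a file placed under /dev/shm (tmpfs) is RAM-backed, whereas elsewhere it lives on disk.

```python
# Sketch of memory-mapped "shared memory": two mappings of the same
# temporary file see the same data.
import tempfile
import numpy as np

with tempfile.NamedTemporaryFile() as f:
    mm = np.memmap(f.name, dtype=np.float64, mode="w+", shape=(1000,))
    mm[:] = 1.0
    mm.flush()
    # A second, read-only mapping of the same file sees the writes.
    view = np.memmap(f.name, dtype=np.float64, mode="r", shape=(1000,))
    total = float(view.sum())

print(total)   # 1000.0
```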
And finally, COW (copy-on-write) means that returning large arrays still involves data moving between processes, whereas the shm approach has a workaround: the parent process can preallocate the result array, and the worker process can write into it.
My shared memory arrays need no workaround for this: they allow shared memory arrays to be returned to the parent process, and no preallocation is needed.

Sturla