<div dir="ltr"><div class="">
<p>When I try to run lzma compression in parallel with <code>multiprocessing.Pool</code> (first code block below), the call
hangs for a very long time. The non-parallel version, which uses the built-in
<code>map()</code>, works fine (second code block below).</p>
<p>Environment: Python 3.3.2
[GCC 4.6.3] on linux</p>
<pre style class=""><code>import lzma
from functools import partial
import multiprocessing

def run_lzma(data, c):
    return c.compress(data)

def split_len(seq, length):
    return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]

def lzma_mp(sequence, threads=3):
    lzc = lzma.LZMACompressor()
    blocksize = int(round(len(sequence)/threads))
    strings = split_len(sequence, blocksize)
    lzc_partial = partial(run_lzma, c=lzc)
    pool = multiprocessing.Pool()
    lzc_pool = list(pool.map(lzc_partial, strings))
    pool.close()
    pool.join()
    out_flush = lzc.flush()
    return b"".join(lzc_pool + [out_flush])

sequence = 'AAAAAJKDDDDDDDDDDDDDDDDDDDDDDDDDDDDGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWWWWWWWWWWWWWWWWWWWWWWWWWWWWWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'
lzma_mp(sequence, threads=3)</code></pre>
<p>Using the same <code>LZMACompressor</code> with the built-in <code>map()</code> function works fine:</p>
<pre style class=""><code>threads = 3
blocksize = int(round(len(sequence)/threads))
strings = split_len(sequence, blocksize)
lzc = lzma.LZMACompressor()
out = list(map(lzc.compress, strings))
out_flush = lzc.flush()
result = b"".join(out + [out_flush])
lzma.compress(str.encode(sequence))
lzma.compress(str.encode(sequence)) == result</code></pre>
<p>The built-in <code>map()</code> with the partial function works fine as well:</p>
<pre style class=""><code>lzc = lzma.LZMACompressor()
lzc_partial = partial(run_lzma, c=lzc)
out = list(map(lzc_partial, strings))
out_flush = lzc.flush()
result = b"".join(out + [out_flush])</code></pre>
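<p>For comparison, here is a minimal sketch of a variant that does not share one <code>LZMACompressor</code> across processes. It assumes the hang is related to sending the compressor object to the workers, which I have not confirmed. Each chunk is compressed as its own independent stream with <code>lzma.compress</code>, and the streams are concatenated. The names <code>split_bytes</code> and <code>compress_parallel</code> are made up for this illustration. The output is not byte-identical to the single-stream result above, and the compression ratio may be worse because the chunks share no dictionary.</p>

```python
import lzma
import multiprocessing


def split_bytes(data, parts):
    """Split data into at most `parts` roughly equal byte chunks (hypothetical helper)."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1 byte
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data, workers=3):
    """Compress each chunk as a separate .xz stream and concatenate the streams."""
    chunks = split_bytes(data, workers)
    if not chunks:
        # Nothing to split: return a single valid stream for empty input.
        return lzma.compress(b"")
    # Only bytes cross the process boundary; no compressor object is shared.
    with multiprocessing.Pool(processes=workers) as pool:
        streams = pool.map(lzma.compress, chunks)
    return b"".join(streams)


if __name__ == "__main__":
    payload = ("AAAAAJKDDDDDDDD" * 1000).encode()
    packed = compress_parallel(payload, workers=3)
    # lzma.decompress accepts a concatenation of several compressed streams.
    assert lzma.decompress(packed) == payload
```

<p>This relies on <code>lzma.decompress</code> decoding concatenated streams and returning the joined result, so the round trip recovers the original bytes even though the compressed output differs from <code>lzma.compress(str.encode(sequence))</code>.</p>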
</div></div>