minimum cost MCP with IPython Parallel
OOM
omer.ozak at gmail.com
Mon Jan 19 16:16:31 EST 2015
Hi,
I am trying to use graph.MCP_Geometric(costs) on a cluster. The script and
methods I use work on my personal computer, which has 64GB of memory. I am
now trying to run the same code on a cluster using IPython parallel.
The problem is that each node/task has only about 20GB of memory, so none
of them can even complete the
    mcp = graph.MCP_Geometric(costs)
command. My cost data has shape (12837, 43345).
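As a rough check on why 20GB per node is tight, the sketch below estimates the size of a single array of that shape. It assumes the costs are stored as float64, which is not stated above; the MCP object presumably keeps further per-node arrays besides the costs, so the real footprint is likely a multiple of this figure.

```python
import numpy as np

shape = (12837, 43345)          # shape of the cost array given above
n_cells = shape[0] * shape[1]   # number of grid nodes

# Assumption: one float64 (8 bytes) per cell.
bytes_per_cell = np.dtype(np.float64).itemsize
costs_gib = n_cells * bytes_per_cell / 2**30

print(f"{n_cells:,} cells, about {costs_gib:.2f} GiB per float64 array")
```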
I am trying IPython parallel since I think it should allow me to overcome
the lack of memory on each node. For example, if I define a parallel
function like
    @dview.parallel(block=True)
    def costsnx(costs):
        mcp = graph.MCP_Geometric(costs)
        return mcp
and run it
    mcpp = costsnx(costs)
I get a list of mcp objects like
<skimage.graph._mcp.MCP_Geometric at 0x2d6fdd0>, one for each process I
created. In principle this would solve the memory problem, but the
individual mcp objects do not seem to work. For example, executing
    mcpp[30].find_costs([[0,0]])
generates
    TypeError: object of type 'NoneType' has no len()
Also, there is no way to join them to recreate the real graph. I also tried
including the
    mcp.find_costs(location)
command in the parallel function, but it cannot find the location. I was
thinking that perhaps there is a way to use the intermediate steps of the
graph.MCP_Geometric function to get the indices, or something else that
could be constructed in parallel and then joined back into a single object
to compute minimum_costs.
Any ideas on how this could be implemented? Or any idea on how to tackle
the problem at all?
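To make the joining problem concrete, here is a minimal sketch of why results computed on separate blocks of rows cannot simply be glued together. It is not skimage's MCP_Geometric: grid_costs is a hypothetical plain Dijkstra on a 4-connected grid where a path costs the sum of the cells it visits, and the 3x3 array is toy data.

```python
import heapq
import numpy as np

def grid_costs(costs, start):
    """Dijkstra on a 4-connected grid; path cost = sum of visited cell costs."""
    n_rows, n_cols = costs.shape
    best = np.full(costs.shape, np.inf)
    best[start] = costs[start]
    heap = [(float(best[start]), start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > best[r, c]:
            continue  # stale heap entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                nd = d + float(costs[rr, cc])
                if nd < best[rr, cc]:
                    best[rr, cc] = nd
                    heapq.heappush(heap, (nd, (rr, cc)))
    return best

# Toy cost surface: an expensive wall in the middle column, open at the bottom.
costs = np.array([[1., 9., 1.],
                  [1., 9., 1.],
                  [1., 1., 1.]])

full = grid_costs(costs, (0, 0))        # whole grid
strip = grid_costs(costs[:2], (0, 0))   # rows 0-1 only, as one worker might see

full_cost = float(full[0, 2])
strip_cost = float(strip[0, 2])
print(full_cost, strip_cost)  # 7.0 11.0
```

The cheapest route from (0, 0) to (0, 2) goes around the wall through row 2, which the two-row block never sees, so the block-local answer is too high. Any scheme that splits the grid would therefore need to exchange costs along block boundaries rather than just concatenating per-block results.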
I appreciate any help or pointers.
Thanks
OOM