minimum cost MCP with IPython Parallel
Hi,

I am trying to use graph.MCP_Geometric(costs) on a cluster. The script and methods I use work on my personal computer, which has 64GB of memory. I am now trying to run the same code on a cluster using IPython parallel.

Here is the problem: each node/task has only about 20GB of memory, so it cannot even complete the mcp = graph.MCP_Geometric(costs) command. My cost data has shape (12837, 43345). I am trying IPython parallel because I think it should allow me to overcome the lack of memory on each node.

E.g., if I define a parallel function like

    @dview.parallel(block=True)
    def costsnx(costs):
        mcp = graph.MCP_Geometric(costs)
        return mcp

and run it

    mcpp = costsnx(costs)

I get a list of mcp objects like <skimage.graph._mcp.MCP_Geometric at 0x2d6fdd0>, one for each process I created. So in principle this would solve the memory problem, but the individual mcp objects do not seem to work. E.g., executing

    mcpp[30].find_costs([[0,0]])

generates

    TypeError: object of type 'NoneType' has no len()

Also, there is no way to join them to recreate the real graph. I also tried including the mcp.find_costs(location) command in the parallel function, but it cannot find the location.

I was thinking that perhaps there is a way to use the intermediate steps of the graph.MCP_Geometric function to get the indices, or something else that could be constructed in parallel and then joined back into a single object to compute minimum_costs.

Any ideas on how this could be implemented? Or any idea on how to tackle the problem at all? I appreciate any help or pointers.

Thanks,
OOM
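
For reference, here is a back-of-the-envelope check of the memory figures quoted above. It is only a sketch under two assumptions that the post does not state: that the cost array is dense float64, and that "20GB" means 20 × 10^9 bytes. It says nothing about what MCP_Geometric allocates internally. The helper name array_bytes is made up for this illustration.

```python
from math import prod

SHAPE = (12837, 43345)     # cost array shape quoted in the post
NODE_BYTES = 20 * 10**9    # "about 20GB" per node, read as decimal GB (assumption)


def array_bytes(shape, itemsize):
    """Bytes needed for one dense array of this shape and element size."""
    return prod(shape) * itemsize


n_cells = prod(SHAPE)
float64_bytes = array_bytes(SHAPE, 8)   # assumed dtype of the cost array
float32_bytes = array_bytes(SHAPE, 4)   # shown only for comparison

# How many full-size float64 arrays could sit on one node at the same time.
copies_that_fit = NODE_BYTES // float64_bytes

print(f"cells:            {n_cells:,}")
print(f"one float64 copy: {float64_bytes / 1e9:.2f} GB")
print(f"one float32 copy: {float32_bytes / 1e9:.2f} GB")
print(f"float64 copies that fit in 20 GB: {copies_that_fit}")
```

Under those assumptions a single float64 copy of the cost array is about 4.45 GB, so only four full-size arrays of that kind fit in 20 GB at once.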