Hi everyone,
I was hoping to get some theoretical details on the algorithm of find_costs for the MCP and MCP_Geometric classes. I am not a computer scientist, so I've had some trouble drawing conclusions from the source code alone; I hope someone can help me :)
First, let me give you some context (although the questions don't depend on it): I am trying to find the most efficient way possible to solve the following problem: I have a (4359, 4950) raster and 5000 georeferenced points. I need to find the shortest path between all these points (basically, the output would be the triangle part of a symmetric matrix). I have run some tests using MCP, MCP_Geometric and the package "gdistance" in R, which claims to implement Djikistra's algorithm in the costDistance function (it actually relies on package "igraph", which use C written functions in the background as well). MCP.find_costs() and MC_Geometric.find_costs() have performed consistently better, even for a small sample of points (e.g. 10), so I am skeptical that the difference is due to R being slower than Python (plus, I've run the examples on a jupyter notebook). Not only that, but while MCP and MCP_Geometric don't use more than 1GB of RAM memory, the examples in R use more than 10GB.
So, firstly, what I would like to ask is:
1- What is the algorithm implemented in MCP? Is it simply Djikstra's, some other known algorithm, or something new?
2- If it is something new, what would be the complexity?
3- How is it so memory efficient?
At last I wanted to check if I got something straight: the difference of MCP.find_costs() to MCP_Geometric.find_costs() is just that in MCP_Geometric the matrix of costs needs to be recalculated before running the algorithm, is that right? Or does it affect the internal process of finding the shortest path in some other way?
Thank you for reading this far :) This implementation seems very unique and efficient, and I am just trying to better understand it.