Hi everyone,
I was hoping to get some theoretical details on the algorithm of find_costs
for the MCP and MCP_Geometric classes. I am not a computer scientist, so
I've had some trouble drawing conclusions from the source code alone; I
hope someone can help me :)
First, let me give you some context (although the questions don't depend on
it): I am trying to find the most efficient way possible to solve the
following problem: I have a (4359, 4950) raster and 5000 georeferenced
points. I need to find the shortest path between all these points
(basically, the output would be the triangle part of a symmetric matrix). I
have run some tests using MCP, MCP_Geometric and the package "gdistance" in
R, which claims to implement Djikistra's algorithm in the costDistance
function (it actually relies on package "igraph", which use C written
functions in the background as well). MCP.find_costs() and
MC_Geometric.find_costs() have performed consistently better, even for a
small sample of points (e.g. 10), so I am skeptical that the difference is
due to R being slower than Python (plus, I've run the examples on a jupyter
notebook). Not only that, but while MCP and MCP_Geometric don't use more
than 1GB of RAM memory, the examples in R use more than 10GB.
So, firstly, what I would like to ask is:
1- What is the algorithm implemented in MCP? Is it simply Djikstra's, some
other known algorithm, or something new?
2- If it is something new, what would be the complexity?
3- How is it so memory efficient?
At last I wanted to check if I got something straight: the difference of
MCP.find_costs() to MCP_Geometric.find_costs() is just that in
MCP_Geometric the matrix of costs needs to be recalculated before running
the algorithm, is that right? Or does it affect the internal process of
finding the shortest path in some other way?
Thank you for reading this far :) This implementation seems very unique and
efficient, and I am just trying to better understand it.
--
Brenda