random walker segmentation
Hello, a while ago, I contributed to skimage an implementation of the random walker segmentation algorithm (which has been improved and extended by many others since then). This algorithm computes a multilabel segmentation using seeds (already labeled pixels), by determining for an unlabeled pixel the probability that a seed diffuses to the pixel (with an anisotropic diffusion coefficient depending on the gradient between neighboring pixels). In the current implementation in skimage, the computation of the probability map is done by inverting a large sparse linear system (involving the Laplacian of the graph of pixels). Different methods can be chosen to solve the linear system: a brute force inversion only works for tiny images; a conjugate gradient method works well but is quite slow. If the package pyamg is available, a multigrid method is used to compute a preconditioner, which speeds up the algorithm  but it requires pyamg. Also, the memory cost of the algorithm is important (linear, I think, though. I haven't yet taken the time to use a memory profiler but I should). Recently, a colleague brought to my attention that the linear system was just a set of fixedpoint equations, that could be solved iteratively. Indeed, the solution verifies that the probability of a pixel is the weighted sum (with weights on edges that are a decreasing function of gradients) of the probabilities of its neighbors. I have written a quick and dirty implementation (only for 2D grayscale images and for 2 labels) of this "local" version, available on https://github.com/emmanuelle/scikits.image/blob/local_random_walker/skimage... It turns out that this implementation is slower than the conjugate gradient with multigrid acceleration (typically 2 to three times slower), but it has several advantages. First, it can be as fast as the "simple" conjugate gradient (without pyamg's muligrid acceleration), which is the mode that most users will use (we don't expect users to install pymag when they are just trying out algorithms). Second, its memory cost is lower (for example, the weight of an edge is stored only once, while it appears twice in the Laplacian matrix). Finally, because the operations only involve neighboring pixels, it is possible that further speed optimization can be done (using cython... or maybe a GPU implementation, even if we're not that far yet with skimage). So, should we replace the linear algebra implementation with this simpler local and iterative implementation ? I'd be interested in knowing about your opinion. Cheers, Emmanuelle
Hi Emmanuelle, I think this would be a good step forward, and would be happy to help! My guess is that this approach could be sped up further in Cython with code we control, making it easier to maintain and expose similar performance to all users. The memory issue is a real one, particularly for larger (multichannel) datasets. That alone probably justifies the addition, even at the expense of a small performance hit. But we'll see how small we can make the hit ;) Also, the iterative solution framework might be useful for other algorithms. Josh On Tuesday, July 30, 2013 3:50:16 PM UTC5, Emmanuelle Gouillart wrote:
Hello,
a while ago, I contributed to skimage an implementation of the random walker segmentation algorithm (which has been improved and extended by many others since then). This algorithm computes a multilabel segmentation using seeds (already labeled pixels), by determining for an unlabeled pixel the probability that a seed diffuses to the pixel (with an anisotropic diffusion coefficient depending on the gradient between neighboring pixels).
In the current implementation in skimage, the computation of the probability map is done by inverting a large sparse linear system (involving the Laplacian of the graph of pixels). Different methods can be chosen to solve the linear system: a brute force inversion only works for tiny images; a conjugate gradient method works well but is quite slow. If the package pyamg is available, a multigrid method is used to compute a preconditioner, which speeds up the algorithm  but it requires pyamg. Also, the memory cost of the algorithm is important (linear, I think, though. I haven't yet taken the time to use a memory profiler but I should).
Recently, a colleague brought to my attention that the linear system was just a set of fixedpoint equations, that could be solved iteratively. Indeed, the solution verifies that the probability of a pixel is the weighted sum (with weights on edges that are a decreasing function of gradients) of the probabilities of its neighbors. I have written a quick and dirty implementation (only for 2D grayscale images and for 2 labels) of this "local" version, available on
https://github.com/emmanuelle/scikits.image/blob/local_random_walker/skimage...
It turns out that this implementation is slower than the conjugate gradient with multigrid acceleration (typically 2 to three times slower), but it has several advantages. First, it can be as fast as the "simple" conjugate gradient (without pyamg's muligrid acceleration), which is the mode that most users will use (we don't expect users to install pymag when they are just trying out algorithms). Second, its memory cost is lower (for example, the weight of an edge is stored only once, while it appears twice in the Laplacian matrix). Finally, because the operations only involve neighboring pixels, it is possible that further speed optimization can be done (using cython... or maybe a GPU implementation, even if we're not that far yet with skimage).
So, should we replace the linear algebra implementation with this simpler local and iterative implementation ? I'd be interested in knowing about your opinion.
Cheers, Emmanuelle
Hi Josh, thanks for your answer! I'll probably work on the new implementation during the Euroscipy sprint, we'll see whether Cython can decrese the small performance hit :). Cheers, Emmanuelle On Wed, Jul 31, 2013 at 03:31:19PM 0700, Josh Warner wrote:
Hi Emmanuelle,
I think this would be a good step forward, and would be happy to help! My guess is that this approach could be sped up further in Cython with code we control, making it easier to maintain and expose similar performance to all users.
The memory issue is a real one, particularly for larger (multichannel) datasets. That alone probably justifies the addition, even at the expense of a small performance hit. But we'll see how small we can make the hit ;)
Also, the iterative solution framework might be useful for other algorithms.
Josh
On Tuesday, July 30, 2013 3:50:16 PM UTC5, Emmanuelle Gouillart wrote:
Hello,
a while ago, I contributed to skimage an implementation of the random walker segmentation algorithm (which has been improved and extended by many others since then). This algorithm computes a multilabel segmentation using seeds (already labeled pixels), by determining for an unlabeled pixel the probability that a seed diffuses to the pixel (with an anisotropic diffusion coefficient depending on the gradient between neighboring pixels).
In the current implementation in skimage, the computation of the probability map is done by inverting a large sparse linear system (involving the Laplacian of the graph of pixels). Different methods can be chosen to solve the linear system: a brute force inversion only works for tiny images; a conjugate gradient method works well but is quite slow. If the package pyamg is available, a multigrid method is used to compute a preconditioner, which speeds up the algorithm  but it requires pyamg. Also, the memory cost of the algorithm is important (linear, I think, though. I haven't yet taken the time to use a memory profiler but I should).
Recently, a colleague brought to my attention that the linear system was just a set of fixedpoint equations, that could be solved iteratively. Indeed, the solution verifies that the probability of a pixel is the weighted sum (with weights on edges that are a decreasing function of gradients) of the probabilities of its neighbors. I have written a quick and dirty implementation (only for 2D grayscale images and for 2 labels) of this "local" version, available on https://github.com/emmanuelle/scikits.image/blob/local_random_walker/ skimage/segmentation/local_random_walker.py
It turns out that this implementation is slower than the conjugate gradient with multigrid acceleration (typically 2 to three times slower), but it has several advantages. First, it can be as fast as the "simple" conjugate gradient (without pyamg's muligrid acceleration), which is the mode that most users will use (we don't expect users to install pymag when they are just trying out algorithms). Second, its memory cost is lower (for example, the weight of an edge is stored only once, while it appears twice in the Laplacian matrix). Finally, because the operations only involve neighboring pixels, it is possible that further speed optimization can be done (using cython... or maybe a GPU implementation, even if we're not that far yet with skimage).
So, should we replace the linear algebra implementation with this simpler local and iterative implementation ? I'd be interested in knowing about your opinion.
Cheers, Emmanuelle
On Mon, Aug 5, 2013 at 8:36 AM, Emmanuelle Gouillart <emmanuelle.gouillart@nsup.org> wrote:
thanks for your answer! I'll probably work on the new implementation during the Euroscipy sprint, we'll see whether Cython can decrese the small performance hit :).
Should we file an issue for this, so that we can track the technical aspects of the discussion there until a PR can be made? Stéfan
participants (3)

Emmanuelle Gouillart

Josh Warner

Stéfan van der Walt