Re: Review: Canny

4 Apr 2011

      Hey Dan,

I just had a look at this code, and there are some things I would do before
pulling in the changes:

   - Document which dtype is expected of the input image. It appears to work
   for floats, int, uint8 etc, but each gives a different output result. Since
   the tests are using floats, I assume a floating point image is expected.
   This should be documented in the docstring along with the expected range of
   the image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image.
   Also, the docs should state it only works on 2D grayscale images, so the
   user needs to convert their color image beforehand.
   - Document the valid ranges of values for the low threshold, and high
   threshold.
   - There is a lot of use of np.logical_* functions. These are around 20%
   slower than numpy's overloaded bitwise & and | operators. It seems these
   logical_* functions are being applied to boolean images so the two
   operations (logical and bitwise) are equivalent. I would use the faster of
   the two. It's faster and easier to read.
   - ndimage has a sobel function, was there a particular reason you chose
   to do a 2d convolution instead? 2 separable 1d convolutions as done by
   ndimage.sobel should be faster than a 2d convolution (especially since
   ndimage does optimizing cache manipulations under the covers).
   - Does the mask operate over any generic area, or is the mask expected to
   be rectangular? If it's a rectangular mask, is there really a need for it
   when the user could just pass in a slice to their image instead.
   - A minor pedantic gripe; I like to have spaces after commas in array
   indices and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2),
   arr[:,5:6]. A space immediately after a comma is recommended in PEP8.
   - Variable naming. Lots of single and two letter variables like 'c', 'c1,
   'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names.
   - For computing magntitude, use np.hypot(isobel, jsobel), instead of the
   manual computation you use. np.hypot is 2x faster on a 1000x1000 image since
   it doesn't create any temporaries.
   - inline the smoothing the function, unless you're using it elsewhere.

Lastly, thanks for working and spending time on this code!

Cheers!

Chris

2011/4/2 Dan Farmer <dfarmernv@gmail.com>
...
I've pushed the suggested changes :
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks,
Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
...
I will try to address your comments tonight and also see about
removing smooth.py and just incorporating the one function we're using
into canny.py for now as Thouis suggested. Thanks to both of you for
looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
...
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com>
wrote:
...
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
...
Mostly just trying to follow procedure. I already mentioned my
concerns in the previous thread. I made one stab at introduced a
"None" default for the mask, but I got hung up and reverted it. The
default I was going to propose was np.ones(img.shape,bool) (and after
the fact I even noticed that's how it is used in one of the unit
tests). But I started thinking that that could be quite wasteful of
memory if you were working with large images (on my test use case with
~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms
relies on creating the mask, so I don't think you can get away from
it.  The typical way to do it would be:
def canny(..., mask=None, ...):
   if mask is None:
       mask = np.ones(x.shape, dtype=bool)
...
The problem I had was that if I don't allocate the emask array I get
run-time errors starting at line 129 (in the diff of canny.py) because
the arrays all have different lengths if they aren't logical_and'd
with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is
to rewrite the algorithm in Cython, where you can modify behaviour
inside the for-loop.
Regards
Stéfan

Re: Review: Canny

Chris Colbert