Re: Review: Canny

5 Apr 2011

      I can answer two of these, quickly.

The mask can be arbitrary.

The smooth function should probably be kept separate, since in the
future it will probably be used by other CellProfiler-based functions.

Ray Jones

On Tue, Apr 5, 2011 at 04:56, Chris Colbert <sccolbert@gmail.com> wrote:
...
Hey Dan,
I just had a look at this code, and there are some things I would do before
pulling in the changes:
Document which dtype is expected of the input image. It appears to work for
floats, int, uint8 etc, but each gives a different output result. Since the
tests are using floats, I assume a floating point image is expected. This
should be documented in the docstring along with the expected range of the
image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image.
Also, the docs should state it only works on 2D grayscale images, so the
user needs to convert their color image beforehand.
Document the valid ranges of values for the low threshold, and high
threshold.
There is a lot of use of np.logical_* functions. These are around 20% slower
than numpy's overloaded bitwise & and | operators. It seems these logical_*
functions are being applied to boolean images so the two operations (logical
and bitwise) are equivalent. I would use the faster of the two. It's faster
and easier to read.
ndimage has a sobel function, was there a particular reason you chose to do
a 2d convolution instead? 2 separable 1d convolutions as done by
ndimage.sobel should be faster than a 2d convolution (especially since
ndimage does optimizing cache manipulations under the covers).
Does the mask operate over any generic area, or is the mask expected to be
rectangular? If it's a rectangular mask, is there really a need for it when
the user could just pass in a slice to their image instead.
A minor pedantic gripe; I like to have spaces after commas in array indices
and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2), arr[:,5:6]. A
space immediately after a comma is recommended in PEP8.
Variable naming. Lots of single and two letter variables like 'c', 'c1,
'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names.
For computing magntitude, use np.hypot(isobel, jsobel), instead of the
manual computation you use. np.hypot is 2x faster on a 1000x1000 image since
it doesn't create any temporaries.
inline the smoothing the function, unless you're using it elsewhere.
Lastly, thanks for working and spending time on this code!
Cheers!
Chris
2011/4/2 Dan Farmer <dfarmernv@gmail.com>
...
I've pushed the suggested changes :
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks,
Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
...
I will try to address your comments tonight and also see about
removing smooth.py and just incorporating the one function we're using
into canny.py for now as Thouis suggested. Thanks to both of you for
looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
...
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com>
wrote:
...
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
...
Mostly just trying to follow procedure. I already mentioned my
concerns in the previous thread. I made one stab at introduced a
"None" default for the mask, but I got hung up and reverted it. The
default I was going to propose was np.ones(img.shape,bool) (and after
the fact I even noticed that's how it is used in one of the unit
tests). But I started thinking that that could be quite wasteful of
memory if you were working with large images (on my test use case with
~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms
relies on creating the mask, so I don't think you can get away from
it.  The typical way to do it would be:
def canny(..., mask=None, ...):
   if mask is None:
       mask = np.ones(x.shape, dtype=bool)
...
The problem I had was that if I don't allocate the emask array I get
run-time errors starting at line 129 (in the diff of canny.py) because
the arrays all have different lengths if they aren't logical_and'd
with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is
to rewrite the algorithm in Cython, where you can modify behaviour
inside the for-loop.
Regards
Stéfan

Re: Review: Canny

Thouis (Ray) Jones