https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca... Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask). The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above. So anyway, I thought I would let you take a look at the code and decide if you'd like me to try to figure out a way around that or incorporate it as is (with no default) or add the np.ones default. The tests pass and as I mentioned it has worked fine so far in day-to-day use. Also regarded Ralf's comment to consolidate into one filter.py file, I agree with that I just figured I would do that with the next batch (in a separate commit). But again, let me know if I'm not approaching this properly. Thanks, Dan
I'll try to look at this this week/weekend. I actually need a canny detector next week, so I have incentive to get this pushed :) On Wed, Mar 30, 2011 at 7:02 PM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
So anyway, I thought I would let you take a look at the code and decide if you'd like me to try to figure out a way around that or incorporate it as is (with no default) or add the np.ones default.
The tests pass and as I mentioned it has worked fine so far in day-to-day use. Also regarded Ralf's comment to consolidate into one filter.py file, I agree with that I just figured I would do that with the next batch (in a separate commit). But again, let me know if I'm not approaching this properly.
Thanks, Dan
I wouldn't be concerned about allocating a mask of bools the same size as the input array. Most of the operations on the image data are going to create temporary arrays of floats, eight times bigger. You could conserve some memory by reusing the variable: mask = binary_erosion(mask, ...) but that's sacrificing a bit of clarity for probably a not very large savings. Speaking of clarity, I would replace "emask" with "eroded_mask". Given the arguments above, I suggest moving mask to the last argument, make it a keyword with a default of None, which is replaced with ones(image.shape, dtype=bool) Also, I would move the smooth_with_function_and_mask() into canny.py and remove smooth.py, for now. The other functions in smooth.py may be needed in the future, but not yet. Thouis Jones On Thu, Mar 31, 2011 at 01:02, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
So anyway, I thought I would let you take a look at the code and decide if you'd like me to try to figure out a way around that or incorporate it as is (with no default) or add the np.ones default.
The tests pass and as I mentioned it has worked fine so far in day-to-day use. Also regarded Ralf's comment to consolidate into one filter.py file, I agree with that I just figured I would do that with the next batch (in a separate commit). But again, let me know if I'm not approaching this properly.
Thanks, Dan
Hi Dan On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be: def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop. Regards Stéfan
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes. -Dan 2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
I've pushed the suggested changes : https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca... Thanks, Dan 2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
Looks good to me. 2011/4/2 Dan Farmer <dfarmernv@gmail.com>:
I've pushed the suggested changes : https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks, Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
Hey Dan, I just had a look at this code, and there are some things I would do before pulling in the changes: - Document which dtype is expected of the input image. It appears to work for floats, int, uint8 etc, but each gives a different output result. Since the tests are using floats, I assume a floating point image is expected. This should be documented in the docstring along with the expected range of the image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image. Also, the docs should state it only works on 2D grayscale images, so the user needs to convert their color image beforehand. - Document the valid ranges of values for the low threshold, and high threshold. - There is a lot of use of np.logical_* functions. These are around 20% slower than numpy's overloaded bitwise & and | operators. It seems these logical_* functions are being applied to boolean images so the two operations (logical and bitwise) are equivalent. I would use the faster of the two. It's faster and easier to read. - ndimage has a sobel function, was there a particular reason you chose to do a 2d convolution instead? 2 separable 1d convolutions as done by ndimage.sobel should be faster than a 2d convolution (especially since ndimage does optimizing cache manipulations under the covers). - Does the mask operate over any generic area, or is the mask expected to be rectangular? If it's a rectangular mask, is there really a need for it when the user could just pass in a slice to their image instead. - A minor pedantic gripe; I like to have spaces after commas in array indices and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2), arr[:,5:6]. A space immediately after a comma is recommended in PEP8. - Variable naming. Lots of single and two letter variables like 'c', 'c1, 'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names. - For computing magntitude, use np.hypot(isobel, jsobel), instead of the manual computation you use. np.hypot is 2x faster on a 1000x1000 image since it doesn't create any temporaries. - inline the smoothing the function, unless you're using it elsewhere. Lastly, thanks for working and spending time on this code! Cheers! Chris 2011/4/2 Dan Farmer <dfarmernv@gmail.com>
I've pushed the suggested changes :
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks, Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
I can answer two of these, quickly. The mask can be arbitrary. The smooth function should probably be kept separate, since in the future it will probably be used by other CellProfiler-based functions. Ray Jones On Tue, Apr 5, 2011 at 04:56, Chris Colbert <sccolbert@gmail.com> wrote:
Hey Dan, I just had a look at this code, and there are some things I would do before pulling in the changes:
Document which dtype is expected of the input image. It appears to work for floats, int, uint8 etc, but each gives a different output result. Since the tests are using floats, I assume a floating point image is expected. This should be documented in the docstring along with the expected range of the image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image. Also, the docs should state it only works on 2D grayscale images, so the user needs to convert their color image beforehand. Document the valid ranges of values for the low threshold, and high threshold. There is a lot of use of np.logical_* functions. These are around 20% slower than numpy's overloaded bitwise & and | operators. It seems these logical_* functions are being applied to boolean images so the two operations (logical and bitwise) are equivalent. I would use the faster of the two. It's faster and easier to read. ndimage has a sobel function, was there a particular reason you chose to do a 2d convolution instead? 2 separable 1d convolutions as done by ndimage.sobel should be faster than a 2d convolution (especially since ndimage does optimizing cache manipulations under the covers). Does the mask operate over any generic area, or is the mask expected to be rectangular? If it's a rectangular mask, is there really a need for it when the user could just pass in a slice to their image instead. A minor pedantic gripe; I like to have spaces after commas in array indices and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2), arr[:,5:6]. A space immediately after a comma is recommended in PEP8. Variable naming. Lots of single and two letter variables like 'c', 'c1, 'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names. For computing magntitude, use np.hypot(isobel, jsobel), instead of the manual computation you use. np.hypot is 2x faster on a 1000x1000 image since it doesn't create any temporaries. inline the smoothing the function, unless you're using it elsewhere.
Lastly, thanks for working and spending time on this code! Cheers! Chris
2011/4/2 Dan Farmer <dfarmernv@gmail.com>
I've pushed the suggested changes :
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks, Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
Thanks for the detailed feedback. I've pushed some more changes that I think cover everything. I left the mask and smoothing function for the moment based on Thouis feedback. https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca... -Dan On Mon, Apr 4, 2011 at 7:56 PM, Chris Colbert <sccolbert@gmail.com> wrote:
Hey Dan, I just had a look at this code, and there are some things I would do before pulling in the changes:
Document which dtype is expected of the input image. It appears to work for floats, int, uint8 etc, but each gives a different output result. Since the tests are using floats, I assume a floating point image is expected. This should be documented in the docstring along with the expected range of the image. i.e. is it a 0.0 - 1.0 float image or a 0.0 - 255.0 float image. Also, the docs should state it only works on 2D grayscale images, so the user needs to convert their color image beforehand. Document the valid ranges of values for the low threshold, and high threshold. There is a lot of use of np.logical_* functions. These are around 20% slower than numpy's overloaded bitwise & and | operators. It seems these logical_* functions are being applied to boolean images so the two operations (logical and bitwise) are equivalent. I would use the faster of the two. It's faster and easier to read. ndimage has a sobel function, was there a particular reason you chose to do a 2d convolution instead? 2 separable 1d convolutions as done by ndimage.sobel should be faster than a 2d convolution (especially since ndimage does optimizing cache manipulations under the covers). Does the mask operate over any generic area, or is the mask expected to be rectangular? If it's a rectangular mask, is there really a need for it when the user could just pass in a slice to their image instead. A minor pedantic gripe; I like to have spaces after commas in array indices and tuples. i.e. this: (1, 2), arr[:, 5:6], instead of: (1,2), arr[:,5:6]. A space immediately after a comma is recommended in PEP8. Variable naming. Lots of single and two letter variables like 'c', 'c1, 'c2', 'w', 'm', 'cc', etc... Let's give these descriptive names. For computing magntitude, use np.hypot(isobel, jsobel), instead of the manual computation you use. np.hypot is 2x faster on a 1000x1000 image since it doesn't create any temporaries. inline the smoothing the function, unless you're using it elsewhere.
Lastly, thanks for working and spending time on this code! Cheers! Chris
2011/4/2 Dan Farmer <dfarmernv@gmail.com>
I've pushed the suggested changes :
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
Thanks, Dan
2011/3/31 Dan Farmer <dfarmernv@gmail.com>:
I will try to address your comments tonight and also see about removing smooth.py and just incorporating the one function we're using into canny.py for now as Thouis suggested. Thanks to both of you for looking at it. I'll message back when I've pushed the changes.
-Dan
2011/3/31 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Thu, Mar 31, 2011 at 1:02 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I read through your patch and made some preliminary comments.
Mostly just trying to follow procedure. I already mentioned my concerns in the previous thread. I made one stab at introduced a "None" default for the mask, but I got hung up and reverted it. The default I was going to propose was np.ones(img.shape,bool) (and after the fact I even noticed that's how it is used in one of the unit tests). But I started thinking that that could be quite wasteful of memory if you were working with large images (on my test use case with ~512x512 images it's about 300 KB for the "fake" mask).
It seems as though this specific implementation of the algorithms relies on creating the mask, so I don't think you can get away from it. The typical way to do it would be:
def canny(..., mask=None, ...): if mask is None: mask = np.ones(x.shape, dtype=bool)
The problem I had was that if I don't allocate the emask array I get run-time errors starting at line 129 (in the diff of canny.py) because the arrays all have different lengths if they aren't logical_and'd with emask above.
Yes, I think the only way to avoid allocating the mask explicitly is to rewrite the algorithm in Cython, where you can modify behaviour inside the for-loop.
Regards Stéfan
Hi Dan On Tue, Apr 5, 2011 at 7:51 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
Thanks for the detailed feedback. I've pushed some more changes that I think cover everything. I left the mask and smoothing function for the moment based on Thouis feedback.
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I think we're almost ready to pull! Some last nitpicks: - The docstring format requires indentation of items, e.g. image : array The image to smooth - PEP8 suggests not indenting equal signs to be aligned: not_mask = np.logical_not(mask) bleed_over = function(mask.astype(float)) ... With these and Chris's changes, I think we're good to go! Thanks a lot for your effort. Cheers Stéfan
Ok, I've pushed those changes. Thanks, Dan 2011/4/5 Stéfan van der Walt <stefan@sun.ac.za>:
Hi Dan
On Tue, Apr 5, 2011 at 7:51 AM, Dan Farmer <dfarmernv@gmail.com> wrote:
Thanks for the detailed feedback. I've pushed some more changes that I think cover everything. I left the mask and smoothing function for the moment based on Thouis feedback.
https://github.com/dfarmer/scikits.image/compare/master...dfarmer-filters-ca...
I think we're almost ready to pull!
Some last nitpicks:
- The docstring format requires indentation of items, e.g.
image : array The image to smooth
- PEP8 suggests not indenting equal signs to be aligned:
not_mask = np.logical_not(mask) bleed_over = function(mask.astype(float)) ...
With these and Chris's changes, I think we're good to go!
Thanks a lot for your effort.
Cheers Stéfan
Hi Dan 2011/4/7 Dan Farmer <dfarmernv@gmail.com>:
Ok, I've pushed those changes.
Thanks so much for all your effort! I've merged your branch. I had to make one small modification, shown here: https://github.com/stefanv/scikits.image/commit/97806c76f4c0b4fa6db00cf39d01... Cheers Stéfan
participants (5)
-
Chris Colbert
-
Dan Farmer
-
Stéfan van der Walt
-
Thouis (Ray) Jones
-
Thouis (Ray) Jones