seeking advice on HoG applicability

Mon Jul 8 12:04:52 EDT 2013

Thanks!

Good point about the soil and roots. I've been using the entire images so 
far, but maybe I should pre-filter them, and keep only the "greenish" 
areas, before doing any feature extraction.
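
A minimal sketch of what such a pre-filter might look like, using only NumPy. The excess-green index (2g - r - b on normalized channels) is just one plausible definition of "greenish". The names `green_mask` and `keep_green` and the 0.05 threshold are illustrative assumptions that would need tuning on the actual moss images.

```python
import numpy as np

def green_mask(rgb, threshold=0.05):
    """Boolean mask of 'greenish' pixels via the excess-green index 2g - r - b."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1)
    total[total == 0] = 1.0  # avoid division by zero on pure black pixels
    # Normalize each channel by the pixel's total intensity.
    r, g, b = (rgb[..., i] / total for i in range(3))
    return (2 * g - r - b) > threshold  # threshold is an assumed starting value

def keep_green(rgb, threshold=0.05):
    """Return a copy of the image with non-green pixels zeroed, plus the mask."""
    mask = green_mask(rgb, threshold)
    out = np.asarray(rgb).copy()
    out[~mask] = 0
    return out, mask
```

The masked image (or the mask itself) could then be passed to feature extraction in place of the full image, so soil and roots no longer contribute gradients.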

I have visualized the output, and it looks like the gradients mostly go 
across leaf boundaries. That may be too fine-grained, since it's probably 
the moss/background boundary that is most important for this problem. Maybe 
I should actually segment (or skeletonize?) the image to get rid of the 
internal detail.
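
For the segmentation idea, one rough way to discard internal detail is to reduce each image to a boolean foreground mask and keep only the pixels that touch the background. The helper below is a hypothetical NumPy sketch (the name `silhouette_boundary` and the 4-neighbour rule are assumptions), not a tested pipeline for this data.

```python
import numpy as np

def silhouette_boundary(mask):
    """Foreground pixels that touch the background (4-connectivity)."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so pixels on the image edge count as boundary.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior if all four of its neighbours are foreground.
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1]
        & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return mask & ~interior
```

Computing HoG on the filled mask, or on this outline, rather than on the raw image should in principle leave gradients only along the moss/background boundary. Small holes inside the mask would still produce edges, so some cleanup of the mask is probably needed first.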

-Lisa

On Sunday, July 7, 2013 8:00:06 PM UTC-4, Josh Warner wrote:
>
> Hi Lisa,
>
> That's an interesting problem! Off the top of my head, I have a couple 
> questions:
>
> Are you making any attempt to mask the HoG features to the above-ground, 
> green regions? The example images have the entire root structure shown, yet 
> the visual classification you described is exclusive to the greenery. The 
> roots are not well separated from the soil, either, so that entire region 
> may be confounding your task. This would be true for any feature 
> algorithm, not just HoG. 
>
> Have you tried visualizing the HoG output with the `visualize` kwarg? This 
> could give you a sense for what HoG is actually extracting.
>
> It sounds like you may have a dimensionality problem thanks to high image 
> resolution, combined with a relatively low number of images to compare. 
> This can be partially addressed by tweaking HoG parameters (especially 
> `pixels_per_cell`, I believe) or scaling your images down to a uniform, 
> smaller size. In addition, scikit-learn has several dimensionality 
> reduction and feature selection tools, such as PCA, which can help reduce 
> the number of features to a manageable level.
>
> If I get the chance to directly play with your example pictures, I'll pop 
> back in with a few more thoughts.
>
> Good luck,
>
> Josh
>
> On Wednesday, July 3, 2013 2:10:19 PM UTC-5, Lisa Torrey wrote:
>>
>> Hi folks -
>>
>> I'm looking at an image classification problem, and wondering whether HoG 
>> should be applicable to it. If any of you are willing to take a look and 
>> give some advice, that would be wonderful. I have a machine learning 
>> background, but working with image data is a new area for me.
>>
>> The problem is to distinguish between two types of moss. Type 1 tends to 
>> consist of upright stalks with few or no branches. Type 2 tends to have 
>> secondary branches coming off the primary stalk. There's quite a bit of 
>> visual diversity within these types. I've linked some images below.
>>
>> Type 1:
>> http://myslu.stlawu.edu/~ltorrey/moss/andrea_rothii.jpg
>> http://myslu.stlawu.edu/~ltorrey/moss/mnium_spinulosum.jpg
>>
>> Type 2:
>> http://myslu.stlawu.edu/~ltorrey/moss/climacium_americanum.jpg
>> http://myslu.stlawu.edu/~ltorrey/moss/rhytidiadelphus_triquetrus.jpg
>>
>> When I came across the Dalal paper, I thought my problem might have 
>> something in common with the pedestrian detection problem, so I tried 
>> extracting HoG features and feeding them into an SVM classifier. This 
>> failed miserably - the SVM does no better than random guessing. I'm now 
>> trying to weigh potential reasons.
>>
>> The first possible reason on my list is the diversity among mosses of the 
>> same type. There isn't necessarily a "type 1 shape" and a "type 2 shape," 
>> at least not to the degree that there's a "pedestrian shape." Perhaps this 
>> means HoG isn't really the right approach to my problem after all?
>>
>> Other reasons may include:
>> - I have much less data. (Just 77 positives and 78 negatives, compared to 
>> Dalal's 1239 and 12180.)
>> - My images aren't all the same size, like the pedestrian images are. 
>> (I'm not sure if this would matter?)
>> - My images are much higher resolution. (I've been downscaling them by a 
>> factor of 8, but the feature vectors are still enormous.)
>> - I'm just using default parameters so far. (In the absence of any 
>> signal, tweaking seems unproductive.)
>>
>> Any thoughts or suggestions would be welcome!
>>
>> -Lisa
>>
>>