Hello Juan!

Thank you for your reply! I am sorry about the technical problem. Google told me that I am signed up for this group, so I did not realize there was a problem. I hope this message will be recognized as coming from a member.

I really appreciate your tips and experience. However, I have one concern about using only intensity/color. I have several images where the cell and the object are very lightly stained, and others where objects that I don't want to detect are very darkly stained. That's why I used HOG (the object I am looking for always has a kind of finger-like structure). I am giving Lab features a try at the moment and I will see :-)

Thanks a lot for the cross-validation tip and for the advice on how many images to use; this was very helpful.

Cheers,
Stefanie


On Wednesday, 22 April 2015 12:42:25 UTC+2, Juan Nunez-Iglesias wrote:
Hello!

Firstly, please sign up to the mailing list before posting — if you don't, every post from you has to be manually filtered through.

On to your problem!

So, it looks like there should be plenty of signal to distinguish between object/no-object. It's key to understand the features you're using. HOG may not be appropriate here: it measures gradients, not image intensity/color. In this case, it looks like there will be many more dark pixels in the object images. What I would do, based on the examples you showed, is take the Lab-transformed image, compute a histogram of each channel, and use the histogram as the feature vector.
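
A minimal sketch of the Lab-histogram feature described above. The function name `lab_histogram`, the choice of 16 bins per channel, the nominal channel ranges, and the per-channel normalization are all assumptions made for illustration, not something fixed by this thread. It uses `skimage.color.rgb2lab` and `numpy.histogram`.

```python
import numpy as np
from skimage import color


def lab_histogram(rgb_image, bins=16):
    """Return concatenated per-channel histograms of the Lab-transformed image.

    `bins` and the channel ranges below are illustrative choices (assumptions).
    """
    lab = color.rgb2lab(rgb_image)
    # Nominal ranges: L in [0, 100]; a and b roughly in [-128, 127].
    ranges = [(0, 100), (-128, 127), (-128, 127)]
    features = []
    for channel, value_range in enumerate(ranges):
        counts, _ = np.histogram(lab[..., channel], bins=bins, range=value_range)
        # Normalize so that images of different sizes give comparable vectors.
        features.append(counts / counts.sum())
    return np.concatenate(features)
```

Calling `lab_histogram(image)` on an RGB image gives a vector of length `3 * bins`; stacking these vectors for all images gives the feature matrix for the classifier.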

You have a lot of labelled images, so use them! I would split your set into 40k training / 10k test, then do 4-fold cross-validation on the training set. scikit-learn has nice classes for doing cross-validation automatically.

As to the choice of classifier, it might be worth asking the scikit-learn list, but *by far* the easiest to use "out-of-the-box", without fiddling with parameters, is the Random Forest.
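
A rough sketch of the workflow above: hold out a test set, run 4-fold cross-validation on the training portion, and use a Random Forest with near-default settings. The data here is synthetic stand-in data (an assumption), and the 80/20 split only mirrors the 40k/10k proportion. In current scikit-learn releases the helpers live in `sklearn.model_selection`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption): replace X with the real feature
# vectors (one row per image) and y with the object/no-object labels.
rng = np.random.default_rng(0)
X = rng.random((500, 48))
y = (X[:, 0] > 0.5).astype(int)

# Hold out 20% as the final test set, mirroring a 40k / 10k split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# 4-fold cross-validation on the training set only.
cv_scores = cross_val_score(clf, X_train, y_train, cv=4)
print("cross-validation accuracy per fold:", cv_scores)

# Fit on the full training set and evaluate once on the held-out test set.
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print("held-out test accuracy:", test_accuracy)
```

Keeping the test set out of the cross-validation loop means the final score is measured on images that played no part in choosing features or parameters.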

Hope that helped!

Juan.




On Wed, Apr 22, 2015 at 8:21 PM, Snowflake <lue...@gmail.com> wrote:

Hi!

I am new to machine learning and I need some help.

I want to detect objects inside cells in microscopy images. I have a lot of annotated images (approx. 50,000 images with an object and 500,000 without an object).

So far I have tried extracting features using HOG and classifying with logistic regression and LinearSVC. I have tried several parameters for HOG and several color spaces (RGB, HSV, Lab), but I don't see a big difference; the prediction rate is about 70 %.

I have several questions. How many images should I use to train the classifier? How many images should I use to test the prediction?

I have tried training with about 1000 images, which gives me 55 % positive, and with 5000, which gives me about 72 % positive. However, it also depends a lot on the test set; sometimes a test set can reach 80-90 % positively detected images.

Here are two examples containing an object and two images without an object:

[Image: Object01]

Another problem is, sometimes the images contain several objects:


Should I try to increase the number of examples in the training set? How should I choose the images for the training set, just at random? What else could I try?

Any help or tips would be much appreciated. Thank you very much in advance!



--
You received this message because you are subscribed to the Google Groups "scikit-image" group.