[scikit-learn] How to train an image classifier on directories

Abdul Abdul abdul.sw84 at gmail.com
Sun Jan 14 18:18:57 EST 2018


Hello,

I'm trying to train an image classifier, but I'm a bit confused about how to
label my data. The issue is that each class contains subdirectories, each of
which holds two images. So it is not the usual layout where each class
directory simply contains the images that belong to that class (e.g. cats
vs. dogs).

Below are some of my attempts at grouping the data, but I have not yet been
able to figure out how to assign the label and pass the pairs of images,
along with the label, to the image classifier.

This is how I read the two images:

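import cv2
# img_to_array is assumed here to be the Keras helper
from keras.preprocessing.image import img_to_array

im1 = cv2.imread('img1.jpg')
im1 = img_to_array(im1)

im2 = cv2.imread('img2.jpg')
im2 = img_to_array(im2)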

I then *pair* the images as follows:

pair = (im1,im2)
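
To make the pairing concrete, here is a minimal sketch of one alternative to a plain tuple: stacking the two images into a single NumPy array so that a pair becomes one sample. It assumes both images have the same shape; the random arrays and the 64x64 size are placeholders standing in for the output of cv2.imread, not my real data.

```python
import numpy as np

# Stand-ins for the two decoded images (hypothetical 64x64 RGB arrays);
# in practice these would come from cv2.imread / img_to_array.
rng = np.random.default_rng(0)
im1 = rng.random((64, 64, 3)).astype("float32")
im2 = rng.random((64, 64, 3)).astype("float32")

# Stack along a new leading axis so one sample holds both images.
pair = np.stack([im1, im2], axis=0)
print(pair.shape)  # (2, 64, 64, 3)
```

Stacking only works when the two images share a shape, so images of different sizes would have to be resized first.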

For labeling, this is what I did:

label = root.split(os.path.sep)[-2]
label = 1 if label == 'cat' else 0
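
As a self-contained sketch of the labeling step above: this assumes root is the subdirectory that holds one pair of images, in a hypothetical layout like dataset/<class>/<pair_dir>, so the class name is the second-to-last path component. The function name label_from_dir and the example paths are made up for illustration.

```python
import os

def label_from_dir(root, positive_class="cat"):
    """Return 1 if the parent (class) directory is `positive_class`, else 0.

    Assumes a hypothetical layout <dataset>/<class>/<pair_dir>, where
    `root` is the path of <pair_dir>.
    """
    parts = os.path.normpath(root).split(os.path.sep)
    class_name = parts[-2]
    return 1 if class_name == positive_class else 0

cat_dir = os.path.join("dataset", "cat", "pair_001")
dog_dir = os.path.join("dataset", "dog", "pair_001")
print(label_from_dir(cat_dir), label_from_dir(dog_dir))  # 1 0
```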

How can I group these pairs of images (im1, im2) and attach the label to
them? In particular, I want to pass them to the following scikit-learn
function:

(trainX, testX, trainY, testY) = train_test_split(data,
    labels, test_size=0.25, random_state=42)
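
To show the kind of input I have in mind for this call, here is a rough sketch, not a confirmed solution: each pair is stacked into one array and appended to data, with the matching label appended to labels at the same index. The random arrays, the 32x32 size, the count of 8 pairs, and the alternating labels are all placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

data, labels = [], []
for i in range(8):
    # Stand-ins for the two images of one subdirectory (hypothetical 32x32 RGB).
    im1 = rng.random((32, 32, 3))
    im2 = rng.random((32, 32, 3))
    data.append(np.stack([im1, im2], axis=0))  # one sample = one pair
    labels.append(i % 2)  # placeholder label: 1 = cat, 0 = other

data = np.array(data)      # shape (8, 2, 32, 32, 3)
labels = np.array(labels)  # shape (8,)

(trainX, testX, trainY, testY) = train_test_split(
    data, labels, test_size=0.25, random_state=42)
print(trainX.shape, testX.shape)
```

train_test_split splits along the first axis, so each pair stays together with its label. Most scikit-learn estimators expect 2D input, so something like data.reshape(len(data), -1) would be needed before calling fit on one of them.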

Thanks.