[scikit-learn] MLPClassifier
andreas heiner
ap.heiner at gmail.com
Fri Jan 12 11:40:21 EST 2018
Hi,
I am trying to apply MLPClassifier to a subset (100 data points, 2 classes)
of the 20 newsgroups dataset. I created (OK, copied) the following pipeline:
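from sklearn import metrics
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# twenty_train / twenty_test: the 20 newsgroups subsets, loaded beforehand
model_MLP = Pipeline([('vect', CountVectorizer()),
                      ('tfidf', TfidfTransformer()),
                      ('model_MLP', MLPClassifier(solver='lbfgs',
                                                  alpha=1e-5,
                                                  hidden_layer_sizes=(5, 2),
                                                  random_state=1)),
                      ])

# Fit on the training texts, then predict labels for the test texts
model_MLP.fit(twenty_train.data, twenty_train.target)
predicted_MLP = model_MLP.predict(twenty_test.data)
print(metrics.classification_report(twenty_test.target, predicted_MLP,
                                    target_names=twenty_test.target_names))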
The numbers I get are hopeless:

                 precision    recall  f1-score   support

    alt.atheism       0.00      0.00      0.00        34
sci.electronics       0.66      1.00      0.80        66
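
These figures are what one would expect if the classifier labelled every test document sci.electronics. A quick arithmetic check of that reading, using only the support counts from the report (the "predicts one class for everything" premise is an assumption, not something the output states directly):

```python
# Counts taken from the "support" column of the report above.
n_atheism, n_electronics = 34, 66

# Assumption: the classifier labels every test document sci.electronics.
tp = n_electronics   # all true sci.electronics documents are labelled correctly
fp = n_atheism       # every alt.atheism document is mislabelled as sci.electronics
fn = 0               # no sci.electronics document is missed

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 2), round(recall, 2), round(f1, 2))
```

The rounded values match the sci.electronics row, and the alt.atheism row is all zeros because that class is never predicted.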
The only explanation I can think of is that the vocabularies of the training
and test sets differ (test set: 5204 words, training set: 5402 words). That
should not be a problem (if I understand Bayes correctly), but the results
are clearly rubbish (see the numbers above).
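
On the differing word counts: in a Pipeline, fit learns the CountVectorizer vocabulary from the training texts only, and predict merely calls transform on the test texts, so both sets end up with the same feature columns. Here is a minimal sketch of that behaviour. The toy documents are hypothetical stand-ins, not the 20 newsgroups data:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

# Hypothetical toy texts standing in for twenty_train.data / twenty_test.data.
train_docs = ["the circuit needs a resistor", "god and belief"]
test_docs = ["a new capacitor for the circuit", "belief without proof"]

features = Pipeline([('vect', CountVectorizer()),
                     ('tfidf', TfidfTransformer())])

X_train = features.fit_transform(train_docs)  # vocabulary is learned here only
X_test = features.transform(test_docs)        # reuses the training vocabulary

vocab = features.named_steps['vect'].vocabulary_
n_train_features = X_train.shape[1]
n_test_features = X_test.shape[1]
unseen_kept = 'capacitor' in vocab            # word that occurs only in the test texts

print(n_train_features, n_test_features, unseen_kept)
```

Words that appear only in the test texts are dropped at transform time, so a vocabulary mismatch of this kind does not by itself change the input dimension the classifier sees.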
The same setup with the SVD routine works great: all values are around 0.95.
Thanks,
Andreas