Machine learning for PU data
Hi All,
I am a scikit-learn user and have a question for the community: has anyone applied any of the machine learning algorithms available in the scikit-learn package to data with only positive and unlabeled (PU) classes? If so, would you share some insight with me? I understand this is a broad topic, but I am new to analyzing PU data and could use some help.
Thanks, Ruchika
Hello Ruchika,
I don't think scikit-learn currently has algorithms that can train on positive and unlabeled class labels only. However, you could try one of the following compatible wrappers (I haven't tried them myself):
- http://nktmemo.github.io/jekyll/update/2015/11/07/pu_classification.html
- https://github.com/scikit-learn/scikit-learn/pull/371
Also, you could try a one-class SVM, as suggested here: https://stackoverflow.com/questions/25700724/binary-semi-supervised-classification-with-positive-only-and-unlabeled-data-set
-- Roman (replying to Ruchika Nayyar's message of 30/06/17 16:06)
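To give a rough idea of what a PU wrapper around a scikit-learn classifier can look like, here is a minimal sketch of the Elkan and Noto calibration idea: fit a classifier to separate labeled from unlabeled samples, estimate the labeling frequency c on held-out labeled positives, then rescale the scores by c. This is not the code behind either link above. The synthetic data, the choice of LogisticRegression, the 0.5 cut-off and all variable names are assumptions made for illustration, and the method itself assumes the labeled positives are a random sample of all positives.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Synthetic data (assumption): true positives near (2, 2), true negatives near (-2, -2).
X = np.vstack([
    rng.normal(2.0, 1.0, size=(500, 2)),   # true positives
    rng.normal(-2.0, 1.0, size=(500, 2)),  # true negatives
])
y_true = np.r_[np.ones(500), np.zeros(500)]  # hidden from the model, used only to evaluate

# s = 1 means "labeled positive", s = 0 means "unlabeled".
# Only a random 30% of the true positives receive a label.
s = np.zeros(1000, dtype=int)
s[rng.choice(500, size=150, replace=False)] = 1

# Step 1: fit g(x), an estimate of P(s = 1 | x), on a training split.
X_tr, X_ho, s_tr, s_ho = train_test_split(
    X, s, test_size=0.25, stratify=s, random_state=0
)
g = LogisticRegression().fit(X_tr, s_tr)

# Step 2: estimate c = P(s = 1 | y = 1) as the mean of g(x)
# over the held-out labeled positives.
c_hat = g.predict_proba(X_ho[s_ho == 1])[:, 1].mean()

# Step 3: rescale, P(y = 1 | x) is approximated by g(x) / c, clipped to [0, 1].
p_pos = np.clip(g.predict_proba(X)[:, 1] / c_hat, 0.0, 1.0)
y_pred = (p_pos >= 0.5).astype(int)

accuracy = (y_pred == y_true).mean()
print("estimated c:", c_hat)
print("accuracy against the hidden true labels:", accuracy)
```

On real PU data there are no true labels to check against, so the accuracy line only makes sense in a simulation like this one.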
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
Hello,
As Roman mentioned, you can try scikit-learn's one-class algorithms, such as OneClassSVM, IsolationForest, LocalOutlierFactor (with its private predict method) or EllipticEnvelope.
Hope this helps,
Nicolas (replying to Roman Yurchak's message of Fri, Jun 30, 2017 at 3:39 PM)
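A minimal sketch of the one-class approach: fit each estimator on the labeled positives only, then call predict on the unlabeled samples (+1 means the sample looks like the positives, -1 means it is flagged as an outlier). The synthetic data and hyperparameters are assumptions for illustration. In scikit-learn 0.20 and later, LocalOutlierFactor exposes a public predict when constructed with novelty=True, which is used here instead of the private method.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Synthetic data (assumption): labeled positives cluster around the origin.
X_pos = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Unlabeled set: a mix of hidden positives and hidden negatives.
X_unl = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 2)),  # hidden positives
    rng.normal(loc=6.0, scale=1.0, size=(100, 2)),  # hidden negatives
])

models = {
    "OneClassSVM": OneClassSVM(kernel="rbf", gamma="scale", nu=0.1),
    "IsolationForest": IsolationForest(random_state=0),
    "EllipticEnvelope": EllipticEnvelope(random_state=0),
    # novelty=True is required to call predict on data not seen during fit
    "LocalOutlierFactor": LocalOutlierFactor(novelty=True),
}

predictions = {}
for name, model in models.items():
    model.fit(X_pos)                          # train on positives only
    predictions[name] = model.predict(X_unl)  # +1 = inlier, -1 = outlier
    share = (predictions[name] == 1).mean()
    print(name, "share of unlabeled samples predicted positive:", share)
```

This treats the problem as novelty detection and ignores the unlabeled data during training, so it tends to work best when the positives form a compact region in feature space.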
participants (3)
- Nicolas Goix
- Roman Yurchak
- Ruchika Nayyar