[scikit-learn] NB-SVM Implementation

Olivier Grisel olivier.grisel at ensta.org
Tue Jun 7 04:11:46 EDT 2016


I think it could be implemented as a preprocessing step. This is the
approach followed by:
https://github.com/ryankiros/skip-thoughts/blob/master/eval_classification.py

Note that in that case LogisticRegression is used as the final
classifier instead of a squared hinge loss SVM, but that should not
change much in practice.

If you want to make this approach scikit-learn compatible (to work
with the Pipeline and sklearn's model selection tools, for instance),
be sure to implement the Transformer API as documented here:

http://scikit-learn.org/dev/developers/contributing.html#apis-of-scikit-learn-objects
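
To make the idea concrete, here is a minimal sketch of what such a
transformer could look like. It is not an existing scikit-learn class:
the name NBLogCountRatioScaler, the alpha default and the toy corpus
are all made up for illustration. It assumes a binary problem with
non-negative count features, computes the smoothed naive Bayes
log-count ratio per feature as I understand it from the NBSVM paper,
and rescales the features by that ratio. The interpolation step of the
full NBSVM model is left out.

```python
import numpy as np
from scipy import sparse
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


class NBLogCountRatioScaler(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: scale features by the NB log-count ratio.

    Sketch only: binary classification, non-negative count-like features.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing; the default is an assumption

    def fit(self, X, y):
        y = np.asarray(y)
        classes = np.unique(y)
        if classes.shape[0] != 2:
            raise ValueError("this sketch only handles two classes")
        self.classes_ = classes
        neg_rows = np.flatnonzero(y == classes[0])
        pos_rows = np.flatnonzero(y == classes[1])
        # Smoothed per-feature count sums for each class.
        p = self.alpha + np.asarray(X[pos_rows].sum(axis=0)).ravel()
        q = self.alpha + np.asarray(X[neg_rows].sum(axis=0)).ravel()
        # Log ratio of the L1-normalized count vectors.
        self.ratio_ = np.log((p / p.sum()) / (q / q.sum()))
        return self

    def transform(self, X):
        if sparse.issparse(X):
            # Column-wise scaling that keeps the matrix sparse.
            return (X @ sparse.diags(self.ratio_)).tocsr()
        return np.asarray(X) * self.ratio_


# Toy usage: binarized counts -> NB scaling -> linear classifier.
docs = [
    "good great film",
    "great acting good plot",
    "bad boring film",
    "boring plot bad acting",
]
labels = [1, 1, 0, 0]

model = make_pipeline(
    CountVectorizer(binary=True),
    NBLogCountRatioScaler(alpha=1.0),
    LogisticRegression(),
)
model.fit(docs, labels)
```

Because the scaling lives in fit / transform, alpha can be tuned with
the usual model selection tools together with the classifier's C, and
LogisticRegression can be swapped for LinearSVC to get the squared
hinge loss variant.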

Please also read the rest of the contributors' guide:

http://scikit-learn.org/dev/developers

NBSVM is quite recent and might not strictly meet the criteria for
inclusion as stated in:

http://scikit-learn.org/stable/faq.html#can-i-add-this-new-algorithm-that-i-or-someone-else-just-published

It already has 163 citations though:

https://scholar.google.com/scholar?oi=bibs&hl=en&cites=1710642630990759287

As this is a really strong baseline, the model is not complex, and it
should blend well within the scikit-learn API, I would be +1 for
inclusion in sklearn.

-- 
Olivier

