[scikit-learn] purpose of test: check_classifiers_train

Andreas Mueller t3kcit at gmail.com
Thu Oct 12 03:01:47 EDT 2017


Yes, it's pretty empirical, and with the estimator tags PR 
(https://github.com/scikit-learn/scikit-learn/pull/8022) we will be able 
to relax it if there's a good reason you're not passing.
But the dataset is pretty trivial (iris), and you're getting chance 
performance (it's a balanced three-class problem, so chance is about 
33%). So that is not a great sign for your estimator.
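
To make the numbers concrete, here is a minimal sketch of the comparison 
being made. It is not the code in estimator_checks.py; the helper names 
(accuracy, chance_level, diagnose) are hypothetical, and the labels are 
synthetic: 50 samples for each of three classes, which is the shape of 
iris. Only the 0.83 threshold comes from the assertion quoted below.

```python
from collections import Counter

THRESHOLD = 0.83  # value from the assertion quoted in this thread


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def chance_level(y_true):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(y_true)
    return max(counts.values()) / len(y_true)


def diagnose(y_true, y_pred, threshold=THRESHOLD):
    """Label a training-set score relative to the threshold and to chance."""
    acc = accuracy(y_true, y_pred)
    if acc > threshold:
        return "pass"
    if acc <= chance_level(y_true):
        return "at or below chance"
    return "below threshold"


# Balanced three-class labels, 50 samples per class (assumed, iris-shaped).
y = [0] * 50 + [1] * 50 + [2] * 50
# A constant predictor gets exactly one class right: 50 / 150.
y_const = [0] * 150

print(chance_level(y))       # majority-class baseline, 1/3 here
print(diagnose(y, y_const))  # a constant predictor is at chance
print(diagnose(y, y))        # perfect predictions clear the threshold
```

The reported score of 0.3133 is slightly under the 1/3 baseline, so a 
check like diagnose would put it in the "at or below chance" bucket 
rather than merely "below threshold". That usually points to a bug in 
fit or predict (for example in how labels are mapped) rather than to a 
weak model.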

On 10/11/2017 07:09 PM, Guillaume Lemaître wrote:
> I'm not 100% sure, but this is an integration/sanity check: all 
> classifiers are supposed to predict quite well on the data used to 
> train them. It is true that the 83% threshold is empirical, but it lets 
> us spot changes to the algorithms even when the unit tests still pass 
> for some reason.
>
> On 11 October 2017 at 18:52, Michael Capizzi 
> <mcapizzi at email.arizona.edu> wrote:
>
>     I’m wondering if anyone can identify the purpose of this test:
>     |check_classifiers_train()|, specifically this line:
>     https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1106
>
>     My custom classifier (which I’m hoping to submit to
>     |scikit-learn-contrib|) is failing this test:
>
>     File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train
>         assert_greater(accuracy_score(y, y_pred), 0.83)
>     AssertionError: 0.31333333333333335 not greater than 0.83
>
>     And while it’s disturbing that my classifier is getting 31%
>     |accuracy| when, clearly, the test writer expects it to be in the
>     upper-80s, I’m not sure I understand why that would be a test
>     condition.
>
>     Thanks for any insight.
>
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
>     <https://mail.python.org/mailman/listinfo/scikit-learn>
>
> -- 
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/
>
>
