[scikit-learn] purpose of test: check_classifiers_train

Thu Oct 12 14:27:22 EDT 2017

Thanks @andreas, for your comments, especially the info that it's the
`iris` dataset.  I have to dig a bit deeper to see what's going on with the
performance there.  But now that I know it's `iris`, I can try to recreate.
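
For reference, a minimal sketch of such a recreation, assuming (per the
comments below) that the data is `iris` and that the check amounts to
fitting, predicting on the same training data, and comparing accuracy to
0.83. The helper name `training_accuracy` is hypothetical, and
`LogisticRegression` is only a stand-in for the custom classifier; the
actual test may apply preprocessing that is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score


def training_accuracy(classifier, X, y):
    """Fit on (X, y) and score predictions on that same data.

    Hypothetical helper: it mirrors the idea of the failing assertion,
    not the exact body of check_classifiers_train.
    """
    classifier.fit(X, y)
    y_pred = classifier.predict(X)
    return accuracy_score(y, y_pred)


# Assumption from the thread: the dataset is iris (balanced, three classes).
X, y = load_iris(return_X_y=True)

# Chance level for a balanced problem is 1 / n_classes.
chance = 1.0 / np.unique(y).size

# Stand-in estimator; swap in the custom classifier here.
acc = training_accuracy(LogisticRegression(max_iter=1000), X, y)

print("chance level:      %.3f" % chance)
print("training accuracy: %.3f" % acc)
print("passes 0.83 threshold:", acc > 0.83)
```

An accuracy near the printed chance level, like the 0.313 in the
traceback, would suggest the predictions carry essentially no
information about the labels.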

-M

On Thu, Oct 12, 2017 at 12:01 AM, Andreas Mueller <t3kcit at gmail.com> wrote:

> Yes, it's pretty empirical, and with the estimator tags PR (
> https://github.com/scikit-learn/scikit-learn/pull/8022) we will be able
> to relax it if there's a good reason you're not passing.
> But the dataset is pretty trivial (iris), and you're getting chance
> performance (it's a balanced three-class problem, so chance is about 33%).
> So that is not a great sign for your estimator.
>
>
> On 10/11/2017 07:09 PM, Guillaume Lemaître wrote:
>
> I'm not 100% sure, but this is an integration/sanity check: all
> classifiers are supposed to predict quite well on the data used to train
> them. It is true that the 83% threshold is empirical, but it lets us spot
> any changes made to the algorithms even if the unit tests are passing for
> some reason.
>
> On 11 October 2017 at 18:52, Michael Capizzi <mcapizzi at email.arizona.edu>
> wrote:
>
>> I’m wondering if anyone can identify the purpose of this test:
>> check_classifiers_train(), specifically this line:
>> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1106
>>
>> My custom classifier (which I’m hoping to submit to scikit-learn-contrib)
>> is failing this test:
>>
>>   File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train
>>     assert_greater(accuracy_score(y, y_pred), 0.83)
>> AssertionError: 0.31333333333333335 not greater than 0.83
>>
>> And while it’s disturbing that my classifier is getting 31% accuracy
>> when, clearly, the test writer expects it to be in the upper-80s, I’m not
>> sure I understand why that would be a test condition.
>>
>> Thanks for any insight.
>> 
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>
>
> --
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/