[scikit-learn] Fwd: sample_weight parameter is not split when used in GridSearchCV

Andreas Mueller t3kcit at gmail.com
Tue Jun 27 13:43:44 EDT 2017


We could clarify in the documentation that you can grid-search any
(hyper)parameter of a model, but not parameters to fit?
Only the values returned by get_params() can be tuned.
Only "param_grid" will be searched, not "fit_params"; "fit_params" can
contain only a single setting.
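
For illustration, here is a minimal sketch of that distinction and of the
workaround suggested further down the thread (one search per weight vector).
The Ridge estimator, the synthetic data, and the names candidate_weights and
results are assumptions made for this example, not part of the original
report. It also assumes a recent scikit-learn release, where fit parameters
are passed to fit() rather than through the fit_params constructor argument
used in 2017.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (assumption: stands in for the poster's x, y).
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.randn(200)

# Hypothetical weight vectors to compare, one entry per training sample.
base_weights = rng.rand(200) + 0.5
candidate_weights = {"w": base_weights, "w_squared": base_weights ** 2}

estimator = Ridge()

# Only names returned by get_params() are valid keys for param_grid;
# sample_weight is an argument of fit(), so it is not among them.
tunable = sorted(estimator.get_params())

# Workaround: run one grid search per weight vector and compare afterwards.
results = {}
for name, weights in candidate_weights.items():
    search = GridSearchCV(estimator, param_grid={"alpha": [0.1, 1.0, 10.0]}, cv=3)
    # A single sample_weight setting per search, passed as a fit parameter.
    search.fit(X, y, sample_weight=weights)
    results[name] = (search.best_params_, search.best_score_)

for name, (params, score) in results.items():
    print(name, params, round(score, 4))
```

Each search sees a single sample_weight setting, which GridSearchCV slices
per fold along with X and y. By default the score itself is unweighted, so
comparing scores across weightings is only a rough guide.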



On 06/26/2017 03:17 AM, Joel Nothman wrote:
> I don't think we'll be accepting a pull request adding this feature to 
> scikit-learn. It is too niche. But you should go ahead and modify the 
> search to operate over weightings for your own research. If you feel 
> the documentation can be clarified, a pull request there is welcome.
>
> On 26 June 2017 at 16:43, Manuel CASTEJÓN LIMAS <mcasl at unileon.es> wrote:
>
>     Yes, I guess most users will be happy without using weights. Some
>     will need to use one single vector, but I am currently researching
>     a weighting method, hence my need to evaluate multiple weight vectors.
>
>     I understand that it seems to be a very specific issue with a
>     simple workaround, most likely not worthy of any programming
>     effort yet, as there are more important issues to address.
>
>     I guess that adding a note on this behaviour to the documentation
>     would be great. If some parameters can be iterated over and others
>     cannot, knowing that gives the user base more solid ground.
>
>     I'm committed to spending a few hours studying the code. Should I be
>     successful, I will come back with a pull request.
>     I'll cross my fingers :-)
>     Best
>     Manolo
>
>
>
>     On 24 Jun 2017 at 20:05, "Julio Antonio Soto de Vicente"
>     <julio at esbet.es> wrote:
>
>         Joel is right.
>
>         In fact, you usually don't want to tune the sample weights
>         much: you may leave them at the default, set them in order to
>         balance classes, or fix them according to some business rule.
>
>         That said, you can always run a couple of grid searches,
>         changing the sample weights, and compare the results afterwards.
>
>         -- 
>         Julio
>
>         On 24 Jun 2017, at 15:51, Joel Nothman
>         <joel.nothman at gmail.com> wrote:
>
>>         Yes, trying multiple sample weightings is not supported by
>>         grid search directly.
>>
>>         On 23 Jun 2017 6:36 pm, "Manuel Castejón Limas"
>>         <manuel.castejon at gmail.com> wrote:
>>
>>             Dear Joel,
>>
>>             I tried and removed the square brackets and now it works
>>             as expected *for a single* sample_weight vector:
>>
>>             validator = GridSearchCV(my_Regressor,
>>                 param_grid={'number_of_hidden_neurons': range(4, 5),
>>                             'epochs': [50]},
>>                 fit_params={'sample_weight': my_sample_weights},
>>                 n_jobs=1)
>>             validator.fit(x, y)
>>
>>             The problem now is that I want to try multiple trainings
>>             with multiple sample_weight parameters, in the following
>>             fashion:
>>
>>             validator = GridSearchCV(my_Regressor,
>>                 param_grid={'number_of_hidden_neurons': range(4, 5),
>>                             'epochs': [50],
>>                             'sample_weight': [my_sample_weights,
>>                                               my_sample_weights**2]},
>>                 fit_params={},
>>                 n_jobs=1)
>>             validator.fit(x, y)
>>
>>             But unfortunately it produces the same error again:
>>
>>             ValueError: Found a sample_weight array with shape
>>             (1000,) for an input with shape (666, 1). sample_weight
>>             cannot be broadcast.
>>
>>             I guess the issue is that the sample_weight parameter
>>             was not meant to be changed during tuning, was it?
>>
>>
>>             Thank you all for your patience and support.
>>             Best
>>             Manolo
>>
>>
>>
>>
>>             2017-06-23 1:17 GMT+02:00 Manuel CASTEJÓN LIMAS
>>             <mcasl at unileon.es>:
>>
>>                 Dear Joel,
>>                 I'm just passing an iterable as I would do with any
>>                 other sequence of parameters to tune. In this case
>>                 the list only has one element to use but in general I
>>                 ought to be able to pass a collection of vectors.
>>                 Anyway, I guess that that issue is not the cause of
>>                 the problem.
>>
>>                 On 23 Jun 2017 at 1:04 a.m., "Joel Nothman"
>>                 <joel.nothman at gmail.com> wrote:
>>
>>                     Why are you passing [my_sample_weights] rather
>>                     than just my_sample_weights?
>>
>>
>>             _______________________________________________
>>             scikit-learn mailing list
>>             scikit-learn at python.org <mailto:scikit-learn at python.org>
>>             https://mail.python.org/mailman/listinfo/scikit-learn
>>             <https://mail.python.org/mailman/listinfo/scikit-learn>
