[scikit-learn] Fwd: sample_weight parameter is not split when used in GridSearchCV

Joel Nothman joel.nothman at gmail.com
Mon Jun 26 03:17:02 EDT 2017


I don't think we'll be accepting a pull request adding this feature to
scikit-learn. It is too niche. But you should go ahead and modify the
search to operate over weightings for your own research. If you feel the
documentation can be clarified, a pull request there is welcome.
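
One way to operate the search over weightings can be sketched as follows, assuming the candidate weightings are transformations of a single base vector (as with my_sample_weights and my_sample_weights**2 in the quoted message). A small wrapper estimator exposes the transformation as an ordinary hyperparameter, while the base weights go through fit so that they are split per fold. PowerWeightedRegressor, weight_power, the Ridge inner model and the synthetic data are illustrative assumptions, not part of scikit-learn or of the original code.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV


class PowerWeightedRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper: raises the fold's sample_weight to weight_power."""

    def __init__(self, base_estimator=None, weight_power=1.0):
        self.base_estimator = base_estimator
        self.weight_power = weight_power

    def fit(self, X, y, sample_weight=None):
        # Ridge is only a stand-in for the user's own regressor.
        base = Ridge() if self.base_estimator is None else self.base_estimator
        self.estimator_ = clone(base)
        if sample_weight is None:
            self.estimator_.fit(X, y)
        else:
            # sample_weight arrives already restricted to the training fold.
            w = np.asarray(sample_weight, dtype=float) ** self.weight_power
            self.estimator_.fit(X, y, sample_weight=w)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)


# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
x = rng.rand(1000, 1)
y = 3.0 * x.ravel() + rng.normal(scale=0.1, size=1000)
my_sample_weights = rng.rand(1000) + 0.5

validator = GridSearchCV(
    PowerWeightedRegressor(),
    param_grid={"weight_power": [1.0, 2.0]},  # w and w**2
    cv=3,
)
# Passed through fit, the base weights are indexed per fold with x and y.
validator.fit(x, y, sample_weight=my_sample_weights)
best_power = validator.best_params_["weight_power"]
```

This only covers weightings that can be derived from one base vector by a tunable rule; unrelated weight vectors would need a different approach, such as separate searches.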

On 26 June 2017 at 16:43, Manuel CASTEJÓN LIMAS <mcasl at unileon.es> wrote:

> Yes, I guess most users will be happy without using weights. Some will
> need to use one single vector, but I am currently researching a weighting
> method, hence my need to evaluate multiple weight vectors.
>
> I understand that it seems to be a very specific issue with a simple
> workaround, most likely not worth any programming effort yet, as there
> are more important issues to address.
>
> I guess that adding a note on this behaviour to the documentation would be
> great. If some parameters can be iterated over and others are not supported,
> knowing it gives the user base more solid ground.
>
> I'm committed to spending a few hours studying the code. Should I be
> successful, I will come back with a pull request.
> I'll cross my fingers :-)
> Best
> Manolo
>
>
>
> On 24 Jun 2017 at 20:05, "Julio Antonio Soto de Vicente" <julio at esbet.es>
> wrote:
>
> Joel is right.
>
> In fact, you usually don't want to tune the sample weights much: you may
> leave them at their defaults, set them in order to balance classes, or fix
> them according to some business rule.
>
> That said, you can always run a couple of grid searches, changing the
> sample weights each time, and compare the results afterwards.
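
The "couple of grid searches" approach can be sketched as a plain loop, one search per candidate weight vector. This is a minimal illustration under assumptions: Ridge, the alpha grid, the synthetic data and the candidate names are placeholders, not taken from the thread.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
x = rng.rand(300, 1)
y = 3.0 * x.ravel() + rng.normal(scale=0.1, size=300)
my_sample_weights = rng.rand(300) + 0.5

# Each candidate is a full-length weight vector (hypothetical names).
candidate_weights = {
    "w": my_sample_weights,
    "w_squared": my_sample_weights ** 2,
}

results = {}
for name, weights in candidate_weights.items():
    search = GridSearchCV(Ridge(), param_grid={"alpha": [0.1, 1.0, 10.0]}, cv=3)
    # Weights passed through fit are split per fold together with x and y.
    search.fit(x, y, sample_weight=weights)
    results[name] = (search.best_score_, search.best_params_)

# Pick the weighting whose best cross-validated score is highest.
best_name = max(results, key=lambda name: results[name][0])
```

With the default scoring used here the validation score itself is unweighted, so the scores of the different searches are measured on the same scale and can be compared directly.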
>
> --
> Julio
>
> On 24 Jun 2017, at 15:51, Joel Nothman <joel.nothman at gmail.com>
> wrote:
>
> Yes, trying multiple sample weightings is not supported by grid search
> directly.
>
> On 23 Jun 2017 6:36 pm, "Manuel Castejón Limas" <manuel.castejon at gmail.com>
> wrote:
>
>> Dear Joel,
>>
>> I tried removing the square brackets and now it works as expected *for
>> a single* sample_weight vector:
>>
>> validator = GridSearchCV(my_Regressor,
>>                          param_grid={'number_of_hidden_neurons': range(4, 5),
>>                                      'epochs': [50],
>>                                      },
>>                          fit_params={'sample_weight': my_sample_weights},
>>                          n_jobs=1,
>>                          )
>> validator.fit(x, y)
>>
>> The problem now is that I want to try multiple trainings with multiple
>> sample_weight parameters, in the following fashion:
>>
>> validator = GridSearchCV(my_Regressor,
>>                          param_grid={'number_of_hidden_neurons': range(4, 5),
>>                                      'epochs': [50],
>>                                      'sample_weight': [my_sample_weights,
>>                                                        my_sample_weights**2],
>>                                      },
>>                          fit_params={},
>>                          n_jobs=1,
>>                          )
>> validator.fit(x, y)
>>
>> But unfortunately it produces the same error again:
>>
>> ValueError: Found a sample_weight array with shape (1000,) for an input
>> with shape (666, 1). sample_weight cannot be broadcast.
>>
>> I guess that the issue is that the sample_weight parameter was not
>> meant to be changed during the tuning, was it?
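
The shapes in the error message are consistent with 3-fold cross-validation on 1000 samples, which was the default for GridSearchCV at the time. The sketch below only illustrates the arithmetic, under the assumption that a value listed in param_grid is set on the estimator whole, while entries passed as fit parameters are indexed with the training fold; the arrays are placeholders.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 1000
x = np.zeros((n_samples, 1))
weights = np.ones(n_samples)  # stands in for my_sample_weights

# First train/test split of an unshuffled 3-fold partition.
train_idx, test_idx = next(KFold(n_splits=3).split(x))
x_train = x[train_idx]

# A weight vector given as a fit parameter is indexed alongside x ...
weights_split = weights[train_idx]
# ... whereas one placed in param_grid is never indexed by fold.
weights_unsplit = weights

print(x_train.shape, weights_split.shape, weights_unsplit.shape)
# prints: (666, 1) (666,) (1000,)
```

The unsplit vector of length 1000 cannot be matched against the 666 training rows, which is the mismatch reported in the ValueError.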
>>
>>
>> Thank you all for your patience and support.
>> Best
>> Manolo
>>
>>
>>
>>
>> 2017-06-23 1:17 GMT+02:00 Manuel CASTEJÓN LIMAS <mcasl at unileon.es>:
>>
>>> Dear Joel,
>>> I'm just passing an iterable as I would do with any other sequence of
>>> parameters to tune. In this case the list only has one element to use but
>>> in general I ought to be able to pass a collection of vectors.
>>> Anyway, I guess that that issue is not the cause of the problem.
>>>
>>> On 23 Jun 2017 at 1:04 a.m., "Joel Nothman" <joel.nothman at gmail.com>
>>> wrote:
>>>
>>>> Why are you passing [my_sample_weights] rather than just
>>>> my_sample_weights?
>>>>
>>>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>