[scikit-learn] Fwd: sample_weight parameter is not split when used in GridSearchCV
Andreas Mueller
t3kcit at gmail.com
Tue Jun 27 13:43:44 EDT 2017
We could clarify in the documentation that you can grid-search any
(hyper)parameter of a model,
but not parameters passed to fit?
Only the values returned by get_params() can be tuned.
Only "param_grid" is searched, not "fit_params". "fit_params" can
contain only a single setting.
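
As a rough sketch of the workaround suggested further down this thread (one
grid search per candidate weighting, results compared afterwards), something
like the following should work. Ridge, the synthetic data, and the names
candidate_weights and results are stand-ins invented for illustration and are
not taken from the original code. It also passes sample_weight to fit()
instead of using the fit_params constructor argument, which, as I understand
it, was removed in later scikit-learn releases.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data, purely illustrative.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Hypothetical candidate weightings, one weight per sample.
base_weights = rng.uniform(0.5, 1.5, size=200)
candidate_weights = {"w": base_weights, "w_squared": base_weights ** 2}

# Only names returned by get_params() may appear in param_grid.
estimator = Ridge()
param_grid = {"alpha": [0.1, 1.0, 10.0]}
assert set(param_grid) <= set(estimator.get_params())

results = {}
for name, weights in candidate_weights.items():
    search = GridSearchCV(estimator, param_grid=param_grid, cv=3)
    # A fit parameter with one entry per sample is sliced to match each
    # training fold before being forwarded to the estimator's fit().
    search.fit(X, y, sample_weight=weights)
    results[name] = (search.best_params_, search.best_score_)

for name, (best_params, best_score) in results.items():
    print(name, best_params, round(best_score, 4))
```

Putting the weights in param_grid instead makes the search treat them as an
estimator parameter, so they are not sliced per fold, which is consistent
with the shape mismatch reported below. By default the cross-validation
scores in this sketch are computed without weights.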
On 06/26/2017 03:17 AM, Joel Nothman wrote:
> I don't think we'll be accepting a pull request adding this feature to
> scikit-learn. It is too niche. But you should go ahead and modify the
> search to operate over weightings for your own research. If you feel
> the documentation can be clarified, a pull request there is welcome.
>
> On 26 June 2017 at 16:43, Manuel CASTEJÓN LIMAS <mcasl at unileon.es>
> wrote:
>
> Yes, I guess most users will be happy without using weights. Some
> will need a single weight vector, but I am currently researching
> a weighting method, hence my need to evaluate multiple weight vectors.
>
> I understand that it seems to be a very specific issue with a
> simple workaround, most likely not worth any programming
> effort yet, as there are more important issues to address.
>
> I guess that adding a note on this behaviour to the documentation
> would be great. If some parameters can be iterated over and others
> are not supported, knowing so gives the user base more solid
> ground.
>
> I'm committed to spending a few hours studying the code. Should I be
> successful, I will come back with a pull request.
> I'll cross my fingers :-)
> Best
> Manolo
>
>
>
> On 24 Jun 2017 20:05, "Julio Antonio Soto de Vicente"
> <julio at esbet.es> wrote:
>
> Joel is right.
>
> In fact, you usually don't want to tune the sample weights
> much: you may leave them at their defaults, set them to
> balance classes, or fix them according to some business rule.
>
> That said, you can always run a couple of grid searches
> with different sample weights and compare the results afterwards.
>
> --
> Julio
>
> On 24 Jun 2017, at 15:51, Joel Nothman
> <joel.nothman at gmail.com> wrote:
>
>> yes, trying multiple sample weightings is not supported by
>> grid search directly.
>>
>> On 23 Jun 2017 6:36 pm, "Manuel Castejón Limas"
>> <manuel.castejon at gmail.com> wrote:
>>
>> Dear Joel,
>>
>> I removed the square brackets and now it works
>> as expected *for a single* sample_weight vector:
>>
>>     validator = GridSearchCV(my_Regressor,
>>         param_grid={'number_of_hidden_neurons': range(4, 5),
>>                     'epochs': [50]},
>>         fit_params={'sample_weight': my_sample_weights},
>>         n_jobs=1)
>>     validator.fit(x, y)
>>
>> The problem now is that I want to try multiple trainings
>> with multiple sample_weight parameters, in the following
>> fashion:
>>
>>     validator = GridSearchCV(my_Regressor,
>>         param_grid={'number_of_hidden_neurons': range(4, 5),
>>                     'epochs': [50],
>>                     'sample_weight': [my_sample_weights,
>>                                       my_sample_weights**2]},
>>         fit_params={},
>>         n_jobs=1)
>>     validator.fit(x, y)
>>
>> But unfortunately it produces the same error again:
>>
>> ValueError: Found a sample_weight array with shape
>> (1000,) for an input with shape (666, 1). sample_weight
>> cannot be broadcast.
>>
>> I guess that the issue is that the sample_weight
>> parameter was not meant to be varied during
>> tuning, was it?
>>
>>
>> Thank you all for your patience and support.
>> Best
>> Manolo
>>
>>
>>
>>
>> 2017-06-23 1:17 GMT+02:00 Manuel CASTEJÓN LIMAS
>> <mcasl at unileon.es>:
>>
>> Dear Joel,
>> I'm just passing an iterable as I would do with any
>> other sequence of parameters to tune. In this case
>> the list only has one element to use but in general I
>> ought to be able to pass a collection of vectors.
>> Anyway, I guess that that issue is not the cause of
>> the problem.
>>
>> On 23 Jun 2017 1:04 a.m., "Joel Nothman"
>> <joel.nothman at gmail.com> wrote:
>>
>> why are you passing [my_sample_weights] rather
>> than just my_sample_weights?
>>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn